Skip navigation

While hunting down broken images in the pdf import, I came across a very weird behaviour, that was seemingly caused by plain C stream IO: every once in a while, streams that were explicitely opnened in binary mode converted 0x0A into 0x0D. One-on-one substitution, not the usual 0x0A to 0x0A 0x0D lineend conversion, notably. I wasted about a day trying to track that down, suspecting everything from crazy filesystem flags, to symbol hijacking during link time (i.e. somebody sneaking in own fwrite/putc/whatever into the binary). Laboriously ruled out one after the other.

Finally, I did what I should have done in the first place: I used a different hex editor. And lo and behold, the trusty midnight commander showed all the files with correct 0x0A. Caught in the act and proven guilty: emacs’ hexl-mode.

Well, kind of. It turned out to be a rather unlucky combination of things going wrong: emacs with its buffer encoding scheme, which tries to be smart and detect the actual encoding – if there’s a 0x0A in the file before a 0x0D, emacs assumes raw-text-unix and leaves line ends alone. If there’s a 0x0D in the file before any 0x0A, emancs assumes raw-text-mac, and converts all assumed lineends to 0x0D. And since hexl-mode seems to work on top of an existing buffer (and never loads the file directly from disk), this resulted in the described mess.

So, to save yourself some trouble: don’t trust your tools. And make sure to have files you want to work on with hexl-mode match something in file-coding-system-alist’s content, that enforces binary reads (elc, png, jpeg, ar, gz etc. for my setup).


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: