Mark Needham

Thoughts on Software Development

Mac OS X: Removing Byte Order Mark with an editor

with one comment

About a month ago I wrote about some problems I was having with Windows-generated CSV files that had a Byte Order Mark (BOM) at the beginning of the file, and I described a way to get rid of it using awk.

It’s a bit of a long-winded process though, and I always forget which parameters I need to pass to awk, so I thought it would probably be quicker to work out a way to get rid of the BOM using an editor.

I’m using a Mac, and the most popular hex editor on that platform seems to be HexFiend.

If we open the problematic file with that it’s reasonably easy to see where the BOM is and we can then manually remove it.

[Image: Bom]
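As a rough alternative to eyeballing the bytes in a hex editor, the same check can be sketched in Python. This assumes the file is UTF-8, in which case the BOM is the three bytes EF BB BF; the function names `leading_bytes_hex` and `has_utf8_bom` are made up for this example.

```python
import codecs


def leading_bytes_hex(path, count=8):
    """Return the first `count` bytes of the file as space-separated hex."""
    with open(path, "rb") as handle:
        data = handle.read(count)
    return " ".join(f"{byte:02x}" for byte in data)


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as handle:
        # Read exactly as many bytes as the BOM is long and compare.
        return handle.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

Calling `leading_bytes_hex` on the problematic file shows roughly what the hex editor shows for the start of the file, and `has_utf8_bom` answers the yes/no question without opening an editor at all.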

There is a list of other hex editors for the Mac on this Stack Overflow thread.

I figured there was probably a way to do this using emacs and indeed there is!

One way is to open the file using ‘Meta-X find-file-literally’, which displays all non-ASCII characters so that you can delete them if you want:

[Image: Bom 2]

I also learnt about another way, which is to first open the file using ‘Ctrl-X Ctrl-F’ and then run ‘Meta-X set-buffer-file-coding-system’ and enter ‘utf-8’ before saving the file. The BOM will now be deleted!

This is perhaps a bit simpler since you don’t need to delete the characters manually.
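For comparison, here is a minimal Python sketch of the same idea of saving the file without its BOM. It is not what emacs does internally; it assumes a UTF-8 BOM and simply drops those three bytes if they are present. The function name `strip_utf8_bom` is hypothetical.

```python
import codecs


def strip_utf8_bom(path):
    """Rewrite the file in place without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as handle:
        data = handle.read()  # reads the whole file into memory

    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, leave the file untouched

    with open(path, "wb") as handle:
        handle.write(data[len(codecs.BOM_UTF8):])
    return True
```

Because it reads the whole file into memory, this is only a reasonable approach for files of modest size, which is usually the case for the kind of CSV exports described here.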

There is a third way, where you open the file using ‘Meta-X hexl-find-file’, but it seems more difficult to use than the other two options!

Written by Mark Needham

October 7th, 2012 at 10:43 am

Posted in Software Development

  • http://twitter.com/hancengiz Cengiz Han

    A couple of months ago we were trying to upload a CSV file to Google big data, and the file was located on a Linux server on AWS. We found this Vim command to solve the problem: :set nobomb