You're thinking from the point of view of a display program that needs to split ...

makecheck · on April 29, 2012

If a file really is pure ASCII, leave it that way. I am not suggesting otherwise. If a 30-year-old program only deals with ASCII, then make sure your input looks like ASCII.

But if your input could contain complex UTF-8 (e.g. it's multi-language or whatever), you're not doing anyone any favors by hiding this fact. The BOM is a quick way to know exactly what the file is, and it tells you up front that an ASCII-only program won't work with that input. So you either translate the input or fix the program.
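To make the "quick way to know" part concrete, here is a rough Python sketch of the check an ASCII-only tool could run before touching its input. The function name `classify` and its three labels are made up for illustration; only `codecs.BOM_UTF8` (the bytes EF BB BF) comes from the standard library.

```python
import codecs


def classify(data: bytes) -> str:
    """Roughly classify raw input bytes (hypothetical helper and labels)."""
    # A UTF-8 BOM is the three bytes EF BB BF at the very start of the data.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-bom"
    # Pure ASCII means every byte is below 0x80.
    if all(b < 0x80 for b in data):
        return "ascii"
    # Anything else needs a closer look: BOM-less UTF-8, Latin-1, binary, ...
    return "unknown"


if __name__ == "__main__":
    print(classify(codecs.BOM_UTF8 + "naïve".encode("utf-8")))
    print(classify(b"plain old text"))
    print(classify("naïve".encode("utf-8")))
```

With the BOM present the answer comes from the first three bytes. Without it, the tool has to scan the input and still can't be sure what it is looking at.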

At some point in the future the majority of programs will handle even complex UTF-8 properly, and then the BOM will be pointless because virtually all inputs will be UTF-8.