Re: BOM & Unicode editors

From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Tue, 09 May 2000 17:14:14 -0700
Message-Id: <4.2.0.58.20000509170850.01df4710@popd.ix.netcom.com>
To: Saba Sundaramurthy <ssundaramurthy@verisign.com>, mozilla-i18n@mozilla.org, www-international@w3.org, i18n-prog@acoin.com
At 04:55 PM 5/9/00 -0700, Saba Sundaramurthy wrote:
>     Is this something all editors that save files in Unicode or UTF-8 are
>required to do? Can I depend on the presence of this marker in my code?

No, it's not a requirement, but it's a convention followed by quite a few
tools, because otherwise it's harder to use the same .txt extension for both
ASCII and Unicode (and it also helps to mark the byte order, of course).

I would recommend that you look for it in your code if you plan to read UTF-16
files. At a minimum, you need to be prepared for its presence. But you may
also encounter some unmarked UTF-16. There are some quite strong heuristics
that one can follow to detect Unicode without a BOM, but a signature like
this is more reliable.
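
As a rough illustration of the advice above, here is a minimal Python sketch
that checks for a signature first and only then falls back to a guess. The
function names (sniff_bom, guess_unmarked_utf16) and the threshold values are
hypothetical, chosen for this example. The zero-byte test is only one simple
heuristic that works for mostly-Latin text, and not necessarily one of the
stronger heuristics referred to above.

```python
import codecs

# Check longer signatures first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE). Caveat: a UTF-16 LE file whose first character
# is U+0000 would be misread as UTF-32 LE by this ordering.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def guess_unmarked_utf16(data: bytes, threshold: float = 0.4):
    """Guess the byte order of BOM-less UTF-16, or return None.

    Assumption for this sketch: the text is mostly Latin, so one byte of
    nearly every 16-bit unit is zero. The 0.4 and 0.05 cutoffs are arbitrary.
    """
    units = len(data) // 2
    if units == 0:
        return None
    even_zeros = data[0:units * 2:2].count(0) / units  # high bytes if BE
    odd_zeros = data[1:units * 2:2].count(0) / units   # high bytes if LE
    if even_zeros >= threshold and odd_zeros < 0.05:
        return "utf-16-be"
    if odd_zeros >= threshold and even_zeros < 0.05:
        return "utf-16-le"
    return None
```

A caller would typically try sniff_bom on the first few bytes of the file,
skip bom_length bytes if a signature is found, and call guess_unmarked_utf16
only when no signature is present.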

A./
Received on Tuesday, 9 May 2000 20:10:08 GMT