[whatwg] U+FEFF (BOM) stripping in UTF-16BE and UTF-16LE

From: Řistein E. Andersen <liszt@coq.no>
Date: Wed, 9 Sep 2009 00:09:09 +0100
Message-ID: <4CFD4B9D-45C5-4DCA-8827-11CE1C43C261@coq.no>
§ 9.2.2.2 "Preprocessing the input stream" requires that a leading
U+FEFF (byte order mark) be stripped irrespective of encoding, contra
Unicode, which says that a leading U+FEFF is part of the document when
the byte order is already established by other means.  This is
probably harmless and potentially useful for dealing with mislabelled
documents, but it might be worth adding an explanatory note.
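
To make the difference concrete, here is a minimal Python sketch. It
assumes that Python's "utf-16-be" codec is a fair stand-in for the
Unicode behaviour (it keeps a leading U+FEFF, since the label already
fixes the byte order). The helper name preprocess is hypothetical and
only approximates the stripping rule described above; it is not the
spec's own algorithm text.

```python
BOM = "\ufeff"

def preprocess(data: bytes, encoding: str) -> str:
    """Decode data, then drop one leading U+FEFF whatever the encoding.

    Hypothetical helper, written only to illustrate the rule discussed
    in this message.
    """
    text = data.decode(encoding)
    if text.startswith(BOM):
        return text[1:]
    return text

# Bytes FE FF followed by "hi" in big-endian UTF-16.
raw = b"\xfe\xff\x00h\x00i"

# Unicode-style: the byte order is given by the label, so the codec
# treats the leading U+FEFF as part of the document and keeps it.
kept = raw.decode("utf-16-be")

# Stripping irrespective of encoding removes it.
stripped = preprocess(raw, "utf-16-be")
```

With these inputs, kept still starts with U+FEFF while stripped does
not; input without a leading U+FEFF passes through unchanged.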

-- 
Øistein E. Andersen
Received on Tuesday, 8 September 2009 16:09:09 UTC
