On 29 Feb 2008, at 13:38, Brian Smith wrote:

>
> Ian Hickson wrote:
>>> However, when the encoding is UTF-16LE or UTF-16BE (i.e.
>>> supposed to be signatureless), do we really want to drop
>>> the BOM silently? Shouldn't it count as a character that
>>> is in error?
>>
>> Do the UTF-16LE and UTF-16BE specs make a leading BOM an error?
>>
>> If yes, then we don't have to say anything, it's already an error.
>>
>> If not, what's the advantage of complaining about the BOM in
>> this case?
>
> See http://unicode.org/faq/utf_bom.html#28:
>
> "In particular, whenever a data stream is declared to be UTF-16BE,
> UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used."
>
> If somebody wants to include a zero-width non-breaking space
> (ZWNBSP) at the beginning of a stream, they have to use U+2060 WORD
> JOINER instead.

Could you possibly give me a pointer to something in the Unicode
standard that requires that? I've never seen such a requirement.

-- 
Geoffrey Sneddon
<http://gsnedders.com/>

Received on Friday, 29 February 2008 16:35:19 GMT
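
For readers following the thread, here is a minimal sketch of the distinction being argued over, using Python's standard codecs as a stand-in decoder. It only shows how one existing implementation treats a leading FF FE under the "utf-16" and "utf-16-le" labels; it says nothing about what the Unicode standard or HTML5 requires. The helper strip_or_flag_bom and its strict parameter are hypothetical names, introduced only to contrast the two policies (drop silently versus report an error).

```python
import codecs

# The letter "A" preceded by U+FEFF, serialized little-endian (bytes FF FE 41 00).
data = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")

# Label "utf-16": Python's codec treats the leading FF FE as a signature and consumes it.
as_utf16 = data.decode("utf-16")

# Label "utf-16-le": the codec does no signature handling, so U+FEFF stays in the text.
as_utf16le = data.decode("utf-16-le")


def strip_or_flag_bom(text: str, strict: bool) -> str:
    """Hypothetical policy helper, not taken from any spec.

    strict=False drops a leading U+FEFF silently;
    strict=True treats it as a character in error.
    """
    if text.startswith("\ufeff"):
        if strict:
            raise ValueError("leading U+FEFF in a stream declared signatureless")
        return text[1:]
    return text
```

Under the lenient policy, strip_or_flag_bom(as_utf16le, strict=False) gives the same text as decoding with the "utf-16" label; under the strict policy the same input raises ValueError.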