- From: Geoffrey Sneddon <foolistbar@googlemail.com>
- Date: Fri, 29 Feb 2008 16:34:58 +0000
- To: Brian Smith <brian@briansmith.org>
- Cc: www-archive@w3.org
On 29 Feb 2008, at 13:38, Brian Smith wrote:

>
> Ian Hickson wrote:
>>> However, when the encoding is UTF-16LE or UTF-16BE (i.e.
>>> supposed to be signatureless), do we really want to drop
>>> the BOM silently? Shouldn't it count as a character that
>>> is in error?
>>
>> Do the UTF-16LE and UTF-16BE specs make a leading BOM an error?
>>
>> If yes, then we don't have to say anything, it's already an error.
>>
>> If not, what's the advantage of complaining about the BOM in
>> this case?
>
> See http://unicode.org/faq/utf_bom.html#28:
>
> "In particular, whenever a data stream is declared to be UTF-16BE,
> UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used."
>
> If somebody wants to include a zero-width non-breaking space
> (ZWNBSP) at the beginning of a stream, they have to use U+2060 WORD
> JOINER instead.

Could you possibly give me a pointer to something in the Unicode standard that requires that? I've never seen such a requirement.

-- 
Geoffrey Sneddon
<http://gsnedders.com/>
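
To make the question concrete, here is a minimal Python sketch of how one decoder treats a leading BOM under the two kinds of labels. It only shows the behaviour of Python's built-in codecs, not what the Unicode Standard or any HTML specification requires. The helper `has_leading_bom` is a hypothetical name introduced for illustration.

```python
import codecs

# The bytes FF FE (a little-endian BOM) followed by "A" as UTF-16LE code units.
data = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")

# Under the generic "utf-16" label the leading FF FE is treated as a
# signature: it selects the byte order and is not part of the decoded text.
with_signature = data.decode("utf-16")

# Under the explicit "utf-16-le" label there is no signature, so the same
# two bytes decode to an ordinary U+FEFF character at the start of the text.
without_signature = data.decode("utf-16-le")


def has_leading_bom(text: str) -> bool:
    """Hypothetical helper: True if decoded text still starts with U+FEFF."""
    return text.startswith("\ufeff")


print(repr(with_signature), has_leading_bom(with_signature))
print(repr(without_signature), has_leading_bom(without_signature))
```

Whether that surviving U+FEFF should be dropped silently, kept as ZWNBSP, or flagged as an error is exactly the point under discussion; the sketch takes no position on it.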
Received on Friday, 29 February 2008 16:35:19 UTC