Leading BOM

The draft says:
"A leading U+FEFF BYTE ORDER MARK (BOM) must be dropped if present."

That's reasonable for UTF-8 when the encoding has been established by  
other means.

However, when the encoding is UTF-16LE or UTF-16BE (i.e. one that is
supposed to be signatureless), do we really want to drop the BOM
silently? Shouldn't it count as a character that is in error?

Likewise, if an encoding signature BOM has been discarded and the  
first logical character of the stream is another BOM, shouldn't that  
also count as a character that is in error?
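
To make the cases concrete, here is a minimal Python sketch of the
stricter behaviour I'm asking about. It is an illustration only, not
what the draft specifies: the function name check_leading_bom, the
error strings, and the choice to leave a second BOM in the output are
all my own assumptions.

```python
BOM = "\ufeff"

# Encoding labels that are supposed to be signatureless (Python codec names).
SIGNATURELESS = {"utf-16-le", "utf-16-be"}


def check_leading_bom(data: bytes, encoding: str) -> tuple[str, list[str]]:
    """Decode data, drop one leading U+FEFF, and report the disputed cases.

    Hypothetical helper: returns the decoded text and a list of error
    descriptions under the stricter reading discussed above.
    """
    # Python's "utf-8", "utf-16-le" and "utf-16-be" codecs keep a leading
    # U+FEFF in the decoded string, so it can be inspected here.
    text = data.decode(encoding)
    errors = []

    if text.startswith(BOM):
        if encoding.lower() in SIGNATURELESS:
            # Case 1: a BOM in an encoding that should not carry one.
            errors.append("BOM in signatureless encoding")
        text = text[1:]  # drop the first BOM, as the draft requires

        if text.startswith(BOM):
            # Case 2: another BOM right after the discarded one.
            # Assumption: it stays in the text and is flagged.
            errors.append("second BOM after discarded BOM")

    return text, errors
```

With this sketch, a single BOM in UTF-8 is dropped without complaint,
while a BOM in UTF-16LE/UTF-16BE or a doubled BOM is dropped or kept
as described above and reported as an error.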

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Friday, 25 May 2007 21:33:09 UTC