- From: Erik van der Poel <erik@netscape.com>
- Date: Tue, 30 Jun 1998 09:27:49 -0700
- To: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>
- Cc: ietf-charsets@ISI.EDU
MURATA Makoto wrote:
>
> UTF-16 generators MUST send in big-endian byte order and must
> begin with the zero width non breaking space (also called Byte
> Order Mark or BOM) (0xFEFF).

The 2nd "must" is in lower-case. Should it be upper-case?

> Thus, an UTF-16 parser encountering the code 0xFFFE as the

an UTF-16 -> a UTF-16

> If the BOM
> is absent, there is no way to 100% reliably detect little-endian
> data that does not use the BOM.

the BOM is absent ... data that does not use the BOM (2x)

> The Coded Character Set that UTF-16 refers to is the same version
> of ISO/IEC 10646-1 and Unicode that the charset "UTF-8" refers to.

We need a reference to the UTF-8 RFC 2279 at the end of the document.

Erik
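
For illustration only, the BOM check discussed in the quoted draft text might be sketched as below. This is a minimal sketch and not part of the draft: the function name detect_utf16_byte_order is hypothetical, and it assumes the parser simply inspects the first two octets of the data. When no BOM is present it reports "unknown" rather than guessing, which matches the draft's point that BOM-less little-endian data cannot be detected with full reliability.

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Classify UTF-16 data by its leading Byte Order Mark (hypothetical helper).

    Returns "big-endian", "little-endian", or "unknown" when no BOM is present.
    """
    prefix = data[:2]
    if prefix == codecs.BOM_UTF16_BE:  # octets FE FF: U+FEFF read in big-endian order
        return "big-endian"
    if prefix == codecs.BOM_UTF16_LE:  # octets FF FE: a big-endian reader would see 0xFFFE
        return "little-endian"
    # Without a BOM the byte order cannot be determined reliably.
    return "unknown"
```

A caller that gets "unknown" would have to fall back on whatever default the charset definition prescribes, or on external information.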
Received on Tuesday, 30 June 1998 09:34:07 UTC