- From: Misha Wolf <Misha.Wolf@reuters.com>
- Date: Fri, 11 May 2001 20:05:16 +0100
- To: Mark Davis <mark@macchiato.com>
- Cc: ietf-charsets@iana.org, w3c-i18n-ig@w3.org
Has anyone looked to see how this ties in with:

   Extensible Markup Language (XML) 1.0 (Second Edition)
   Autodetection of Character Encodings (Non-Normative)
   http://www.w3.org/TR/REC-xml#sec-guessing

Misha

On 11/05/2001 15:29:45 Mark Davis wrote:
> Charset aliases:
>
> NONE
>
> Suitability for use in MIME text:
>
> NO
>
> Published specification(s):
>
> http://www.unicode.org/unicode/reports/tr19/
>
> The IETF registration imposes one additional constraint: if there is no
> initial BOM then the byte-orientation must be big-endian. That is, in any
> stream that does not begin with the (hex) byte sequence <00 00 FE FF> all of
> the bytes are interpreted as big-endian.
>
> Note: This is parallel to the IETF registration of UTF-16. As defined by the
> Unicode Standard Version 3.1, without a BOM the byte orientation of UTF-32
> and UTF-16 could be either little-endian or big-endian. The choice of byte
> orientation would be determined by a higher-level protocol. The IETF
> registration is such a protocol, and constrains the byte orientation to be
> big-endian for determinant interpretation.
>
>
> ISO 10646 equivalency table:
>
> Also in http://www.unicode.org/unicode/reports/tr19/
>
> Additional information:
>
> Mark Davis
> 2509 Alpine Road, Menlo Park, CA 94025
> mark@unicode.org
>
> Intended usage:
> LIMITED USE
>
>
> ]
>

-----------------------------------------------------------------
Visit our Internet site at http://www.reuters.com

Any views expressed in this message are those of the individual
sender, except where the sender specifically states them to be
the views of Reuters Ltd.
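For readers comparing the quoted rule with BOM-based autodetection, here is a minimal Python sketch of the decision it describes: with no initial BOM, the stream is read as big-endian. The function name decode_utf32 is hypothetical, and handling of the little-endian BOM <FF FE 00 00> is an assumption that goes beyond the quoted text, which names only <00 00 FE FF>.

```python
import codecs


def decode_utf32(data: bytes) -> str:
    """Decode bytes labelled UTF-32, defaulting to big-endian without a BOM.

    Hypothetical helper for illustration only; not taken from the
    registration or from TR19.
    """
    if data.startswith(codecs.BOM_UTF32_BE):  # bytes 00 00 FE FF
        return data[4:].decode("utf-32-be")
    if data.startswith(codecs.BOM_UTF32_LE):  # bytes FF FE 00 00 (assumption)
        return data[4:].decode("utf-32-le")
    # No BOM: the quoted registration fixes the byte order as big-endian.
    return data.decode("utf-32-be")


if __name__ == "__main__":
    # "A" is U+0041; with no BOM the four bytes are read big-endian.
    print(decode_utf32(b"\x00\x00\x00\x41"))
```

The explicit "utf-32-be" and "utf-32-le" codecs are used rather than Python's generic "utf-32" codec so that the no-BOM fallback is spelled out in the code instead of being left to the codec's own default.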
Received on Friday, 11 May 2001 15:39:46 UTC