
Re: UTF-16BE/LE,... (was: Re: I18N issues with the XML

From: John Cowan <jcowan@reutershealth.com>
Date: Fri, 14 Apr 2000 11:56:39 -0400
Message-ID: <38F73FB7.156E1F94@reutershealth.com>
To: Rick Jelliffe <ricko@gate.sinica.edu.tw>
CC: xml-editor@w3.org, w3c-i18n-ig@w3.org, w3c-xml-core-wg@w3.org
Rick Jelliffe wrote:
> 
> On Wed, 12 Apr 2000, John Cowan wrote:
> 
> > ... is there going to be a way to label those encodings properly, or not?
> > Prohibition just isn't a viable strategy: education (of the receiver,
> > who is free to reject the funny encoding) is.
> 
> If Johnny User decides to be ultra careful, and labels his UTF-16XX data
> with a BOM in the right order and with an encoding header that says the
> right thing, we must not disqualify that document just because some
> pre-XML RFC gives outdated rules.

Say what?  What RFC are you talking about?

The RFC defining UTF-16BE and UTF-16LE is dated February 2000, decidedly
post-XML.  A UTF-16BE file that begins with 0xFE 0xFF, or a UTF-16LE file
that begins with 0xFF 0xFE, is a non-XML file beginning with the Unicode
ZWNBSP character, which is not valid at the beginning of an XML document.
It would be valid at the beginning of an external parsed entity, but
that is irrelevant, because these encodings require an encoding declaration.
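
To illustrate the distinction, here is a minimal sketch in Python. It assumes
that Python's built-in "utf-16" and "utf-16-be" codecs behave the way the
labels are described above; the helper name first_char is made up for this
example and is not part of any spec or library.

```python
def first_char(data: bytes, label: str) -> str:
    """Decode data under the given encoding label and return its first character."""
    return data.decode(label)[0]

# Big-endian "<?xml" preceded by the bytes 0xFE 0xFF.
be_bytes = b"\xfe\xff" + "<?xml".encode("utf-16-be")

# Labelled plain UTF-16: the leading bytes act as a byte order mark
# and are consumed, so the text starts with "<".
as_utf16 = first_char(be_bytes, "utf-16")

# Labelled UTF-16BE: there is no BOM in this encoding, so the same bytes
# decode to U+FEFF (ZWNBSP) as the first character of the text.
as_utf16be = first_char(be_bytes, "utf-16-be")

# Little-endian counterpart: 0xFF 0xFE under the UTF-16LE label.
le_bytes = b"\xff\xfe" + "<?xml".encode("utf-16-le")
as_utf16le = first_char(le_bytes, "utf-16-le")
```

Under the UTF-16BE and UTF-16LE labels the decoded text begins with U+FEFF
rather than "<", which is the situation described above.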

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)
Received on Friday, 14 April 2000 11:57:37 GMT
