
Re: UTF-16BL/LE,... (was: Re: I18N issues with the XML Specification

From: Paul Hoffman / IMC <phoffman@imc.org>
Date: Wed, 12 Apr 2000 16:26:28 -0700
Message-Id: <4.3.2.20000412162143.00c52a80@mail.imc.org>
To: Tim Bray <tbray@textuality.com>, John Cowan <cowan@locke.ccil.org>
Cc: duerst@w3.org, w3c-i18n-ig@w3.org, xml-editor@w3.org, w3c-xml-core-wg@w3.org

At 04:22 PM 4/12/00 -0700, Tim Bray wrote:
>Consider an author creating an XML document in an editor that happens to
>use UTF-16 and thus (correctly) inserts a BOM.  That document then cannot
>be transmitted as -BE or -LE, even by software that knows its byte
>ordering, because the BOM is forbidden in those variants.

Quite right. Nor can that document be transmitted as UTF-8, ISO-2022-JP,
or BIG5 without first being converted. UTF-16BE and UTF-16LE are charsets
with rules of their own, just like any other charset.
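
To illustrate the point about conversion, here is a minimal Python sketch, assuming Python's built-in "utf-16" and "utf-16-be" codecs behave as stand-ins for the UTF-16 and UTF-16BE charsets under discussion. The helper name transcode_utf16_to_be and the sample document are hypothetical. It shows that a BOM-prefixed UTF-16 document becomes a valid UTF-16BE document only by being re-encoded, the same way it would have to be for UTF-8 or any other charset.

```python
import codecs


def transcode_utf16_to_be(data: bytes) -> bytes:
    """Re-encode BOM-prefixed UTF-16 bytes as UTF-16BE (hypothetical helper)."""
    # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
    text = data.decode("utf-16")
    # The "utf-16-be" codec writes big-endian code units and never adds a BOM.
    return text.encode("utf-16-be")


doc = "<doc/>"  # sample document, an assumption for illustration

# What an editor that "happens to use UTF-16" would save: BOM, then the text.
as_utf16 = doc.encode("utf-16")

# The same document converted for transmission as UTF-16BE.
as_utf16be = transcode_utf16_to_be(as_utf16)

# Mislabeling instead of converting: a -LE decoder keeps the BOM as a
# U+FEFF character at the start of the text.
mislabeled = (codecs.BOM_UTF16_LE + doc.encode("utf-16-le")).decode("utf-16-le")
```

In this sketch the converted bytes carry no BOM, while the mislabeled bytes decode to text that begins with a stray U+FEFF character.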

>   Thus, as
>Murata has long (and correctly) stated, the -BE and -LE variants are
>simply not appropriate for XML documents. -Tim

These two charsets are not appropriate for XML documents whose charset is
not tagged. That is true of quite a number of charsets, yes?

--Paul Hoffman, Director
--Internet Mail Consortium
Received on Wednesday, 12 April 2000 19:28:21 GMT
