- From: Tim Bray <tbray@textuality.com>
- Date: Wed, 12 Apr 2000 09:31:02 -0700
- To: Dan Connolly <connolly@w3.org>
- Cc: "Martin J. Duerst" <duerst@w3.org>, w3c-i18n-ig@w3.org, xml-editor@w3.org, w3c-xml-core-wg@w3.org
At 10:39 AM 4/12/00 -0500, Dan Connolly wrote:

> Is there any reason not to treat UTF-16BE and UTF-16LE just
> like other non-required encodings, a la ISO-8859-1
> and ISO-2022-JP and such? i.e. you can use it, but not
> without an explicit declaration (either in the XML entity
> or in the HTTP headers or filesystem metadata or ...), and beware
> that not all processors are required to read it; you may
> well get a 'sorry, I don't grok that encoding' error.

It all comes down to the interpretation of the term 'UTF-16' in the XML spec. If this is interpreted to subsume the LE and BE versions, then an XML processor would be justified in declaring an error. Thus, Martin essentially wants to forbid a processor from applying the spec's rules on UTF-16 to things that are in -BE and -LE. Note that the RFCs go further, and *forbid* the use of the BOM in -LE and -BE.

It is my position that this is a mistake. First, -LE and -BE are really truly UTF-16, and pretending they're not is simply incorrect. Second, this is actively harmful in that it encourages people to create documents using a format that *forbids* the application of a simple low-cost interoperability tool that demonstrably works well across networks and implementations. This is simply wrong.

-Tim
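[A minimal sketch, not part of the original message, illustrating the "simple low-cost interoperability tool" at issue: two bytes of BOM sniffing resolve the byte order of a UTF-16 stream without any external metadata. The function name is hypothetical.]

```python
def sniff_utf16_byte_order(data: bytes) -> str:
    """Return the byte order implied by a leading BOM.

    Illustrative only: a real XML processor folds this into its
    full encoding-autodetection logic.
    """
    if data[:2] == b"\xfe\xff":
        return "utf-16-be"   # big-endian BOM
    if data[:2] == b"\xff\xfe":
        return "utf-16-le"   # little-endian BOM
    # No BOM: this is the case the UTF-16BE / UTF-16LE labels exist
    # for -- the byte order must then be declared out of band.
    raise ValueError("no BOM; byte order must be declared externally")

# A document serialized either way is self-describing once a BOM
# ("\ufeff") is prepended:
text = "<doc/>"
for codec in ("utf-16-be", "utf-16-le"):
    data = "\ufeff".encode(codec) + text.encode(codec)
    assert sniff_utf16_byte_order(data) == codec
```

The RFC position Tim objects to forbids the BOM precisely when the `-BE`/`-LE` labels are used, which removes the in-band check above and leaves only the external label.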
Received on Wednesday, 12 April 2000 12:29:58 UTC