- From: <mark.davis@us.ibm.com>
- Date: Mon, 10 Apr 2000 17:24:12 -0600
- To: John Cowan <jcowan@reutershealth.com>
- cc: MURATA Makoto <muraw3c@attglobal.net>, Rick Jelliffe <ricko@gate.sinica.edu.tw>, xml-editor@w3.org, w3c-i18n-ig@w3.org, w3c-xml-core-wg@w3.org
People are not trying to throw away the BOM. The BOM is extremely useful for plain, untagged text, where there is no indication of the character encoding. However, there are many circumstances where the BOM is inappropriate, and one has a mechanism for explicitly declaring the character encoding. The UTC (and that RFC) use the terms "UTF-16BE" and "UTF-16LE" for those circumstances.

There are some guidelines in http://www.unicode.org/unicode/faq/#BOM

Mark

___
Mark Davis, IBM Center for Java Technology, Cupertino
(408) 777-5850 [fax: 5891], mark.davis@us.ibm.com, president@unicode.org
http://maps.yahoo.com/py/maps.py?Pyt=Tmap&addr=10275+N.+De+Anza&csz=95014

John Cowan <jcowan@reutershealth.com>@w3.org on 2000.04.06 09:31:07

Sent by: w3c-i18n-wg-request@w3.org

To: MURATA Makoto <muraw3c@attglobal.net>
cc: Rick Jelliffe <ricko@gate.sinica.edu.tw>, xml-editor@w3.org, w3c-i18n-ig@w3.org, w3c-xml-core-wg@w3.org
Subject: Re: I18N issues with the XML Specification

MURATA Makoto wrote:

> RFC 2781 is already an RFC. In my understanding, people are
> trying to throw away the BOM by introducing charset names "utf-16le"
> and "utf-16be".

Some people have already thrown away the BOM. RFC 2781 introduces names for the results of doing so.

> If the handling of UTF-16LE/UTF-16BE is mandatory, the XML processor
> is required to handle new octet sequences. I do not think all existing
> processors can handle "<?xml encoding="UTF-16LE"?>" in UTF-16LE.

No processor can be required to handle UTF-16LE/BE, only UTF-16 (and UTF-8).

--
Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau, || http://www.reutershealth.com
Denn er genoss vom Honig-Tau, || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies. -- Coleridge (tr. Politzer)
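The distinction drawn in this thread, between plain "UTF-16" text that carries a BOM and text explicitly labelled "UTF-16LE" or "UTF-16BE" that does not, can be illustrated with a rough Python sketch. The function name `decode_utf16` and its `declared` parameter are hypothetical, made up for this illustration; they are not taken from any XML processor or from the RFC, and the decision to reject unlabelled, BOM-less input is an assumption of the sketch rather than a rule stated in the messages.

```python
import codecs
from typing import Optional


def decode_utf16(data: bytes, declared: Optional[str] = None) -> str:
    """Decode UTF-16 bytes using an explicit label if given, else the BOM.

    `declared` is a hypothetical stand-in for an external charset label
    (for example a MIME charset parameter).
    """
    label = (declared or "").lower()
    if label == "utf-16le":
        # Explicitly labelled: byte order comes from the label, no BOM needed.
        return data.decode("utf-16-le")
    if label == "utf-16be":
        return data.decode("utf-16-be")
    # Plain "UTF-16" or untagged text: rely on the BOM to find the byte order.
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption of this sketch: refuse to guess when there is neither
    # a label nor a BOM.
    raise ValueError("no BOM and no explicit byte-order label")


if __name__ == "__main__":
    marked = codecs.BOM_UTF16_BE + "<?xml".encode("utf-16-be")
    unmarked = "<?xml".encode("utf-16-le")
    print(decode_utf16(marked))                 # byte order taken from the BOM
    print(decode_utf16(unmarked, "UTF-16LE"))   # byte order taken from the label
```

In this sketch the label wins whenever it names a byte order, which mirrors the point that the BOM matters most where nothing else indicates the encoding.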
Received on Monday, 10 April 2000 19:24:29 UTC