
Re: I18N issues with the XML Specification

From: MURATA Makoto <muraw3c@attglobal.net>
Date: Thu, 06 Apr 2000 11:08:35 +0900
Message-Id: <200004060208.AA02207@t3knz.attglobal.net>
To: Rick Jelliffe <ricko@gate.sinica.edu.tw>
Cc: xml-editor@w3.org, w3c-i18n-ig@w3.org, w3c-xml-core-wg@w3.org
In message "Re: I18N issues with the XML Specification",
Rick Jelliffe wrote...
 >
 >I don't see why there is any need to ban the BOM for UTF-16LE and
 >UTF-16BE. RFC 2781 imposes an unnecessary burden here. But even if
 >it is banned, that does not make autodetection unreliable.

RFC 2781 is already an RFC.  In my understanding, people are 
trying to throw away the BOM by introducing charset names "utf-16le" 
and "utf-16be".
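
To make the BOM point concrete, here is a minimal sketch in Python. It 
only shows how Python's own codecs behave and is not taken from either 
RFC; the variable names are mine. The generic "utf-16" codec writes a 
BOM, while the endian-specific codecs write none and do not consume 
one on input.

```python
import codecs

text = "<?xml"

# The generic codec prepends a BOM in the platform's native byte order.
with_bom = text.encode("utf-16")

# The endian-specific codecs emit no BOM; the label alone carries the order.
le = text.encode("utf-16-le")
be = text.encode("utf-16-be")

print(with_bom.startswith(codecs.BOM_UTF16))  # True
print(le.hex(" "))                            # 3c 00 3f 00 78 00 6d 00 6c 00
print(be.hex(" "))                            # 00 3c 00 3f 00 78 00 6d 00 6c

# A BOM handed to an endian-specific decoder is kept as the character U+FEFF.
kept = (codecs.BOM_UTF16_LE + le).decode("utf-16-le")
print(kept.startswith("\ufeff"))              # True
```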

 >As in my email responding to John Cowan, where did the WG get the idea
 >that an external parsed entity can begin with any character? 

This is a fact.

 >> If we decide to allow UTF-16LE/BE for XML, we have to publish 
 >> a new RFC that supersedes RFC 2376, and to publish a new version 
 >> of XML.  Then, the sentence should be deleted and the autodetection 
 >> algorithm should be significantly revised so as to handle 
 >> encoding declarations in UTF-16LE/BE correctly.
 >
 >Why?  It is just another encoding. Why cannot this be handled merely
 >by updating Appendix F?

Certainly, a change to Appendix F will do the job.  However, this 
is a significant change.

If the handling of UTF-16LE/UTF-16BE is mandatory, the XML processor 
is required to handle new octet sequences.  I do not think all existing 
processors can handle "<?xml encoding="UTF-16LE"?>" in UTF-16LE.
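
As an illustration of the octet sequences in question, here is a rough 
Python sketch. It assumes a processor that sniffs the first four octets 
for "<?" in either byte order and then reads the encoding declaration. 
It is not the Appendix F algorithm, it ignores BOMs, UCS-4 and EBCDIC, 
and the helper names sniff_bomless_utf16 and declared_encoding are 
hypothetical.

```python
import re
from typing import Optional

# Leading octets of "<?" in each byte order, with no BOM in front.
_NO_BOM_PREFIXES = {
    b"<\x00?\x00": "utf-16-le",
    b"\x00<\x00?": "utf-16-be",
}

# Matches the encoding name inside an XML or text declaration.
_ENCODING_RE = re.compile(
    r"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def sniff_bomless_utf16(head: bytes) -> Optional[str]:
    """Return a Python codec name if head starts with '<?' in BOM-less UTF-16."""
    for prefix, codec in _NO_BOM_PREFIXES.items():
        if head.startswith(prefix):
            return codec
    return None


def declared_encoding(head: bytes) -> Optional[str]:
    """Return the declared encoding name, or None if it cannot be read."""
    codec = sniff_bomless_utf16(head)
    if codec is None:
        return None
    usable = len(head) - len(head) % 2  # drop a trailing half code unit
    text = head[:usable].decode(codec, errors="replace")
    match = _ENCODING_RE.match(text)
    return match.group(1) if match else None


entity = '<?xml encoding="UTF-16LE"?><doc/>'.encode("utf-16-le")
print(entity[:8].hex(" "))        # 3c 00 3f 00 78 00 6d 00
print(declared_encoding(entity))  # UTF-16LE
```

A processor that only looks for a BOM, or for "<?xml" as single-octet 
characters, would find neither in this entity.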

Cheers,

----
MURATA Makoto  muraw3c@attglobal.net
Received on Wednesday, 5 April 2000 22:08:58 GMT
