- From: Martin J. Duerst <duerst@w3.org>
- Date: Mon, 25 Oct 1999 16:17:09 +0900
- To: xml-editor@w3.org
I herewith submit the following errata reports. For further details,
please see the thread starting at
http://lists.w3.org/Archives/Member/w3c-i18n-ig/1999Oct/0136.html
(w3c members only).

Regards,   Martin.

> 1) In http://www.w3.org/TR/REC-xml#charencoding
>
> "All XML processors must be able to read entities in either UTF-8 or UTF-16."
>
> This might be interpreted so that it would be okay to only support UTF-8
> or to only support UTF-16. But I know that's not the way it was intended.
>
> A better wording would probably be:
>
> "All XML processors must be able to read both entities in UTF-8 and
> entities in UTF-16."
>
>
> 2) again in http://www.w3.org/TR/REC-xml#charencoding
>
>     In an encoding declaration, the values "UTF-8", "UTF-16",
>     "ISO-10646-UCS-2", and "ISO-10646-UCS-4" should be used for the
>     various encodings and transformations of Unicode / ISO/IEC
>     10646, the values "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-9"
>     should be used for the parts of ISO 8859, and the values
>     "ISO-2022-JP", "Shift_JIS", and "EUC-JP" should be used for the
>     various encoded forms of JIS X-0208-1997. XML processors
>     may recognize other encodings; it is recommended that
>     character encodings registered (as charsets) with the Internet
>     Assigned Numbers Authority [IANA], other than those just
>     listed, should be referred to using their registered names. Note
>     that these registered names are defined to be case-insensitive, so
>     processors wishing to match against them should do so in a
>     case-insensitive way.
>
> There are several problems here:
>
> - Case sensitivity is defined for each single value, instead of for the
>   value in general. Would be better to say that the value in general
>   is case-insensitive.
> - There is advice for how to label entities, but not for how to interpret
>   values. A parser that interprets "EUC-JP" as let's say "european
>   unified character set - joint code page" (there currently isn't such
>   a thing :-) would be fully conformant, although I'm not at all sure
>   that this was the intention when the XML spec was written, or that
>   this would be desirable. Two additions seem to be necessary:
>   - Say that all the values registered with IANA have to either be
>     interpreted in the way defined by IANA, or treated as unknown
>     (->error)
>   - Because we cannot predict what IANA will register in the future,
>     say that for anything not registered with IANA, the x- prefix
>     should be used.
>   This would bring things in line e.g. with the most recent wording
>   in XSLT. [http://www.w3.org/TR/xslt#output, see encoding]
>
> Can you follow up with this, or tell me it's already dealt with,
> or tell me what I have to do? Many thanks in advance.
>
>
> Martin.

#-#-#  Martin J. Du"rst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org
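
The behaviour asked for in item 2 (match the declared value case-insensitively, interpret a known label only in its registered sense, and treat everything else as unknown) might be sketched as below. This is only an illustration under stated assumptions, not wording from the XML or XSLT specifications: KNOWN_CHARSETS, resolve_encoding and UnknownEncodingError are hypothetical names, and the table holds a handful of the labels quoted above rather than the full IANA registry.

```python
# Hypothetical, deliberately tiny table mapping lower-cased declared labels
# to Python codec names. A real processor would consult the IANA registry.
KNOWN_CHARSETS = {
    "utf-8": "utf-8",
    "utf-16": "utf-16",
    "iso-8859-1": "iso-8859-1",
    "iso-2022-jp": "iso2022_jp",
    "shift_jis": "shift_jis",
    "euc-jp": "euc_jp",
}


class UnknownEncodingError(ValueError):
    """Raised when a declared encoding is not one the processor knows."""


def resolve_encoding(declared: str) -> str:
    """Map an encoding declaration value to a codec name, or raise."""
    # The value as a whole is compared case-insensitively.
    key = declared.strip().lower()
    if key in KNOWN_CHARSETS:
        return KNOWN_CHARSETS[key]
    # Anything else, including private "x-" labels this processor has not
    # been taught, is treated as unknown rather than guessed at.
    raise UnknownEncodingError(f"unsupported encoding: {declared!r}")
```

With this shape a label such as "EUC-JP" can only ever mean the registered encoding, and an unrecognised "x-" label produces an error instead of a silent misreading.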
Received on Monday, 25 October 1999 03:21:34 UTC