- From: Michael Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>
- Date: Mon, 09 Sep 96 16:03:54 CDT
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Martin Bryan writes:

>- the reference concrete syntax only permits the use of Latin
>alphanumeric characters in names of elements, attributes and tokens:
>should XML be designed to allow users to define elements, attributes
>and their values in a form that is dependent on their local language,
>or must they restrict themselves to shared names that have meanings
>defined in English only?

Here's my main concern with proposals to restrict XML to specific coded character sets (other than Unicode or UCS-4): we have a good chance here to provide a strong basis for internationalization, and requiring ISO 646, or any particular flavor of ISO 8859, or even *all* of the flavors of 8859, is not as good as defining XML from the ground up as language-neutral.

We should take good care that XML is compatible with the proposals for internationalization (or, as they say in the trade, i18n) already formulated by the relevant W3C working group. Those are good proposals, and XML should harmonize with them. (It would be nice to have a list of what that might entail, though; any volunteers?)

>- the default character set in 8879 matches that of the reference
>concrete syntax: should users be able to select which character set
>is most appropriate for their documents and specify an SGML
>declaration in which only a subset of ISO 10646 is recognized as
>valid while still retaining the reference concrete syntax for markup?

This does not seem, offhand, to help much in keeping XML simple to understand and implement. Or am I too pessimistic?

-C. M. Sperberg-McQueen
 ACH / ACL / ALLC Text Encoding Initiative
 University of Illinois at Chicago
 tei@uic.edu

All opinions expressed in this note (except those I have quoted from other authors) are mine. They are not necessarily those of the Text Encoding Initiative, its executive committee or other participants, its sponsors, or its funders. Anyone who says otherwise is wrong.
Received on Monday, 9 September 1996 17:08:29 UTC