- From: Martin Bryan <mtbryan@sgml.u-net.com>
- Date: Fri, 25 Oct 1996 17:12:50 +0100
- To: lee@sq.com, w3c-sgml-wg@w3.org
At 12:58 AM 25/10/96 EDT, lee@sq.com wrote:
>Case insensitivity is not well defined for Unicode/ISO 10646 as a whole
>and really only makes sense when you have a specific language -- but
>cross references from a French section to a Swedish section of a
>document might then have different rules for case sensitivity
>(whether accents are retained in upper case, for example).

This is one of the downsides of allowing ISO 10646 for attribute and element names that has to be faced: either you make naming conventions locale specific or you drop case matching. I feel that the advantages that XML could offer by providing for a full range of 10646 characters for naming, etc., more than outweigh the disadvantage of having to be case sensitive in element and attribute names, etc.

For the ERN extensions for SGML another pair of classes, NAMESTRT and NAMECHAR, was introduced for languages where there is no equivalence between uppercase and lowercase (e.g. CJK languages). Whilst this covers a large number of languages, it does not cover the Quebec/France variants Lee mentioned.

This example is one reason why the concept of document-specific case rules becomes important in multilingual document sets. Swedish and French, for example, should not be in the same "document" but should be separate linked segments of a document with their own character set case rules. The question is how you link them together. In SGML they cannot be subdocuments of a master document because subdocs must share an SGML declaration with the calling document. This is one of the "unnecessary" restrictions that SGML97 should address.

----
Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK
Phone/Fax: +44 1452 714029
WWW home page: http://www.u-net.com/~sgml/
Received on Friday, 25 October 1996 12:15:40 UTC