- From: Tony Graham <tgraham@mulberrytech.com>
- Date: Tue, 5 Dec 2000 16:59:05 -0400 (EST)
- To: www-xml-schema-comments@w3.org
At 5 Dec 2000 11:53 -0800, Biron,Paul V wrote:
 > \i is an XML name start (initial) character. For further info, see my
 > response to James' message on this list of this morning [1].
...
 > [1]
 > http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000OctDec/0383.html

Equating the [\p{L}\p{Nl}:_] expansion of \i to XML name start
characters is bogus.

1. The definition of \i doesn't mention a version of the Unicode
   Standard, but XML 1.0 names are tied to Unicode 2.0. Unicode 3.0
   added many more characters that match \p{L} or \p{Nl} but that are
   not allowed in XML 1.0 names.

2. \p{L} encompasses \p{Lm}, for Modifier Letters. Lm characters
   (i.e., characters with the value 'Lm' in the General Category field
   in the Unicode Character Database) are generally allowed as XML 1.0
   name characters but not as name start characters. Some characters
   with 'Lm' in the General Category field are allowed in XML 1.0
   names because they are listed as alphabetic in proplist.txt.

3. XML 1.0 excludes characters in the compatibility area (from #xF900
   to #xFFFE) from XML names. Any regular expression matching name
   start characters would have to exclude that code point range.

4. Characters with a font or compatibility decomposition are not
   allowed in XML 1.0 names, and the regular expression syntax does
   not cover matching on decompositions, so the current \i will match
   many characters, e.g., #x0132 and #x0133, that are not allowed in
   XML 1.0 names.

It would be more honest to describe the expansion of \i as "XML 1.0
Letter characters or _ or :" than to give an inaccurate regular
expression.

Regards,

Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
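
The mismatch described in point 4 can be illustrated with a small Python sketch. It rests on several assumptions that are not part of the message: the helper names are hypothetical, the range table is only the U+0100..U+017E excerpt of the XML 1.0 BaseChar production rather than the full Letter production, and Python's unicodedata module reflects a far later Unicode version than 2.0 (which is itself the concern raised in point 1). Treat it as a rough check, not a validator.

```python
import unicodedata

# Assumption: excerpt of the XML 1.0 BaseChar production, limited to
# U+0100..U+017E. The full Letter production has many more ranges.
XML10_BASECHAR_EXCERPT = [
    (0x0100, 0x0131),
    (0x0134, 0x013E),
    (0x0141, 0x0148),
    (0x014A, 0x017E),
]


def matches_naive_i(ch):
    # Approximates the [\p{L}\p{Nl}:_] expansion using the Unicode data
    # bundled with the running Python, not Unicode 2.0.
    category = unicodedata.category(ch)
    return ch in ":_" or category.startswith("L") or category == "Nl"


def in_xml10_excerpt(ch):
    # True if the code point falls in one of the excerpted BaseChar ranges.
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML10_BASECHAR_EXCERPT)


def has_compat_decomposition(ch):
    # Tagged decompositions such as "<compat> 0049 004A" start with "<".
    return unicodedata.decomposition(ch).startswith("<")


# Code points in the excerpted block that the naive expansion accepts
# but the XML 1.0 production leaves out.
mismatches = [
    cp
    for cp in range(0x0100, 0x017F)
    if matches_naive_i(chr(cp)) and not in_xml10_excerpt(chr(cp))
]

for cp in mismatches:
    print(f"#x{cp:04X}", unicodedata.name(chr(cp)),
          "| decomposition:", unicodedata.decomposition(chr(cp)))
```

Within this excerpt, the code points that slip through include #x0132 and #x0133, and each one carries a compatibility decomposition, which is the property a \p{...} category test has no way to express.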
Received on Tuesday, 5 December 2000 17:03:05 UTC