
RE: \i and \I in CR-xmlschema-2-20001024

From: Tony Graham <tgraham@mulberrytech.com>
Date: Tue, 5 Dec 2000 16:59:05 -0400 (EST)
Message-ID: <14893.22297.995000.332668@menteith.com>
To: www-xml-schema-comments@w3.org
At 5 Dec 2000 11:53 -0800, Biron,Paul V wrote:
 > \i is an XML name start (initial) character.  For further info, see my
 > response to James' message on this list of this morning [1].
...
 > [1]
 > http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000OctDec/0383.html

Equating the [\p{L}\p{Nl}:_] expansion of \i to XML name start
characters is bogus.

1. The definition of \i doesn't mention a version of the Unicode
   Standard, but XML 1.0 names are tied to Unicode 2.0.  Unicode 3.0
   added many more characters that match \p{L} or \p{Nl} but that are
   not allowed in XML 1.0 names.

2. \p{L} encompasses \p{Lm}, for Modifier Letters.  Lm characters
   (i.e., characters with the value 'Lm' in the General Category field
   in the Unicode Character Database) are generally allowed as XML 1.0
   name characters but not as name start characters.

   Some characters with 'Lm' in the General Category field are allowed
   as XML 1.0 name start characters because they are listed as
   alphabetic in proplist.txt.

3. XML 1.0 excludes characters in the compatibility area (from #xF900
   to #xFFFE) from XML names.  Any regular expression matching name
   start characters would have to exclude that code point range.

4. Characters with a font or compatibility decomposition are not
   allowed in XML 1.0 names, and the regular expression syntax does
   not cover matching on decompositions, so the current \i will match
   many characters, e.g., #x0132 and #x0133, that are not allowed in
   XML 1.0 names.
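
The four points above can be illustrated with a rough Python sketch.
It is not taken from either specification: the function names
matches_i_expansion and xml10_objections are hypothetical, and the
General Category values come from whatever Unicode version the local
Python build ships (unicodedata.unidata_version), not from Unicode
2.0.  The sample code points are assumptions chosen for illustration;
#x1200 is an Ethiopic letter, a script that, as far as I know, was
added in Unicode 3.0.

```python
import unicodedata


def matches_i_expansion(ch):
    # Approximates [\p{L}\p{Nl}:_] using the General Category reported
    # by this Python build's Unicode database.
    cat = unicodedata.category(ch)
    return cat.startswith("L") or cat == "Nl" or ch in ":_"


def xml10_objections(ch):
    # Collects the objections from points 2-4 that can be detected from
    # the character database.  Point 1 (characters newer than Unicode
    # 2.0) cannot be detected from the General Category alone.
    reasons = []
    if unicodedata.category(ch) == "Lm":
        reasons.append("modifier letter (Lm)")
    if 0xF900 <= ord(ch) <= 0xFFFE:
        reasons.append("compatibility area")
    if unicodedata.decomposition(ch).startswith("<"):
        reasons.append("font or compatibility decomposition")
    return reasons


# Each sample matches the \i expansion; the objections show why that
# match alone says little about XML 1.0 name start status.
for ch in ["\u0132", "\u3005", "\uF901", "\u1200"]:
    print("#x%04X" % ord(ch), matches_i_expansion(ch), xml10_objections(ch))
```

An empty list for #x1200 is the interesting case: nothing in the
General Category or decomposition fields reveals that the character
postdates Unicode 2.0, which is why the Unicode version has to be
stated explicitly.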

It would be more honest to describe the expansion of \i as "XML 1.0
Letter characters or _ or :" than to give an inaccurate regular
expression.

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
Received on Tuesday, 5 December 2000 17:03:05 GMT
