Re: Regex comments

From: James Clark <jjc@jclark.com>
Date: Wed, 06 Dec 2000 08:48:24 +0700
Message-ID: <3A2D9AE8.6444005B@jclark.com>
To: www-xml-schema-comments@w3.org
"Biron,Paul V" wrote:
 
> (BTW, my reading of production [84] from XML 1.0 equates "name start
> character" with [\p{L}\p{Nl}:_], which is how \i is defined.  Could it be
> that that is not the correct translation of name start character and hence,
> why you didn't realize that there was such an escape?)

That's not the correct translation.  Name start characters in XML don't
match up nicely to any Unicode categories (for example, compatibility
characters and characters with a compatibility decomposition are
disallowed).  The end of Appendix B of the XML Rec has a section
describing the relationship.  You can probably do it with a subtraction
from \c, but it would be hairy. Something like:

[\c-[-\.\p{M}\p{Lm}\p{Nd}]-[&#x02BB;-&#x02C1;&#x0559;&#x06E5;&#x06E6;]]

Simpler to describe it as the characters allowed as the first character
of an XML 1.0 _Name_ (e.g. _Letter_ or '_' or ':').
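
To make the mismatch with plain Unicode categories concrete, here is a
rough Python sketch of the Appendix B rules. It is only an approximation:
the function name is made up for illustration, and Python's unicodedata
follows a much newer Unicode version than the one the XML 1.0 tables were
derived from, so it will not agree with the normative _Letter_ production
for every character.

```python
import unicodedata

# Categories Appendix B of XML 1.0 lists for name start characters.
NAME_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lo", "Nl"}

# Characters Appendix B treats as name start characters even though
# they would otherwise count only as name characters:
# #x02BB-#x02C1, #x0559, #x06E5, #x06E6.
EXTRA_NAME_START = set(range(0x02BB, 0x02C2)) | {0x0559, 0x06E5, 0x06E6}


def is_name_start_approx(ch):
    """Approximate test for an XML 1.0 name start character.

    Hypothetical helper, not the normative production: it applies the
    Appendix B rules to whatever Unicode data this Python ships with.
    Expects a single character.
    """
    if ch == "_" or ch == ":":
        return True
    cp = ord(ch)
    if cp in EXTRA_NAME_START:
        return True
    # Compatibility area: code greater than #xF900 and less than #xFFFE.
    if 0xF900 < cp < 0xFFFE:
        return False
    # Font or compatibility decompositions are tagged with "<...>".
    if unicodedata.decomposition(ch).startswith("<"):
        return False
    return unicodedata.category(ch) in NAME_START_CATEGORIES
```

With this sketch, U+00AA (a letter by category, but with a compatibility
decomposition) is rejected, while U+02BB (category Lm) is accepted, which
is the kind of case a definition by category alone gets wrong.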

James
Received on Wednesday, 6 December 2000 05:48:53 GMT