- From: Biron,Paul V <Paul.V.Biron@KP.ORG>
- Date: Tue, 5 Dec 2000 11:06:34 -0800
- To: "'James Clark'" <jjc@jclark.com>, www-xml-schema-comments@w3.org
> -----Original Message-----
> From: James Clark [SMTP:jjc@jclark.com]
> Sent: Tuesday, December 05, 2000 4:50 AM
> To: www-xml-schema-comments@w3.org
> Subject: Re: Regex comments
>
> > From: James Clark (jjc@jclark.com)
> > Date: Tue, Dec 05 2000
> >
> > which suggests that a character class subtraction looks like:
> >
> > [abc-[def]]
> >
> > If this is right, it's deeply confusing that the description of \w uses
> > an incompatible syntax: [...]-[...]. It is also a pretty bizarre
> > feature: is this really necessary? I couldn't find any mention of it in
> > the Regexp documentation I consulted.
>
> I found it in UTR#18, so I withdraw this comment. (The comment about the
> description of \w still stands.)
>
Good, glad you found it, 'cause I was going to refer you to Unicode
Technical Report #18, Unicode Regular Expression Guidelines, section 2.3 [1].
And yes, there is a typo in the description of \w and I will change that, so
instead of:
\w [#x0000-#x10FFFF]-[\p{P}\p{S}\p{C}]
it will read
\w [#x0000-#x10FFFF-[\p{P}\p{S}\p{C}]]
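For anyone who wants to see what the subtraction means in practice, here is a rough Python sketch. It rests on a few assumptions: Python's re module has no subtraction syntax, so a negative lookahead stands in for it; the class [a-f-[def]] is a made-up example rather than one from the draft; and is_word_char is a hypothetical helper that approximates the corrected \w above using Unicode general categories from unicodedata.

```python
import re
import unicodedata

# Stand-in for the subtraction [a-f-[def]]: the negative lookahead rejects
# the subtracted set before the base class is allowed to match.
# (Assumption: Python's re has no native subtraction syntax.)
SUBTRACTION_EXAMPLE = re.compile(r"(?![def])[a-f]")


def is_word_char(ch: str) -> bool:
    """Approximate \\w as corrected above: any character whose general
    category is not in the P (punctuation), S (symbol), or C (other) groups.
    Hypothetical helper, not part of any schema processor."""
    return unicodedata.category(ch)[0] not in {"P", "S", "C"}


if __name__ == "__main__":
    print(SUBTRACTION_EXAMPLE.findall("abcdefg"))
    print([c for c in "a1_.+" if is_word_char(c)])
```

Note that under this definition "_" is excluded, since it falls in category Pc, which differs from \w in Perl-style engines.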
pvb
References
[1] http://www.unicode.org/unicode/reports/tr18/#Subtraction
Received on Tuesday, 5 December 2000 14:22:23 UTC