RE: Regex comments

> -----Original Message-----
> From:	James Clark [SMTP:jjc@jclark.com]
> Sent:	Tuesday, December 05, 2000 4:50 AM
> To:	www-xml-schema-comments@w3.org
> Subject:	Re: Regex comments
> 
> > From: James Clark (jjc@jclark.com)
> > Date: Tue, Dec 05 2000
> > 
> > which suggests that a character class subtraction looks like:
> > 
> >  [abc-[def]]
> > 
> > If this is right, it's deeply confusing that the description of \w uses
> > an incompatible syntax: [...]-[...].  It is also a pretty bizarre
> > feature: is this really necessary? I couldn't find any mention of it in
> > the Regexp documentation I consulted.
> 
> I found it in UTR#18, so I withdraw this comment. (The comment about the
> description of \w still stands.)
> 
Good, glad you found it, 'cause I was going to refer you to Unicode
Technical Report #18, Unicode Regular Expression Guidelines, section 2.3 [1].

And yes, there is a typo in the description of \w and I will change that, so
instead of:

	\w		[#x0000-#x10FFFF]-[\p{P}\p{S}\p{C}]

it will read

	\w		[#x0000-#x10FFFF-[\p{P}\p{S}\p{C}]]
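
To make the corrected definition concrete, here is a rough Python sketch of
the same subtraction. Python's re module has no class-subtraction syntax, so
this checks the Unicode general category directly instead of using a regex.
The function name is_schema_word_char is made up for illustration and is not
part of any spec or library.

```python
import unicodedata

# Major general categories subtracted from the full range in the
# corrected definition of \w: P (punctuation), S (symbol), C (other).
EXCLUDED_MAJOR_CATEGORIES = {"P", "S", "C"}


def is_schema_word_char(ch: str) -> bool:
    # Hypothetical helper: True if ch survives the subtraction
    # "all characters minus categories P, S and C".
    if len(ch) != 1:
        raise ValueError("expected a single character")
    # unicodedata.category returns a two-letter code such as "Ll" or "Po";
    # the first letter is the major category.
    return unicodedata.category(ch)[0] not in EXCLUDED_MAJOR_CATEGORIES
```

Note that under this definition "_" is excluded (its category is Pc,
connector punctuation), unlike \w in Perl.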

pvb

References
[1] http://www.unicode.org/unicode/reports/tr18/#Subtraction

Received on Tuesday, 5 December 2000 14:22:23 UTC