
XML Schema Part 2: Datatypes

From: Bruce D. Sinclair <b.sinclair@liant.com>
Date: Thu, 22 Mar 2001 18:07:15 -0600
To: <www-xml-schema-comments@w3.org>
Message-Id: <01Mar22.180738cst.115202@jumpgate.liant.com>
In Appendix F, Regular Expressions, of XML Schema Part 2: Datatypes, there is a
problem with the definition of the \w multi-character escape ([39] MultiCharEsc).
The subtracted character class given is [\p{P}\p{S}\p{C}], but the text
describes this as "(all characters except the set of "punctuation", "separator"
and "control" characters)."

In the subtracted character class, the sequence \p{S} would eliminate the
"symbol" characters, not the "separator" characters.  Should this have been
\p{Z}?  Or is the explanation simply a misinterpretation of \p{S}?

Also, the use of "control" for \p{C} is misleading, since the control characters
are only a subset of the "Other" characters: it is \p{Cc}, not \p{C}, that
selects the "control" characters.
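
To make the difference concrete, here is a minimal Python sketch (not taken
from the specification) that classifies a few sample characters under both
readings of the subtracted class. It relies on the standard unicodedata module;
the helper name is_word_char and the strings "PSC" and "PZC" are assumptions
made only for this illustration, standing for [\p{P}\p{S}\p{C}] and
[\p{P}\p{Z}\p{C}] respectively.

```python
import unicodedata

def is_word_char(ch, subtracted="PSC"):
    """Hypothetical helper: True if ch survives the subtraction.

    `subtracted` lists the major general-category letters removed from
    the full character range, e.g. "PSC" for [\\p{P}\\p{S}\\p{C}].
    """
    major = unicodedata.category(ch)[0]  # first letter: P, S, Z, C, L, ...
    return major not in subtracted

# Sample characters chosen to separate the two readings.
samples = {
    "plus sign": "+",         # Sm: a symbol, removed by \p{S}
    "space": " ",             # Zs: a separator, removed by \p{Z}
    "bell": "\x07",           # Cc: a control character
    "soft hyphen": "\u00ad",  # Cf: in \p{C} but not in \p{Cc}
    "letter a": "a",          # Ll: kept under either reading
}

for name, ch in samples.items():
    print(
        f"{name:12} {unicodedata.category(ch)} "
        f"PSC={is_word_char(ch, 'PSC')} PZC={is_word_char(ch, 'PZC')}"
    )
```

Under the class as written, the space is kept and the plus sign is removed;
under the reading implied by the prose, the reverse holds. The soft hyphen is
removed either way although it is a format character rather than a control
character, which is the second point above.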

--Bruce D. Sinclair
Received on Thursday, 22 March 2001 19:07:08 GMT
