XML Schema Part 2: Datatypes

In Appendix F, Regular Expressions, of XML Schema Part 2: Datatypes, there is a
problem with the definition of the \w multi-character escape (production [39]
MultiCharEsc). The subtracted character class given is [\p{P}\p{S}\p{C}], but
the text describes this as "(all characters except the set of "punctuation",
"separator" and "control" characters)."

In the subtracted character class, the sequence \p{S} would eliminate the
"symbol" characters, not the "separator" characters.  Should this have been
\p{Z}?  Or is the explanation simply a misinterpretation of \p{S}?
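
To make the difference concrete, here is a rough sketch using Python's
unicodedata module to compare the class as written with the class as described.
The names AS_WRITTEN, AS_DESCRIBED, major_class and excluded_by are made up for
this illustration and do not come from the spec; the first letter of a
character's general category is taken as its major class (P, S, Z, C, ...).

```python
import unicodedata

# Hypothetical names for illustration only; not taken from the spec.
AS_WRITTEN = {"P", "S", "C"}    # [\p{P}\p{S}\p{C}] as given in production [39]
AS_DESCRIBED = {"P", "Z", "C"}  # "punctuation", "separator", "control" per the prose


def major_class(ch):
    """Return the first letter of the Unicode general category of ch."""
    return unicodedata.category(ch)[0]


def excluded_by(ch, classes):
    """True if ch falls in one of the subtracted major classes."""
    return major_class(ch) in classes


for ch in ["a", ",", "+", " "]:
    print(repr(ch), unicodedata.category(ch),
          "written:", excluded_by(ch, AS_WRITTEN),
          "described:", excluded_by(ch, AS_DESCRIBED))
```

Under these assumptions, "+" (a symbol) is subtracted only by the class as
written, while " " (a separator) is subtracted only by the class as described,
so the two readings of \w disagree on both characters.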

Also, the use of "control" for \p{C} is misleading: \p{C} selects all of the
"Other" characters, of which the control characters are only a subset. It is
\p{Cc} that selects just the "control" characters.
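
As a small check of this point, the sketch below (again using Python's
unicodedata module, with the hypothetical helper names is_other and is_control)
shows characters that fall under \p{C} without being control characters.

```python
import unicodedata


def is_other(ch):
    """True if ch is in any "Other" category (Cc, Cf, Co, ...), i.e. \\p{C}."""
    return unicodedata.category(ch).startswith("C")


def is_control(ch):
    """True only if ch is in the "control" category, i.e. \\p{Cc}."""
    return unicodedata.category(ch) == "Cc"


# TAB is a control character; SOFT HYPHEN (format) and U+E000 (private use)
# are "Other" characters that are not controls.
for ch in ["\t", "\u00ad", "\ue000"]:
    print("U+%04X" % ord(ch), unicodedata.category(ch),
          "other:", is_other(ch), "control:", is_control(ch))
```

All three sample characters match \p{C}, but only the tab matches \p{Cc}.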

--Bruce D. Sinclair

Received on Thursday, 22 March 2001 19:07:08 UTC