In Appendix F, "Regular Expressions", of XML Schema Part 2: Datatypes, there is a problem with the definition of the \w character sequence ([39] MultiCharEsc). The subtracted character class given is [\p{P}\p{S}\p{C}], but the text describes this as "(all characters except the set of "punctuation", "separator" and "control" characters)."

In the subtracted character class, \p{S} eliminates the "symbol" characters, not the "separator" characters. Should this have been \p{Z}? Or is the explanation simply a misinterpretation of \p{S}?

Also, the use of "control" for \p{C} is misleading, since the control characters are only a subset of the "Other" characters; it is \p{Cc} that selects the "control" characters.

--Bruce D. Sinclair

Received on Thursday, 22 March 2001 19:07:08 GMT
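
The difference between the two readings can be checked against Unicode general categories. The following is only a rough sketch in Python, not anything taken from the spec: the helper names in_class and is_word_char are made up for illustration, and the categories come from whatever Unicode version is bundled with the Python interpreter, which may differ from the one the Recommendation refers to.

```python
import unicodedata

def in_class(ch: str, prefix: str) -> bool:
    """True if ch's general category starts with prefix, e.g. 'S' or 'Cc'.

    Hypothetical helper: approximates \\p{S}, \\p{Cc}, etc. by prefix match.
    """
    return unicodedata.category(ch).startswith(prefix)

def is_word_char(ch: str, subtracted=("P", "S", "C")) -> bool:
    """Approximate \\w as 'any character not in the subtracted classes'.

    The default mirrors the class written in the grammar: [\\p{P}\\p{S}\\p{C}].
    """
    return not any(in_class(ch, p) for p in subtracted)

# The class as written in the grammar versus the class the prose seems to
# describe ("punctuation", "separator", "control").
AS_WRITTEN = ("P", "S", "C")
AS_DESCRIBED = ("P", "Z", "Cc")

samples = {
    "letter a": "a",          # Ll
    "plus sign": "+",         # Sm, a symbol
    "space": " ",             # Zs, a separator
    "line feed": "\n",        # Cc, a control character
    "soft hyphen": "\u00ad",  # Cf, "Other" but not a control character
}

for name, ch in samples.items():
    print(name, unicodedata.category(ch),
          is_word_char(ch, AS_WRITTEN), is_word_char(ch, AS_DESCRIBED))
```

Under these assumptions the two readings disagree on exactly the characters in question: the class as written treats a space as a word character and rejects "+", while the class as described does the opposite, and it also lets format characters such as the soft hyphen through because \p{Cc} is narrower than \p{C}.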