- From: Tony Graham <tgraham@mulberrytech.com>
- Date: Tue, 5 Dec 2000 17:49:34 -0400 (EST)
- To: <www-xml-schema-comments@w3.org>
At 5 Dec 2000 15:46 -0500, Matt Timmermans wrote:
> > -----Original Message-----
> > From: www-xml-schema-comments-request@w3.org
> >
> > So, I think you are correct, I will change those code point
> > references to
> > XML character references, hence
> >
> > \s [ \t\n\r]
> > \w [�--[\p{P}\p{S}\p{C}]]
> >
> > (note, the typo correct in \w's expansion, as noted in my
> > answer to James'
> > message to this list this morning [1]).
>
> I believe the problem was that those character references aren't XML chars,
> i.e.:
>
> \w [	--[\p{P}\p{S}\p{C}]]

No, Paul is correct: my problem was that the equivalent character
classes did not use correct regular expression syntax.

The equivalent character class could also be:

\w [ --[\p{P}\p{S}\p{C}]]

The interesting question about the equivalent character class is
whether or not it excludes code points from the Surrogate block.
Since the "Cs" value of the General Category field of the Unicode
Character Database is not listed in the table of character classes in
the CR, does \p{C} really include the Surrogate code points?

The answer is probably that it doesn't have to, since surrogates 'do
not occur at the level of "character abstraction" that XML instance
documents operate on.'

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
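
As a rough illustration of the surrogate question raised above, the sketch below checks two things for the ends of the Surrogate block: the General Category recorded in the Unicode Character Database, and membership in the XML 1.0 Char production. It is not taken from the CR. The helper names in_general_category_c and is_xml_char are made up for this example, and the results reflect whichever Unicode data ships with the local Python build.

```python
import unicodedata


def in_general_category_c(cp: int) -> bool:
    """True if the code point's General Category starts with 'C'
    (Cc, Cf, Cs, Co or Cn), i.e. the broad 'Other' group."""
    return unicodedata.category(chr(cp)).startswith("C")


def is_xml_char(cp: int) -> bool:
    """Membership test for the XML 1.0 Char production.
    The surrogate range D800-DFFF is not part of it."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# First and last code points of the Surrogate block.
for cp in (0xD800, 0xDFFF):
    print(
        f"U+{cp:04X}",
        unicodedata.category(chr(cp)),
        "XML Char" if is_xml_char(cp) else "not an XML Char",
    )
```

If the surrogates are in the "C" group of the database but are never XML characters, then whether \p{C} formally lists "Cs" makes no observable difference to what \w matches in an XML instance document, which is consistent with the answer suggested above.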
Received on Tuesday, 5 December 2000 17:54:22 UTC