
Surrogate blocks as block escapes in CR-xmlschema-2-20001024

From: Tony Graham <tgraham@mulberrytech.com>
Date: Fri, 24 Nov 2000 16:56:32 -0400 (EST)
Message-ID: <14878.54784.719000.991360@menteith.com>
To: www-xml-schema-comments@w3.org
The regular expression syntax includes category escapes of the form
'\p{X}', where 'X' is a one- or two-character character property
identifier, and block escapes of the form '\p{IsX}', where 'X' is a
Unicode character block name with spaces stripped out.

The table of character properties in the CR excludes the 'Cs' property
and notes that "surrogate" characters 'do not occur at the level of
the "character abstraction" that XML instance documents operate on.'

The CR refers to the Unicode 3.0 blocks but does not list them.  The
Unicode 3.0 blocks include three that cover the Surrogates area: "High
Surrogates", "High Private Use Surrogates" and "Low Surrogates".

Since Surrogates 'do not occur at the level of the "character
abstraction" that XML instance documents operate on', should the CR
note that the surrogate-related blocks should not be used in block
escapes in XML Schema regular expressions?
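
To illustrate the point, here is a minimal Python sketch. It assumes the
Unicode 3.0 code point ranges for the three surrogate blocks and the
Char production from XML 1.0; the helper names (is_xml_char,
matchable_chars) are hypothetical and are not taken from the CR. It
counts how many code points in each block could appear as a character
in an XML instance document.

```python
# Unicode 3.0 surrogate-related blocks, written with spaces stripped
# as they would appear in a block escape (inclusive code point ranges).
SURROGATE_BLOCKS = {
    "HighSurrogates": (0xD800, 0xDB7F),
    "HighPrivateUseSurrogates": (0xDB80, 0xDBFF),
    "LowSurrogates": (0xDC00, 0xDFFF),
}

# XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML_CHAR_RANGES = [
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_xml_char(cp):
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return any(lo <= cp <= hi for lo, hi in XML_CHAR_RANGES)


def matchable_chars(block_name):
    """List the code points in a block that are legal XML characters."""
    lo, hi = SURROGATE_BLOCKS[block_name]
    return [cp for cp in range(lo, hi + 1) if is_xml_char(cp)]


for name in SURROGATE_BLOCKS:
    # Each block escape \p{Is<name>} would have this many possible matches.
    print(name, len(matchable_chars(name)))
```

Under these assumptions every count is zero, so a block escape naming
any of the three blocks could never match a character in an instance
document.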

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
Received on Friday, 24 November 2000 16:56:27 GMT
