
Comments on regular expressions

From: TAMURA Kent <kent@trl.ibm.co.jp>
Date: Mon, 10 Apr 2000 16:06:46 +0900
Message-Id: <200004100706.QAA31636@ns.trl.ibm.com>
To: www-xml-schema-comments@w3.org
CC: kent@trl.ibm.co.jp

I have some comments on the regular expressions section of the
last call draft [1].

[1] http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#regexs

Re: The entire section
It is hard to determine the concrete syntax of regular expressions
from the draft.  I would like readable grammar rules, such as BNF.

Re: Character class subtraction
> [Definition:] A character class subtraction is a character
> class expression subtracted from a positive character group or
> negative character group, using the - character.

This definition does not explain how to use '-'.  The next
paragraph says "G-C is a valid character class subtraction", but
there is no restriction on other uses of '-', such as "GC-" or "-GC" :-)

Re: '-' in character range
A '-' in a character class has several meanings, so its
interpretation can be ambiguous.  For example:

	[+--/]

We can interpret this character class as:
	a) '+' to '-', and '/'
	b) '+', and '-' to '/'
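
To show that the two readings really denote different character sets,
here is a small Python sketch.  It assumes the class body is the four
characters '+', '-', '-', '/' (the only sequence consistent with both
readings above), and the helper name char_range is made up for this
illustration.  It does not use any regular expression engine; it only
expands each reading by code point.

```python
def char_range(lo: str, hi: str) -> set:
    """Return every character from lo to hi inclusive, by code point.

    Hypothetical helper for this illustration only.
    """
    return {chr(cp) for cp in range(ord(lo), ord(hi) + 1)}


# Reading (a): the range '+'..'-' followed by the single character '/'.
reading_a = char_range("+", "-") | {"/"}

# Reading (b): the single character '+' followed by the range '-'..'/'.
reading_b = {"+"} | char_range("-", "/")

print(sorted(reading_a))              # ['+', ',', '-', '/']
print(sorted(reading_b))              # ['+', '-', '.', '/']
print(sorted(reading_a ^ reading_b))  # [',', '.']
```

Reading (a) accepts ',' but not '.', while reading (b) accepts '.' but
not ',', so an implementation has to be told which one is intended.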

Re: Definition of multi-character escape
"\w" is defined as "[&#x0000;-&#xFFFF;]-[\p{P}\p{S}\p{C}]", but
both &#x0000; and &#xFFFF; are invalid character references
in XML.
Also, I don't know whether the characters in &#x10000;-&#x10FFFF;
should be in "\w".
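
The claim about the two end points can be checked against the Char
production of XML 1.0 (#x9 | #xA | #xD | [#x20-#xD7FF] |
[#xE000-#xFFFD] | [#x10000-#x10FFFF]).  The following is a minimal
Python sketch of that production; the function name is_xml_char is an
assumption made for this illustration, not something taken from the
draft.

```python
def is_xml_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# The end points used in the "\w" definition, plus the supplementary range.
for cp in (0x0000, 0xFFFF, 0x10000, 0x10FFFF):
    status = "valid" if is_xml_char(cp) else "invalid"
    print(f"U+{cp:04X}: {status}")
```

Under this check U+0000 and U+FFFF are rejected, while U+10000 and
U+10FFFF are legal XML characters that lie outside the range the
definition covers.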

TAMURA Kent @ Tokyo Research Laboratory, IBM
Received on Monday, 10 April 2000 03:07:26 UTC
