[Bug 11125] Regex grammar for 1.1 renders some 1.0 regexes invalid

From: <bugzilla@jessica.w3.org>
Date: Thu, 04 Nov 2010 16:46:27 +0000
To: www-xml-schema-comments@w3.org
Message-Id: <E1PE2xH-0001XV-Dg@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11125

--- Comment #4 from David Ezell <David_E3@VERIFONE.com> 2010-11-04 16:46:26 UTC ---
In Lyon, we examined the 1.0 and 1.1 specs and mulled over the behavior of four
specific examples.  Expressions that are ok in 1.0 but x (an error) in 1.1 are
backward incompatible:
expression         1.0  1.1
[-+]               ok   ok
[+-]               ok   x
[a-z+-]            ok   x
[a-z-+]            x    ok
[--z]              ok   x
[a--k--z]          ok   x

The problem is with the paragraph following production 81:

<quote>
If a charGroupPart starts with a singleChar and this is immediately followed by
a hyphen, and if the hyphen is part of the character group 

<problem>(that is, it is not being treated as a subtraction operator because it
is followed by '['),
</problem>

then the hyphen must be followed by another singleChar, and the sequence
(singleChar, hyphen, singleChar) is treated as a charRange. It is an error if
either of the two singleChars in a charRange is a SingleCharNoEsc comprising an
unescaped hyphen.

</quote>
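
To make the effect of the quoted paragraph concrete, here is a minimal sketch
in Python that applies only that rule, left to right, to a simple character
group. It is a simplified reading, not the full 1.1 grammar: the function name
is hypothetical, and it assumes no escapes, no negation, no subtraction, and
no range-order check. Under those assumptions it reproduces the 1.1 column
for the examples in this comment.

```python
def check_char_group_1_1(group: str) -> bool:
    """Return True if `group` (e.g. "[a-z-+]") passes the quoted rule.

    Simplified sketch: only unescaped single characters are handled;
    escapes, negation, subtraction and range ordering are ignored.
    """
    body = group[1:-1]  # strip the enclosing '[' and ']'
    i = 0
    while i < len(body):
        # A singleChar immediately followed by a hyphen must begin a charRange.
        if i + 1 < len(body) and body[i + 1] == "-":
            if i + 2 >= len(body):
                return False  # hyphen is not followed by another singleChar
            start, end = body[i], body[i + 2]
            if start == "-" or end == "-":
                return False  # unescaped hyphen used as a range endpoint
            i += 3  # consume (singleChar, hyphen, singleChar)
        else:
            i += 1  # plain singleChar, possibly a literal hyphen
    return True


EXAMPLES = ["[-+]", "[+-]", "[a-z+-]", "[a-z-+]", "[--z]", "[a--k--z]"]
for g in EXAMPLES:
    print(g, "ok" if check_char_group_1_1(g) else "x")
```

Only [-+] and [a-z-+] pass: in both, the hyphen is itself the start of a
charGroupPart and is not followed by a second hyphen, so it is read as a
literal character.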

MK has suggested we need to accommodate the case where the '-' is the final
character of a charGroup.

MSM observes that we might implement that by saying "the final character of a
charGroup is followed either by ']' or by '-['", but that more work is needed.
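
As a rough illustration of that direction, the sketch below applies the rule
from the quoted paragraph but treats a hyphen as a literal when it is the
final character of the group. This is an assumption about how the relaxation
might look, not agreed spec text; the function name is hypothetical, only the
']' case is handled (not '-['), and escapes, negation and range ordering are
ignored.

```python
def check_allowing_final_hyphen(group: str) -> bool:
    """Quoted 1.1 rule, relaxed so a group-final hyphen is a literal.

    Illustrative sketch only: unescaped single characters, no subtraction.
    """
    body = group[1:-1]  # strip the enclosing '[' and ']'
    i = 0
    while i < len(body):
        if i + 1 < len(body) and body[i + 1] == "-":
            if i + 2 == len(body):
                # The hyphen is the final character of the group:
                # accept the singleChar and the hyphen as two literals.
                i += 2
                continue
            start, end = body[i], body[i + 2]
            if start == "-" or end == "-":
                return False  # unescaped hyphen used as a range endpoint
            i += 3  # consume (singleChar, hyphen, singleChar)
        else:
            i += 1  # plain singleChar
    return True


for g in ["[+-]", "[a-z+-]", "[--z]", "[a--k--z]"]:
    print(g, "ok" if check_allowing_final_hyphen(g) else "x")
```

Under this reading [+-] and [a-z+-] become acceptable again, while [--z] and
[a--k--z] are still rejected, which is consistent with the remark that more
work is needed.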

Received on Thursday, 4 November 2010 16:46:29 GMT