[Bug 11125] Regex grammar for 1.1 renders some 1.0 regexes invalid

http://www.w3.org/Bugs/Public/show_bug.cgi?id=11125

--- Comment #6 from C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> 2011-01-18 15:37:10 UTC ---
In its discussion of the ambiguity in the grammar for charGroupPart, comment 5
appears to be wrong. (I'd say "it *is* wrong", but I've made so many howlers in
working over this material that I'm trying to teach myself more caution in my
conclusions.)

There is an ambiguity.  In some cases, the ambiguity makes a difference to the
semantics, and in some cases it does not.  The claim of semantic ambiguity
relies on reading the prose following production 81 as applying only to
singleCharEsc when parsed as a singleChar and not when parsed as a
charClassEsc.  This much seems correct in the analysis of comment 5.

But the difference is not between \- and \n and the other singleCharEsc
strings; it's between situations in which a charGroupPart begins with a
singleCharEsc followed by a hyphen and situations in which it does not.  So:

  [\-\n] is ambiguous in syntax but unambiguous in semantics
  [\--z] is ambiguous both syntactically and semantically
  [\n-z] is ambiguous both syntactically and semantically
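
To make the three cases concrete, here is a minimal Python sketch that
enumerates the parses of a character group body and compares their meanings.
It is an illustration only, not the spec's grammar in full: it assumes that
charGroupPart offers the alternatives singleChar, charRange and charClassEsc,
it handles only single-character escapes, and it models the prose after
production 81 as "a singleChar followed by a hyphen and another character
must be read as a charRange", applied only when the escape is parsed as a
singleChar. The function names are hypothetical.

```python
import re

# One token per backslash escape, otherwise one token per character.
TOKEN = re.compile(r"\\.|[^\\]", re.S)
ESCAPES = {"\\n": "\n", "\\r": "\r", "\\t": "\t"}


def tokens(body):
    return TOKEN.findall(body)


def char_of(tok):
    # Character denoted by a token; \- and similar escapes denote
    # the character after the backslash.
    return ESCAPES.get(tok, tok[-1])


def parses(toks, esc_is_class_esc=True):
    """Yield every parse of toks as a tuple of charGroupPart readings.

    esc_is_class_esc=True models the grammar in which a single-character
    escape is also a charClassEsc (an assumption about the draft).
    """
    if not toks:
        yield ()
        return
    head = toks[0]
    starts_range = len(toks) >= 3 and toks[1] == "-"
    # Reading 1: singleChar, blocked by the (simplified) prose rule
    # when a hyphen and another character follow.
    if not starts_range:
        for rest in parses(toks[1:], esc_is_class_esc):
            yield (("singleChar", head),) + rest
    # Reading 2: charClassEsc, to which the prose rule is assumed
    # not to apply.
    if head.startswith("\\") and esc_is_class_esc:
        for rest in parses(toks[1:], esc_is_class_esc):
            yield (("charClassEsc", head),) + rest
    # Reading 3: charRange ::= singleChar '-' singleChar
    if starts_range:
        for rest in parses(toks[3:], esc_is_class_esc):
            yield (("charRange", head, toks[2]),) + rest


def meaning(parse):
    """Set of characters matched under one parse."""
    chars = set()
    for part in parse:
        if part[0] == "charRange":
            lo, hi = ord(char_of(part[1])), ord(char_of(part[2]))
            chars.update(chr(c) for c in range(lo, hi + 1))
        else:
            chars.add(char_of(part[1]))
    return frozenset(chars)


def report(body, esc_is_class_esc=True):
    """Return (number of parses, number of distinct meanings)."""
    found = list(parses(tokens(body), esc_is_class_esc))
    return len(found), len({meaning(p) for p in found})


if __name__ == "__main__":
    for body in (r"\-\n", r"\--z", r"\n-z"):
        print("[%s]" % body, report(body))
```

Under these assumptions, report gives four parses with one meaning for
[\-\n], and two parses with two meanings for each of [\--z] and [\n-z].
Passing esc_is_class_esc=False, which drops the charClassEsc reading of
single-character escapes, leaves exactly one parse in all three cases.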

At the moment, I tentatively lean toward moving the class of singleCharEsc out
of the class of charClassEsc; this would mean 

- change production 89 from 

  [89] charClassEsc ::= ( SingleCharEsc | MultiCharEsc | catEsc | complEsc )

to

  [89] charClassEsc ::= ( MultiCharEsc | catEsc | complEsc )

- change production 75 from 

  [75] charClass ::= charClassEsc | charClassExpr | WildcardEsc

to 

  [75] charClass ::= SingleCharEsc | charClassEsc | charClassExpr | WildcardEsc

- leave productions 81 and 82 alone.

The changes to 89 and 75 will entail accompanying changes to the
neighboring prose.
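
As a rough check on what the change to 89 and 75 does, the following Python
sketch counts the ways each nonterminal can reach SingleCharEsc through
single-alternative choices, before and after the change. Productions 75 and
89 are taken from this comment; the alternatives given for charGroupPart and
singleChar are assumptions about productions 80 and 82, and charRange and
charClassExpr are deliberately left unexpanded. The spelling SingleCharEsc
follows production 89 as quoted above.

```python
def count_paths(grammar, start, target):
    """Count chains of alternatives leading from start to target.

    The grammar is a dict mapping a nonterminal to its list of
    alternatives; it must be acyclic for this to terminate.
    """
    if start == target:
        return 1
    return sum(count_paths(grammar, alt, target)
               for alt in grammar.get(start, []))


# Grammar as it stands (80 and 82 are assumed, not quoted in this comment).
BEFORE = {
    "charClass": ["charClassEsc", "charClassExpr", "WildcardEsc"],
    "charClassEsc": ["SingleCharEsc", "MultiCharEsc", "catEsc", "complEsc"],
    "charGroupPart": ["singleChar", "charRange", "charClassEsc"],
    "singleChar": ["SingleCharEsc", "SingleCharNoEsc"],
}

# Grammar with the tentative change to productions 89 and 75.
AFTER = dict(
    BEFORE,
    charClass=["SingleCharEsc", "charClassEsc", "charClassExpr",
               "WildcardEsc"],
    charClassEsc=["MultiCharEsc", "catEsc", "complEsc"],
)

if __name__ == "__main__":
    for name, grammar in (("before", BEFORE), ("after", AFTER)):
        for start in ("charClass", "charGroupPart"):
            print(name, start,
                  count_paths(grammar, start, "SingleCharEsc"))
```

In this model, charGroupPart reaches SingleCharEsc by two routes before the
change and by one route after it, while charClass reaches it by exactly one
route in both versions, so single-character escapes stay legal outside
character groups.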

An alternative would be to rework 81 and 82 instead.


Received on Tuesday, 18 January 2011 15:37:13 UTC