Bug: XML Schema Part 2: Datatypes Appendix E Regular Expression

I am working on an implementation of regular expressions based on the
specification in XML Schema Part 2, Appendix E.

I have found one construct which is ambiguous:

  [.]

There are two possible leftmost derivations for this construct:

  regExp => branch => piece => atom => charClass => charClassExpr =>
  '[' charGroup ']' => '[' posCharGroup ']' => '[' charRange ']' =>
  '[' XmlCharIncDash ']' => '[.]'

and:

  regExp => branch => piece => atom => charClass => charClassExpr =>
  '[' charGroup ']' => '[' posCharGroup ']' => '[' charClassEsc ']' =>
  '[' MultiCharEsc ']' => '[.]'
         
The first alternative yields the single character '.', while the second
yields the multi-character escape meaning 'any character'.  I expect that
the first alternative is the one the specification intends.
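
The ambiguity can be reproduced with a minimal Python sketch. It models
only a bracket expression holding exactly one character, '[' c ']', and
it assumes the current productions XmlCharIncDash ::= [^\#x5B#x5D] and
MultiCharEsc ::= '.' | ('\' [sSiIcCdDwW]).  The function name and the
labels it returns are hypothetical, chosen for illustration only.

```python
# Sketch only: a simplified fragment of the Appendix E grammar,
# restricted to a bracket expression with exactly one member, '[' c ']'.

def derivations(c):
    """Return labels of the posCharGroup alternatives that derive c."""
    found = []
    # charRange => XmlCharIncDash: any character except '\', '[' and ']'
    # (assumed reading of [^\#x5B#x5D]).
    if len(c) == 1 and c not in "\\[]":
        found.append("charRange/XmlCharIncDash")
    # charClassEsc => MultiCharEsc: of its two alternatives, only the
    # bare '.' is a single character, so only it can apply here.
    if c == ".":
        found.append("charClassEsc/MultiCharEsc")
    return found

print(derivations("."))  # two labels, so '[.]' is ambiguous
print(derivations("a"))  # one label, so '[a]' is not
```

Any character for which the function returns more than one label has
more than one leftmost derivation in this fragment; '.' is the only one.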

I suggest changing the specification as follows:

  [11]   charClass     ::=  charClassEsc | charClassExpr | '.'
  [37]   MultiCharEsc  ::=  '\' [sSiIcCdDwW]
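
The effect of these two productions can be checked with a minimal Python
sketch.  It is a simplification, not a full parser: it recognises only a
bare '.', a two-character multi-character escape, and a bracket
expression with a single XmlCharIncDash member (assumed to exclude '\',
'[' and ']').  The function name and labels are hypothetical.

```python
# Sketch only: the proposed rules, in simplified form.
#   [11] charClass    ::= charClassEsc | charClassExpr | '.'
#   [37] MultiCharEsc ::= '\' [sSiIcCdDwW]

def char_class_derivations(s):
    """Return labels of the charClass alternatives that derive s."""
    found = []
    # Proposed new alternative: a bare '.' outside brackets is the wildcard.
    if s == ".":
        found.append("charClass/'.'")
    # charClassEsc => MultiCharEsc now always begins with a backslash.
    if len(s) == 2 and s[0] == "\\" and s[1] in "sSiIcCdDwW":
        found.append("charClassEsc/MultiCharEsc")
    # charClassExpr with one XmlCharIncDash member: '[' c ']'.
    if len(s) == 3 and s[0] == "[" and s[2] == "]" and s[1] not in "\\[]":
        found.append("charClassExpr/XmlCharIncDash")
    return found

print(char_class_derivations("[.]"))  # one label: a literal '.'
print(char_class_derivations("."))    # one label: the wildcard
print(char_class_derivations("\\d"))  # one label: a multi-character escape
```

Under these rules each of the three inputs has exactly one derivation in
the fragment, and '.' keeps its 'any character' meaning outside brackets.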

-- 
Birdstep Technology      Sverre Hvammen Johansen
Office: +47 24 13 47 00  Direct: +47 24 13 47 76
Web: www.birdstep.com    Mobile: +47 45 03 23 74

Received on Tuesday, 11 December 2001 11:04:47 UTC