Ambiguity in section 2.8 of XML 1.0 Fifth Edition

Separately from my previous email about an ambiguity, I have found 
another one, this time in section 2.8:

[28a] DeclSep ::= PEReference | S
[28b] intSubset ::= (markupdecl | DeclSep)*
[3] S ::= (#x20 | #x9 | #xD | #xA)+

intSubset is ambiguous because it allows repetition (*) of DeclSep, and 
DeclSep can match the S rule, which itself allows repetition (+).

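Therefore the offset and length of each S that makes up a DeclSep in 
intSubset are ambiguous. When intSubset is given a string of multiple 
whitespace characters, there are many different ways to parse it: a run 
of n whitespace characters can be divided into consecutive S matches in 
2^(n-1) ways.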
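
As a rough illustration of the problem (not the parser mentioned in this 
email), the sketch below counts how many ways a whitespace-only string can 
be split into a sequence of S matches under the current productions. The 
function name count_declsep_parses is hypothetical, and the sketch ignores 
markupdecl and PEReference entirely.

```python
from functools import lru_cache

# The four characters allowed by production [3] S.
WHITESPACE = {"\x20", "\x09", "\x0D", "\x0A"}


def count_declsep_parses(text: str) -> int:
    """Count the ways to split whitespace-only text into S+ tokens.

    Each token models one DeclSep that matched S, where S is one or more
    whitespace characters. Hypothetical helper for illustration only.
    """
    if any(ch not in WHITESPACE for ch in text):
        raise ValueError("this sketch only handles whitespace-only input")

    @lru_cache(maxsize=None)
    def ways(start: int) -> int:
        # Reaching the end of the input completes exactly one parse.
        if start == len(text):
            return 1
        # The next S may end at any later position in the string.
        return sum(ways(end) for end in range(start + 1, len(text) + 1))

    return ways(0)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_declsep_parses(" " * n))
```

The count doubles with every extra whitespace character, which is why even 
a short run of blank lines in an internal subset has many distinct parses.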

I suggest the following correction, which appears to eliminate the 
ambiguity:

[3a] S1 ::= #x20 | #x9 | #xD | #xA
[3b] S ::= S1+
[28a] DeclSep ::= PEReference | S1
[28b] intSubset ::= (markupdecl | DeclSep)*
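
A minimal sketch of why this helps, assuming we only look at a run of 
whitespace between declarations: if each DeclSep may consume at most one 
character (S1), only one split of the run remains. The function 
count_splits and its max_token_len parameter are hypothetical names used 
for illustration, and this does not prove the whole grammar unambiguous.

```python
from typing import Optional


def count_splits(text: str, max_token_len: Optional[int]) -> int:
    """Count the ways to split text into consecutive non-empty tokens.

    max_token_len=None models DeclSep ::= S (a token of any length);
    max_token_len=1 models DeclSep ::= S1 (exactly one character).
    Hypothetical helper; the character classes are not checked here.
    """
    n = len(text)
    # counts[i] is the number of ways to split the suffix text[i:].
    counts = [0] * (n + 1)
    counts[n] = 1  # the empty suffix has exactly one (empty) split
    for start in range(n - 1, -1, -1):
        longest = n - start
        if max_token_len is not None:
            longest = min(longest, max_token_len)
        counts[start] = sum(counts[start + k] for k in range(1, longest + 1))
    return counts[0]


if __name__ == "__main__":
    run = " \t\r\n"
    print("DeclSep ::= S  :", count_splits(run, None))
    print("DeclSep ::= S1 :", count_splits(run, 1))
```

With the single-character rule, the repetition lives only in intSubset's 
own (...)* loop, so the two nested repetitions no longer compete.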

Using my own parser, I have confirmed that this fix resolves the 
ambiguity.

Regards,

Daniel van Vugt

Received on Friday, 21 October 2011 06:16:44 UTC