
Ambiguity in section 2.8 of XML 1.0 Fifth Edition

From: Daniel van Vugt <vanvugt@gmail.com>
Date: Fri, 21 Oct 2011 14:14:58 +0800
Message-ID: <4EA10DE2.4050903@gmail.com>
To: xml-editor@w3.org
Unrelated to my previous email about an ambiguity, I have found another
one, this time in section 2.8:

[28a] DeclSep   ::= PEReference | S
[28b] intSubset ::= (markupdecl | DeclSep)*
[3]   S         ::= (#x20 | #x9 | #xD | #xA)+

intSubset is ambiguous because it allows repetition (*) of DeclSep, and
DeclSep can match the S rule, which itself allows repetition (+).

Therefore the offset and length of each S used to make up DeclSep in
intSubset are ambiguous. If intSubset is given a string of multiple
whitespace characters, there are many different ways to parse it: a run
of n whitespace characters can be divided into consecutive S matches in
2^(n-1) ways.
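
The scale of the ambiguity can be checked with a short sketch. The
Python below is an illustration only, not taken from the XML
specification; the name count_s_segmentations is hypothetical. It models
just the DeclSep ::= S branch (markupdecl and PEReference are ignored)
and counts the ways a whitespace-only string can be divided into
consecutive S matches.

```python
from functools import lru_cache

# The four characters allowed by production [3].
WHITESPACE = frozenset("\x20\x09\x0D\x0A")


def count_s_segmentations(text: str) -> int:
    """Count the ways to split `text` into consecutive non-empty S matches."""
    if any(ch not in WHITESPACE for ch in text):
        raise ValueError("this sketch only handles whitespace-only input")

    @lru_cache(maxsize=None)
    def ways(start: int) -> int:
        if start == len(text):
            return 1  # nothing left: matched by zero further DeclSeps
        # The next DeclSep ::= S may consume any non-empty prefix of the rest.
        return sum(ways(end) for end in range(start + 1, len(text) + 1))

    return ways(0)


# Print the count for runs of 1 to 5 spaces.
for n in range(1, 6):
    print(n, count_s_segmentations(" " * n))
```

The count doubles with each additional whitespace character, which is
why even a short blank line in an internal subset has many derivations.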

I suggest the following correction, which appears to eliminate the 
ambiguity:

[3a]  S1        ::= #x20 | #x9 | #xD | #xA
[3b]  S         ::= S1+
[28a] DeclSep   ::= PEReference | S1
[28b] intSubset ::= (markupdecl | DeclSep)*

Using my own parser, I have confirmed that this fix resolves the ambiguity.
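
For readers without such a parser, a rough independent check is
possible. The Python below is a sketch under stated assumptions, not the
parser mentioned above: the function name count_declsep_parses and its
single_char flag are hypothetical, and only whitespace-only input is
modelled (markupdecl and PEReference are ignored). It counts derivations
of a string as DeclSep* under the original rule (DeclSep ::= S) and
under the proposed rule (DeclSep ::= S1).

```python
# The four characters allowed by production [3] / proposed [3a].
WHITESPACE = frozenset("\x20\x09\x0D\x0A")


def count_declsep_parses(text: str, single_char: bool) -> int:
    """Count derivations of whitespace-only `text` as DeclSep*.

    single_char=False models the original DeclSep ::= S (one or more chars).
    single_char=True models the proposed DeclSep ::= S1 (exactly one char).
    """
    if any(ch not in WHITESPACE for ch in text):
        raise ValueError("this sketch only handles whitespace-only input")

    n = len(text)
    ways = [0] * (n + 1)  # ways[i] = number of derivations of text[i:]
    ways[n] = 1           # the empty remainder has exactly one derivation
    for i in range(n - 1, -1, -1):
        longest = 1 if single_char else n - i  # max chars one DeclSep may take
        ways[i] = sum(ways[i + k] for k in range(1, longest + 1))
    return ways[0]


sample = " \t\r\n "  # five whitespace characters
print(count_declsep_parses(sample, single_char=False))  # original grammar
print(count_declsep_parses(sample, single_char=True))   # proposed grammar
```

Under the proposed rule every whitespace character is its own DeclSep,
so the count is always 1 for this restricted input.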

Regards,

Daniel van Vugt
Received on Friday, 21 October 2011 06:16:44 GMT
