
FW: Ambiguity in section 2.8 of XML 1.0 Fifth Edition

From: Grosso, Paul <pgrosso@ptc.com>
Date: Fri, 21 Oct 2011 11:02:36 -0400
Message-ID: <9B2DE9094C827E44988F5ADAA6A2C5DA03E1F2D1@HQ-MAIL9.ptcnet.ptc.com>
To: <public-xml-core-wg@w3.org>
Another comment on (parsing) ambiguity.

Did we ever say or imply that the productions in the spec
were non-ambiguous?  Is the right response to these issues
simply that we never said the productions were non-ambiguous,
and if a parser writer wants or needs to translate them into
equivalent non-ambiguous versions, that's fine, but there is
nothing wrong with the productions in the spec?

paul

-----Original Message-----
From: xml-editor-request@w3.org [mailto:xml-editor-request@w3.org] On
Behalf Of Daniel van Vugt
Sent: Friday, 2011 October 21 1:15
To: xml-editor@w3.org
Subject: Ambiguity in section 2.8 of XML 1.0 Fifth Edition

Unrelated to my previous email about an ambiguity, I have found another 
one in section 2.8 this time...

[28a] DeclSep ::= PEReference | S
[28b] intSubset ::= (markupdecl | DeclSep)*
[3] S ::=  (#x20 | #x9 | #xD | #xA)+

intSubset is ambiguous because it allows repetitions (*) of DeclSep, and
DeclSep can match the S rule, which also allows repetitions (+).

Therefore the offset and length of each S used to make up a DeclSep in
intSubset are ambiguous. If intSubset is given a string of multiple
whitespace characters, there are many different ways to parse it.
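
To make the size of the problem concrete, here is a minimal sketch that
counts the distinct ways a whitespace-only string can be read as a
sequence of DeclSep items under productions [28a], [28b] and [3] as
quoted above. It models only the S branch of DeclSep and ignores
markupdecl and PEReference. The function name count_s_splits is
hypothetical and is not part of any parser mentioned in this thread.

```python
from functools import lru_cache

# The four characters allowed by production [3]: #x20, #x9, #xD, #xA.
WS = frozenset("\x20\x09\x0D\x0A")


def count_s_splits(text: str) -> int:
    """Count the ways to read `text` as (S)*, where S ::= (ws-char)+.

    Assumption: the input is whitespace only, so every DeclSep in
    intSubset is matched through its S alternative.
    """
    if any(ch not in WS for ch in text):
        raise ValueError("this sketch only handles whitespace-only input")

    @lru_cache(maxsize=None)
    def ways(start: int) -> int:
        if start == len(text):
            return 1  # nothing left: one way (no further DeclSep)
        # The next S may consume any non-empty prefix of the remainder.
        return sum(ways(end) for end in range(start + 1, len(text) + 1))

    return ways(0)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_s_splits(" " * n))
```

For a run of n whitespace characters the count is 2^(n-1), because each
of the n-1 gaps between adjacent characters either is or is not a
boundary between two S matches.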

I suggest the following correction, which appears to eliminate the 
ambiguity:

[3a] S1 ::=  #x20 | #x9 | #xD | #xA
[3b] S ::=  S1+
[28a] DeclSep ::= PEReference | S1
[28b] intSubset ::= (markupdecl | DeclSep)*

I have confirmed this fix resolves the ambiguity using my own parser.
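
As an independent illustration (not the parser referred to above), the
following sketch counts the ways a whitespace-only string can be read as
a sequence of DeclSep items when DeclSep uses the single-character S1
from the proposed [3a]. It covers only the whitespace branch, so it says
nothing about ambiguity elsewhere in the grammar. The function name
count_s1_splits is hypothetical.

```python
# The four characters allowed by the proposed production [3a] S1.
WS = frozenset("\x20\x09\x0D\x0A")


def count_s1_splits(text: str) -> int:
    """Count the ways to read `text` as (S1)*, one whitespace char each.

    Returns 0 if `text` contains a character that S1 cannot match.
    """
    # ways[i] = number of ways to parse text[i:] as a sequence of S1.
    ways = [0] * (len(text) + 1)
    ways[len(text)] = 1  # empty remainder: one way (no further DeclSep)
    for start in range(len(text) - 1, -1, -1):
        # S1 consumes exactly one character, so there is a single
        # possible continuation from each position.
        if text[start] in WS:
            ways[start] = ways[start + 1]
    return ways[0]


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_s1_splits(" " * n))
```

Under this reading every whitespace run has exactly one split, since
each DeclSep is forced to cover exactly one character.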

Regards,

Daniel van Vugt
Received on Friday, 21 October 2011 15:03:25 GMT
