- From: Grosso, Paul <pgrosso@ptc.com>
- Date: Thu, 20 Oct 2011 09:28:24 -0400
- To: <public-xml-core-wg@w3.org>
-----Original Message-----
From: xml-editor-request@w3.org [mailto:xml-editor-request@w3.org] On Behalf Of Daniel van Vugt
Sent: Thursday, 2011 October 20 2:20
To: xml-editor@w3.org
Subject: Errata in section 2.4 of Extensible Markup Language (XML) 1.0 (Fifth Edition)

ERROR #1: Ambiguous grammar

These rules make the grammar ambiguous:

  [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
  [43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

CharData is allowed to match an empty string due to its use of "*". However, CharData is referenced as CharData?, meaning this potentially empty string is optional. Therefore, if content is blank, it is ambiguous as to whether CharData is matched as the empty string or if CharData is omitted completely.

Functionally this is low severity. However, grammar parsers such as my own will find both interpretations and treat it as an error because the grammar is ambiguous.

The fix is simple. Change:

  [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)

to:

  [14] CharData ::= [^<&]+ - ([^<&]* ']]>' [^<&]*)

ERROR #2: CharData supports, and doesn't support, character references

Section 2.4 seems to suggest that Character Data may contain character references such as &amp;. However, at the same time, the grammar rule [14] for CharData does not appear to be able to match ampersand character references at all:

  [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)

Regards,
Daniel van Vugt
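
The ambiguity described under ERROR #1 can be illustrated with a rough Python sketch. This is a simplification, not a real XML parser: it only considers content with no markup, so rule [43] reduces to a single "CharData?", and it models the "- (... ']]>' ...)" exclusion with a substring check. The names CHARDATA_STAR, CHARDATA_PLUS, chardata_matches and count_derivations are hypothetical and exist only for this example.

```python
import re

# Rule [14] as published ('*') and with the proposed fix ('+').
# Assumption: the "- ([^<&]* ']]>' [^<&]*)" exclusion is modelled
# separately as a substring check in chardata_matches().
CHARDATA_STAR = re.compile(r"[^<&]*")
CHARDATA_PLUS = re.compile(r"[^<&]+")


def chardata_matches(pattern, text):
    """Return True if `text` as a whole is derivable from CharData."""
    return pattern.fullmatch(text) is not None and "]]>" not in text


def count_derivations(pattern, text):
    """Count the ways `CharData?` can derive markup-free `text`.

    With no markup present, rule [43] collapses to one `CharData?`,
    so there are at most two derivations: CharData omitted, or
    CharData present and matching the whole text.
    """
    count = 0
    if text == "":
        count += 1  # derivation 1: the optional CharData is omitted
    if chardata_matches(pattern, text):
        count += 1  # derivation 2: CharData is present and matches
    return count


if __name__ == "__main__":
    for label, pattern in (("'*' rule", CHARDATA_STAR), ("'+' rule", CHARDATA_PLUS)):
        print(label, "empty content:", count_derivations(pattern, ""))
        print(label, "text 'abc':", count_derivations(pattern, "abc"))
```

Under these assumptions, empty content has two derivations with the published rule and exactly one with the proposed "+" version, while non-empty text has one derivation either way. The same helper also reflects the observation in ERROR #2: a string containing "&" is never matched by CharData in either form.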
Received on Thursday, 20 October 2011 13:32:41 UTC