Ambiguities in section 4.3.2

With respect to section 4.3.2 of the XML 1.0 Specification (Second Edition)
and, by implication, XML 1.1 CR, there appear to be several ambiguities
engendered by the following statement:

  An internal general parsed entity is well-formed if its replacement
  text matches the production labeled content.

... when considered in the context of CDATA Sections and "]]>". For example,
this would imply that the following declaration in an internal DTD subset
would result in an internal general parsed entity that is not well-formed:

<!ENTITY cdend "]]>">

... because the replacement text does not match the [43] content production.
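
To spell out the reasoning: "]]>" contains neither '<' nor '&', so it cannot
match element, Reference, CDSect, PI, or Comment, each of which begins with
one of those characters. The only way it could match [43] content is through
[14] CharData, which excludes "]]>" explicitly. A minimal Python sketch of
that check follows; the function name is hypothetical, and it covers only the
CharData production, not the whole of content.

```python
import re

# [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
# i.e. any run of characters without '<' or '&' that does not contain "]]>".
_NO_MARKUP = re.compile(r"[^<&]*")


def matches_chardata(text: str) -> bool:
    """Return True if `text` matches production [14] CharData."""
    return _NO_MARKUP.fullmatch(text) is not None and "]]>" not in text


if __name__ == "__main__":
    for candidate in ("]]", "]]>", "plain text"):
        print(repr(candidate), matches_chardata(candidate))
```

Under this reading, the replacement text of cdend fails CharData and therefore
fails content.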

If so ...

1) This contradicts statements about Literals in section 2.3, namely:

  Literal data is any quoted string not containing the quotation mark
  used as a delimiter for that string. Literals are used for
  specifying the content of internal entities (EntityValue),

... and production [9] EntityValue. Production [9] permits "]]>" as a
literal entity value, and hence as replacement text.

Furthermore, [10] AttValue also permits "]]>". It would be nonsensical for
<foo bar="]]>"/> to be well-formed, but not <foo bar="&cdend;"/>, using the
entity declaration above.
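
As an illustration, productions [9] and [10] can be approximated with regular
expressions. This is a sketch only: the NAME pattern is an ASCII-only
simplification of the real Name production (an assumption made for brevity),
and the helper names are hypothetical.

```python
import re

# Assumption: ASCII-only stand-in for the spec's Name production.
NAME = r"[A-Za-z_:][A-Za-z0-9._:\-]*"
# Reference ::= EntityRef | CharRef
REFERENCE = rf"&(?:{NAME}|#[0-9]+|#x[0-9A-Fa-f]+);"
# PEReference ::= '%' Name ';'
PE_REFERENCE = rf"%{NAME};"

# [9] EntityValue ::= '"' ([^%&"] | PEReference | Reference)* '"'
#                   | "'" ([^%&'] | PEReference | Reference)* "'"
ENTITY_VALUE = re.compile(
    rf'"(?:[^%&"]|{PE_REFERENCE}|{REFERENCE})*"'
    rf"|'(?:[^%&']|{PE_REFERENCE}|{REFERENCE})*'"
)

# [10] AttValue ::= '"' ([^<&"] | Reference)* '"'
#                 | "'" ([^<&'] | Reference)* "'"
ATT_VALUE = re.compile(
    rf'"(?:[^<&"]|{REFERENCE})*"'
    rf"|'(?:[^<&']|{REFERENCE})*'"
)


def is_entity_value(literal: str) -> bool:
    """True if the quoted literal matches the sketch of [9] EntityValue."""
    return ENTITY_VALUE.fullmatch(literal) is not None


def is_att_value(literal: str) -> bool:
    """True if the quoted literal matches the sketch of [10] AttValue."""
    return ATT_VALUE.fullmatch(literal) is not None


if __name__ == "__main__":
    print(is_entity_value('"]]>"'))    # the literal from the ENTITY declaration
    print(is_att_value('"]]>"'))       # bar="]]>"
    print(is_att_value('"&cdend;"'))   # bar="&cdend;"
```

Neither production excludes "]]>", so the grammar alone accepts both the
entity declaration and the two attribute forms.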

2) This contradicts the last paragraph of section 4.3.2:

  A consequence of well-formedness in entities is that the logical
  and physical structures in an XML document are properly nested; no
  start-tag, end-tag, empty-element tag, element, comment, processing
  instruction, character reference, or entity reference can begin in
  one entity and end in another.

The list appears to be intended to be exhaustive. The lack of "CDATA
Section" in the list might be interpreted to mean that you can start a CDATA
Section in one entity, and end it in another. Therefore, the declaration of
&cdend; above should be well-formed.

===============================

Since the well-formedness of internal general parsed entities is completely
defined by productions [71] GEDecl, [73] EntityDef, and [9] EntityValue,
what is the value of the statement in section 4.3.2? What does it intend to
clarify?

Perry A. Caro
Adobe Systems Incorporated

Received on Monday, 11 August 2003 19:55:08 UTC