XML 1.0 3e spec, Parameter entity references

(I hope someone still reads this list:  I note the archive is 99% spam 
:-(  Please will someone acknowledge actually reading this submission?)

With reference to:
   http://www.w3.org/TR/2004/REC-xml-20040204
[[
Extensible Markup Language (XML) 1.0 (Third Edition)
W3C Recommendation 04 February 2004
]]

I'm having a real problem getting to grips with the handling of parameter 
entity references.  The document appears to be contradicting itself.

My problem concerns where parameter entity references (PEReference [69]) 
can appear in an external DTD subset.  Tracing all occurrences of 
PEReference in the formal grammar, the only places where it may appear are:
   [9]   in an EntityValue
   [28a] in a DeclSep

which would be fine, except that, in section 2.8, I see:
[[
Parameter entity references are recognized anywhere in the DTD (internal 
and external subsets and external parameter entities), except in literals, 
processing instructions, comments, and the contents of ignored conditional 
sections (see 3.4 Conditional Sections). They are also recognized in entity 
value literals. The use of parameter entities in the internal subset is 
restricted as described below.
]]

and

[[
The external subset and external parameter entities also differ from the 
internal subset in that in them, parameter-entity references are permitted 
within markup declarations, not only between markup declarations.
]]

What am I to believe?

If the formal grammar is correct, then I think the text quoted above is (at 
best) misleading.   If the text is correct then the grammar is incorrect.

...

Further, I have a concern that, if the text is correct, there is an unholy 
interaction between lexical analysis, syntax analysis and parameter entity 
replacement.  Consider:

ExhibitA.xml:
[[
<!DOCTYPE doc [
<!ELEMENT doc (#PCDATA)>
<!ENTITY % e1 SYSTEM "ExhibitE1.ent">
<!ENTITY % e2 SYSTEM "ExhibitE2.ent">
<!ATTLIST doc a1 CDATA "v1">
%e1;
%e2;
<!ATTLIST doc a2 CDATA "v2">
]>
<doc></doc>
]]

ExhibitE1.ent
[[
<!ENTITY % x '<!ATTLIST doc a3 CDATA "v3">'>
%x;
]]

ExhibitE2.ent
[[
<!ENTITY % y 'CDATA'>
<!ATTLIST doc a4 %y; "v4">
]]

With the inclusion of ExhibitE1.ent, it is relatively easy to handle the PE 
replacement after parsing, since the grammar yields well-defined places 
where the entity reference may appear.

But in the case of ExhibitE2.ent, which the text appears to suggest is 
valid, it is not possible to parse the content successfully until the 
substitution of %y; has been performed, and it is not possible to know the 
value of y to be substituted until the external entity has been parsed to 
find the EntityDecl [70].
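
To make the ordering concern concrete, here is a minimal Python sketch of 
one possible reading, in which declarations are recorded and references 
replaced in a single left-to-right pass, so that the declaration of y has 
been seen before the ATTLIST that uses it is examined.  It is an 
illustration under stated assumptions, not a claim about what the 
Recommendation requires or how any parser works.  The names expand_dtd, 
lookup, PE_DECL and PE_REF are hypothetical.  It handles internal parameter 
entities only, and ignores external entities, comments, processing 
instructions, conditional sections, character references and the rule that 
references are not recognized inside other literals.

```python
import re

# Hypothetical names for illustration; none of this comes from the spec.
# Matches an internal parameter entity declaration: <!ENTITY % name "value">
PE_DECL = re.compile(r"""<!ENTITY\s+%\s+(\w+)\s+(["'])(.*?)\2\s*>""", re.S)
# Matches a parameter entity reference: %name;
PE_REF = re.compile(r"%(\w+);")


def lookup(entities, name):
    """Return the replacement text for a declared parameter entity."""
    if name not in entities:
        raise ValueError(f"undeclared parameter entity: {name}")
    return entities[name]


def expand_dtd(text):
    """Expand internal parameter entities in one left-to-right pass.

    Entity declarations are consumed (not copied to the output) and every
    %name; reference outside a declaration is replaced by its text.
    """
    entities = {}
    out = []
    pos = 0
    while pos < len(text):
        decl = PE_DECL.match(text, pos)
        if decl:
            name, literal = decl.group(1), decl.group(3)
            # References inside an entity value are replaced without padding.
            value = PE_REF.sub(lambda r: lookup(entities, r.group(1)), literal)
            # The first declaration of a name is the binding one.
            entities.setdefault(name, value)
            pos = decl.end()
            continue
        ref = PE_REF.match(text, pos)
        if ref:
            # Splice in the replacement text, padded with one space on each
            # side, and rescan from the same position.
            value = lookup(entities, ref.group(1))
            text = text[:pos] + " " + value + " " + text[ref.end():]
            continue
        out.append(text[pos])
        pos += 1
    return "".join(out)


e2 = "<!ENTITY % y 'CDATA'>\n<!ATTLIST doc a4 %y; \"v4\">\n"
print(expand_dtd(e2))
```

With the content of ExhibitE2.ent as input, the output (after collapsing 
white space) is the declaration <!ATTLIST doc a4 CDATA "v4">.  The point of 
the sketch is that replacement happens on raw characters before any 
grammar production for AttlistDecl can be applied, which is exactly the 
interleaving of lexical analysis and entity replacement questioned above.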

#g


------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact

Received on Thursday, 3 June 2004 15:54:31 UTC