XML Specification Errors

I have recently been reviewing the XML specification (REC-xml-19980210).

First, I must say that I find it extremely useful to work with
the on-line hypertext version.   

However, I have one major area of concern with respect to the
adequacy of the specification.  In brief, I find that the
specification is incomplete and confusing with respect to the
issue of parameter-entity references and the external declaration
subset.

In general, the specification seems to give a full grammatical
treatment of parameter-entity references as they may occur
in the main document ("document entity").  However, the grammar is
incomplete with respect to the positioning of parameter-entity 
references in the external declaration subset.  Grammatically,
they may occur in entity values or between conditional sections
and markup declarations (production [31]).  However, there
is also a vague assertion that parameter-entity references
may also occur "within" markup declarations in the
external subset.

An example of this vagueness is the grammar for conditionalSect
(section 3.4).   The only two "keywords" allowed by the grammar 
are 'INCLUDE' and 'IGNORE', while the accompanying text (and 
the examples following) show that the keyword may be
a parameter-entity reference instead.
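
To make the mismatch concrete, here is a minimal Python sketch.
It transcribes only the opening of productions [62] and [63]
into a regular expression and applies it to the form used in
the section 3.4 example.  The function name is hypothetical and
S is simplified to ASCII whitespace; this is an illustration of
the grammar as written, not of what a processor does.

```python
import re

# Opening of includeSect [62] / ignoreSect [63]:
#   '<![' S? ('INCLUDE' | 'IGNORE') S? '['
# Assumption: S is reduced to ASCII whitespace characters.
COND_SECT_OPEN = re.compile(r"<!\[[ \t\r\n]*(INCLUDE|IGNORE)[ \t\r\n]*\[")


def opens_conditional_section(text):
    """True if text starts a conditional section per the grammar alone."""
    return COND_SECT_OPEN.match(text) is not None


# A literal keyword is accepted by the productions.
print(opens_conditional_section("<![INCLUDE["))

# The form shown in the section 3.4 example is not, unless the
# reference has already been replaced before the grammar applies.
print(opens_conditional_section("<![%draft;["))
```

Read literally, then, the productions reject the specification's
own example; it is only legal if expansion is assumed to happen
before the grammar is applied.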

Section 4.4 continues the vagueness, particularly with the
statement about the recognition context "Reference in DTD".
This context relates to a "reference within either the internal
or external subsets of the DTD".    

The best clue to the role of parameter-entity references
in the external subset seems to be in section 4.4.8.  Apparently, 
they can appear anywhere, provided that they produce a 
valid stream of tokens for the context.  However, it is a poor
descriptive technique to define a language by the actions that
a processor may take in expansion.

On the other hand, an older draft at http://www.w3.org/TR/WD-xml-970807.html
gives some other clues by marking each of the nonterminals
in place of which a parameter-entity reference can occur by
a %.  However, this suggests that parameter-entity references
do not generate arbitrary token sequences, but rather token
sequences that must match a single nonterminal.
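
The difference between the two readings can be sketched with a
toy expander in Python.  Everything here is an assumption made
for illustration: the entity names, the ASCII-only pattern for
Name, and the single-pass expansion (no nested references).  The
one-space padding follows the description in section 4.4.8.

```python
import re

# Assumption: Name is limited to an ASCII subset for this sketch.
PE_REF = re.compile(r"%([A-Za-z_:][A-Za-z0-9._:-]*);")


def expand_pe_refs(text, entities):
    """Replace each %name; by its replacement text padded with one
    leading and one following space (single pass, no nesting)."""
    return PE_REF.sub(lambda m: " " + entities[m.group(1)] + " ", text)


# Hypothetical parameter entities.
entities = {
    "content": "(title, body)",    # stands in for one nonterminal
    "decl": "book (title, body)",  # spans a name AND a content spec
}

single = expand_pe_refs("<!ELEMENT book %content;>", entities)
several = expand_pe_refs("<!ELEMENT %decl;>", entities)

print(single)
print(several)
```

Apart from spacing, both calls produce the same element
declaration.  Under the "valid stream of tokens" reading of
section 4.4.8 both uses would be acceptable; if a reference may
only stand in place of a single nonterminal, as the older draft
seems to suggest, only the first would be.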

Perhaps those with a thorough knowledge of SGML will understand
the rules for parameter-entity references in the external
subset,  but it would be good if the XML document could
stand on its own.

Some minor concerns:

1. In the example in section 4.4.5, I believe the text
   "&YN;" should instead be "%YN;".

2. In section 2.4, there is an implication that the ampersand 
   is legal (in its literal form) within the literal entity
   value of an internal entity declaration.  This appears to
   contradict production [9] for EntityValue.  There is also
   a mysterious reference to section 4.3.2.
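
The second concern can be checked mechanically.  The Python
sketch below transcribes production [9] into a regular
expression, with Name simplified to an ASCII subset (an
assumption of this sketch; the function name is hypothetical).

```python
import re

NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"  # assumption: ASCII-only Name
REFERENCE = rf"&{NAME};|&#[0-9]+;|&#x[0-9a-fA-F]+;"  # EntityRef | CharRef
PE_REFERENCE = rf"%{NAME};"

# [9] EntityValue ::= '"' ([^%&"] | PEReference | Reference)* '"'
#                   | "'" ([^%&'] | PEReference | Reference)* "'"
ENTITY_VALUE = re.compile(
    rf'"(?:[^%&"]|{PE_REFERENCE}|{REFERENCE})*"'
    rf"|'(?:[^%&']|{PE_REFERENCE}|{REFERENCE})*'"
)


def is_entity_value(literal):
    """True if the quoted literal matches production [9] as transcribed."""
    return ENTITY_VALUE.fullmatch(literal) is not None


print(is_entity_value('"AT&amp;T"'))  # ampersand as part of a reference
print(is_entity_value('"AT&T"'))      # ampersand in its literal form
```

As transcribed, the production accepts an ampersand only as the
start of a reference, which is what makes the wording in
section 2.4 look contradictory.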

The "errata document" at http://www.w3.org/XML/xml-19980210-errata
has apparently not been updated since February 11.  Should I be looking
somewhere else?

Robert D. Cameron, Associate Professor           cameron@cs.sfu.ca
School of Computing Science                      FAX: (604) 291-3045
Simon Fraser University
Burnaby, B.C., Canada  V5A 1S6

Received on Wednesday, 8 July 1998 20:01:45 UTC