Comments on 31 March spec
Comments on the 31 March draft. Some substantive, some nits,
intermixed for your pleasure.
 Comment: Where did the stars go? Have como and comc evaporated
from the anticipated SGML TC?
Clause 2.8: I have to make a token request to PLEASE make this section
go away. It's really, really, gross, and incompatible with SGML.
What's better? I don't know, exactly.
 RMDecl: The S non-terminal isn't a link to its definition.
Clause 3.1: The terminology here is a bit confusing. "The beginning
of every XML element is marked by a start-tag." That applies to empty
elements as well as non-empty ones. The implication, however, is that
the STag production is equivalent to the concept represented by the
term "start-tag", which is not true. I would either change the
production name to something like NESTag or be explicit in the prose
that STag is the production for a start-tag of a non-empty element.
Clause 3.2: "At user option, an XML processor may issue a warning when
a reference is made to an element type for which no declaration is
provided, but this is not an error." I think that this should be a
validity violation, and a reportable error for a validating processor.
 elements: Why not make this
elements ::= cps
Otherwise, one has to scrutinize the productions to see that a simple
element name fills the production for seq without the optional (','
cps)* and thus fulfills elements.
Clause 3.2.1: "The optional character following a name...
respectively". The list given is *not* respective, and even if it
were it would force the reader to draw the correspondence himself.
List which characters mean what, explicitly, and be done with it with
much less chance of error.
Clause 3.2.1: The example declaration for head is missing a semicolon
Clause 3.2.2: The second example declaration for p is missing
semicolons for all entity references.
Clause 3.3: "At user option... if attributes are declared for an
entity type..." This should be "element type".
Clause 3.3.3: "Record-ends must..." What are record ends? You've
just introduced one of the ugliest fellows in the SGML bestiary with
no explanation whatsoever. Make this "Line-end characters, or on some
systems, record boundaries, must...".
Clause 4.2: These examples strongly imply that quoting a reference
prevents its recognition. Don't use meta-examples; try
Type less-than (<) to save options.
This document is classified &seclevel;.
Clause 4.3.1: I don't like the opening wording. How about,"If the
entity definition is an EntityValue, the defined entity is called an
 ExternalDef: Why not %NDataDecl? ?
Clause 4.3.2: There should be a digression or appendix on escaping
characters in system identifiers. There is frequent confusion in HTML
over when to URL-escape a character and when to SGML-escape it. For
instance, to query the engine http://foo.com/cgi-bin/search for the
poem with id jack&jill and the company at&t, one must URL-escape the
meaningful (to URLs) ampersand as %26: jack%26jill, at%26t. One then
creates a URL:
Then, to include this URL as an href attribute, one must SGML-escape
the meaningful (to SGML) ampersand as &
This quickly exceeds the understanding of most users, and should be
explicit in the XML spec.
Clause 4.3.3: The example encoding uses EUC-JIS instead of EUC-JP.
Clause 4.4: "For character references, the processor must pass the
indicated ISO 10646 bit pattern to the application..." You presume to
tell me what bits to pass? I think your EBCDIC application would not
like me to do that. How about, "... pass the character corresponding
to the indicated ISO 10646 bit pattern..."
Clause 4.4: "In the replacement text of an internal entity,..." This
seems a bit wordy and confusing. I'd suggest explicitly stating that
this replacement happens upon encountering the declaration in the DTD
- that tidbit is missing.
Clause 4.6: I *strongly* feel that MIME types should be allowed for
notation system IDs. Unless someone can send me a URL for a GIF
processor that will work immediately on any OS. Change %ExternalID in
production  to %NotExtId, which in turn is:
[73a] NotExtId :: = 'SYSTEM' S (MimeType | SystemLiteral)
Appendix A: This makes me wonder: Why *aren't* spaces allowed in
marked sections? The S? token creeps in to nearly every other
production; why not CDATA and conditional marked sections, too?
Appendix A: "... depending on the character set..." No, no, NO!!!
ALL XML documents *must* have the same character set, in the SGML
sense, or the numeric character references are trash. They may have
different encodings or BCTFs, but the *character set* is ALWAYS the
same. This prose must be cleaned up ASAP, or we'll be haunted by
incompatible applications for XML's entire brief life.
Appendix A: The SGML declaration appears to be missing the underscore
that the productions allow as a name start.
Appendix C: The full content is not only self-describing, but
 Markup: The DOCTYPE subset uses the Comment non-terminal; why
doesn't the earlier appearance of comments?
Christopher R. Maden One Richmond Square
DynaText SIT Technical Support Providence, RI 02906 USA
Inso Corporation +1.401.421.9550 (voice)
Electronic Publishing Solutions +1.401.521.2030 (facsimile)