XML and required DTDs

Steve DeRose and others have made a good case for removing the burden
of reading a DTD from the shoulders of XML applications, especially if
reading an XML DTD is to be no easier than reading an arbitrary SGML
(ISO 8879) DTD.  I don't know how the equation changes if XML DTDs are
"easy" to parse, by comparison, but I understand the motivation.

Still, I am wondering how this strategy would fit into the larger
picture: (a) how XML instances will actually be processed, and (b)
what dynamics will likely be set up for XML applications development
if the pronouncement is made: "Rejoice at last, programmers of the
world: today we are liberated from the tyranny of the SGML DTD."

To elaborate on this question, I first need to ensure that I have not
made a fatal mistake in assuming that "Language" in "XML" is meant
in the same way as "Language" in "SGML," viz., elliptical for
"metalanguage."  I have taken "eXtensible" to be the equivalent of
"meta-".  Further, I have assumed that the markup languages based upon
XML would set up basic lexical semantics as well as can be done using
semantically perspicuous GIs, attribute names, attribute values
(etc.), but that XML itself would be as innocent of processing
semantics as SGML is.  If both assumptions are warranted, and if I
have not failed to reckon with some influence from "SGML Extended
Facilities" (e.g., property sets that are assumed to apply to XML in
some way), then I have the following observations and questions.

Observation: if XML instances are to be processed without the DTD,
then it's not possible to know with certainty from an instance what
the data types (declared values) are for the attributes.  This holds
quite apart from the fact that some declarations might need to be sent
with the XML instance to provide information about defaulted attributes,
notations, etc.  In this sense, an XML document instance is weaker
semantically than an SGML document with its required DTD.  See below,
where the attributes 'uff' or 'ko' might encode an IDREF as a (HyTime)
clink, but we can't be sure.

So, suppose this XML instance is acquired by my Net client:

<ucod>xx<kok uff=ak>xx<iuk>xx</iuk>xx<kak>xx<kcico>xx</kcico>xx</kak>xx<voc>
<cbb>xx</cbb>xx<qdd ko=ak>xx</qdd>xx<koik>xx</koik>xx</voc>xx</kok>xx<cdaw>
<koik>xx</koik>xx<voc>xx<qdd>xx</qdd>xx<koik>xx<qkqr>xx<riwo>xx</riwo>
</qkqr>xx<riwo>xx</riwo>xx</koik>xx</voc>xx</cdaw>xx<kob>xx</kob>xx</ucod>

If the client assumes (or can know from a MIME type?) that the instance
is a candidate for meaningful display, what does it do?  (Ignore the
fact that the content represented by "xx" is dummy text).  The server
sending the instance will also need to send a stylesheet which says
that (e.g.,) "qdd is a line-breaking element" and "qkqr is to be
displayed in italic typeface" and so forth.  Something also has to
tell my browser that QDD's attribute 'ko' encodes a HyTime clink, and
that 'uff' is (therefore) for storing an ID.  Etc.
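
As a rough illustration of the observation above, here is a minimal Python
sketch of what a DTD-less parse gives a client.  It uses a shortened form of
the sample instance, with the attribute values quoted only because the
standard-library parser requires it.  The helper name attribute_report is
hypothetical, and nothing here is meant as how any actual XML client works.

```python
import xml.etree.ElementTree as ET

# Shortened form of the sample instance. The attribute values are quoted
# only because this parser insists on it (an assumption of the sketch).
INSTANCE = (
    '<ucod>xx<kok uff="ak">xx<voc><cbb>xx</cbb>xx'
    '<qdd ko="ak">xx</qdd>xx</voc>xx</kok>xx</ucod>'
)


def attribute_report(xml_text):
    """Return (element, attribute, value, Python type name) for each attribute."""
    root = ET.fromstring(xml_text)
    report = []
    for elem in root.iter():  # document order
        for name, value in elem.attrib.items():
            report.append((elem.tag, name, value, type(value).__name__))
    return report


for row in attribute_report(INSTANCE):
    print(row)
```

Both 'uff' and 'ko' come back as the plain string "ak".  Nothing in the parse
tree says which one (if either) is an ID and which is an IDREF; that
information has to arrive from somewhere other than the instance.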

Thus, I still don't understand the point of enabling "DTDless"
processing of XML instances, as per Tim Bray's posting #172, where
"search, display, analyze" are examples of such processing, *if* it is
necessary to have and process, in addition to the instance:

    * a rendering stylesheet
    * a collection of declarations to account for attribute defaulting,
         notations, etc.
    * a set of other specifications of semantics, like (HyTime) reftype and
         clink, which express fundamental relational semantics that are not
         expected to be in a specific stylesheet
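
The dependence on such side information can be sketched in Python.  This is
a hedged illustration only: the DECLARED_VALUES table is a hypothetical
stand-in for whatever declarations accompany the instance, the function name
resolve_links is invented for the example, and the assignment of ID to 'uff'
and IDREF to 'ko' is the guess discussed above, not something the instance
states.

```python
import xml.etree.ElementTree as ET

# Shortened sample instance; values quoted only because the parser needs it.
INSTANCE = (
    '<ucod>xx<kok uff="ak">xx<voc><cbb>xx</cbb>xx'
    '<qdd ko="ak">xx</qdd>xx</voc>xx</kok>xx</ucod>'
)

# Hypothetical out-of-band declarations, keyed by (element, attribute).
# In SGML this information would come from the DTD.
DECLARED_VALUES = {("kok", "uff"): "ID", ("qdd", "ko"): "IDREF"}


def resolve_links(xml_text, declared):
    """Return (referring element, target element) pairs for declared IDREFs."""
    root = ET.fromstring(xml_text)
    ids = {}   # ID value -> name of the element carrying it
    refs = []  # (referring element name, referenced ID value)
    for elem in root.iter():
        for name, value in elem.attrib.items():
            kind = declared.get((elem.tag, name))
            if kind == "ID":
                ids[value] = elem.tag
            elif kind == "IDREF":
                refs.append((elem.tag, value))
    # Resolve after the full pass so that forward references also work.
    return [(source, ids.get(target)) for source, target in refs]


print(resolve_links(INSTANCE, DECLARED_VALUES))  # [('qdd', 'kok')]
print(resolve_links(INSTANCE, {}))               # []
```

With the declarations supplied, the link from 'qdd' to 'kok' can be found;
with an empty table the same parse tree yields no links at all.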

Granted that it's nice to be able to create a parse tree for an XML
instance simply by having the instance in hand: what's the point of
all this economy if we can't make any useful sense of the tree without
also having (and processing) the stylesheet, as well as having (and
processing) the other declarations that tell us how to interpret the
tree?  It seems to me ironic that a revised SGML could be depicted in
this way: "here's the instance, here's the stylesheet, here's a small
collection of declarations to let you know about defaulted attrs and
notations, and here's a set of HyTime mappings based upon archForms
semantics to augment the stuff in the stylesheet -- what, you want the
full XML DTD too?  -- naw, sorry: irrelevant and too expensive."

Someone says: (1) "No need to ship a stylesheet: the XML document can
just reference one, and if it's common, it should be on the receiving
system's local machine."  OK: why not also for the DTD?  Or: (2)
"Naw, we assume a common tag set, so that we know <p> means
'paragraph' and <li> means 'list item'."  OK: how then is XML much
of an improvement over HTML?

I still maintain that there is a broad range of applications for which
*having the DTD* (as opposed to the parse tree for the exact XML
instance that just arrived), or for which *having the option of
addressing the DTD* rather than the instance parse tree, allows a great
deal more interesting and meaningful processing.  I know I don't stand
a chance of convincing anyone of the significance of this point, so I
won't try.  But I fully expect that *most* XML applications will
ignore markup declaration processing if XML says it's possible to have
meaningful processing without recognizing declarations.  What worries
me most is that XML will lead to a fixed language with semantics (by
virtue of a fixed -- yet chaotic/anarchical -- tag set, just like
"HTML"), and thus, a fixed not-very-extensible "language" instead of a
metalanguage.


-robin

Received on Tuesday, 17 September 1996 10:33:32 UTC