DTD Fragments and XML

Let's say that I am an XML user. I am happy with, say, the DocBook DTD,
but need to insert a chemical formula in CML format. Or perhaps I want
to insert something more mundane, some small element that does not have
an expression in DocBook: <GRADE> for a student's grade on a project. In
the SGML world I would combine the two DTDs manually. This is probably a
painful process of examining content models and parameter entities and
finding the right place to shoehorn in my element type.

I don't think that we can really expect that to happen often in the XML
world. People will just remove the DOCTYPE line and depend on
well-formedness. But having removed the DOCTYPE line, they have now
taken all responsibility for the semantics of that document upon
themselves. It can no longer be validated. The user agent cannot use any
"hard coded" knowledge of the semantics (with no DOCTYPE it doesn't know
what namespace the GIs, i.e. the generic identifiers or element type
names, come from). "Alternate" stylesheets (e.g. text to speech) are no
longer useful (same problem). Search engines cannot
depend on the meta-data to be accurate. In short, I've hobbled the
interoperability of the document. We are all hurt by "tag soup", but
perhaps the visually impaired are the most hurt by it. I'm sure they
don't want to surf the web by continually reconfiguring their browsers
from scratch because authors have not properly declared their
namespaces.
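
To make the point concrete, here is a rough sketch of what a
well-formedness-only parse gives a user agent. The sample document, the
DOC constant and the element_names helper are all made up for
illustration; Python's xml.etree.ElementTree is used only because it is
a non-validating parser. The <GRADE> element is accepted without
complaint, but all the software gets back is a list of bare names with
nothing to say which vocabulary each one belongs to.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: DocBook-style markup plus an ad hoc <GRADE>
# element, with the DOCTYPE line removed.
DOC = """<article>
  <title>Project Report</title>
  <para>Final mark: <GRADE>A</GRADE></para>
</article>"""


def element_names(xml_text):
    """Check well-formedness only and return element type names in document order."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    return [el.tag for el in root.iter()]


print(element_names(DOC))
```

Nothing in that result distinguishes "para" (presumably DocBook) from
"GRADE" (my own invention), which is exactly the information that is
lost once the DOCTYPE line is gone.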

In my mind this is where the rubber meets the road. The rubber is
generic markup, where the needs of the data are paramount: add an
element if you need it. The road is the Internet where interoperability
is paramount. It seems to me that in XML it is too hard to balance these
factors.

I don't think that we can make it easy to combine DTDs without changing
SGML. But maybe we can figure out a way to declare a namespace for
elements: to "import" element names in a standard way. You wouldn't be
able to validate the document but at least it would be clear what the
elements MEAN.

 Paul Prescod

Received on Saturday, 3 May 1997 07:47:53 UTC