[Prev][Next][Index][Thread]

Re: XML and required DTDs



At 10:14 AM 09/15/96 +0100, Martin Bryan wrote:

>- a known set of elements, with a DTD that is defined but not referenced,
>that can be extended in a DOCTYPE subset, e.g.
>
><!DOCTYPE my-dtd [<ATTLIST #ALL style NAME #FIXED "normal">]>
>
>My idea here is to adopt an HTML approach of saying that there are a certain
>set of elements that are useful in a wide range of situations (the basic
>text elements) which all browsers would have default presentation
>characteristics for, and for which there is an agreed set of model
>relationships, which the user can modify by:

Standardizing tag libraries is often useful, but has nothing to do with
simplifying the language. I think the point of the "to DTD or not to DTD"
discussion is that there is sooooooo little in SGML that prevents parsing
perfectly without a DTD, that is seems a tad silly to force *every*
application to take on the (currently huge) task of implementing all of the
DTD stuff (the overwhelming majority of work in building a parser, as it is
the overwhelming majority of the SGML grammar, and books on SGML, and so on).

The question as I see it is: Why *force* everyone into a Hobson's choice:
either add a huge amount to your implementation effort, or parse some
documents *just slightly* wrong? Why force a 5-line Perl script to parse a
DTD (try doing that in 5 lines!), track every content model, and all that,
just so it can "ignore" the occasional RE that follows an included
sub-element, but not that follows a proper sub-element? I say, we shouldn't.
Because if we try, many will decide the slight difference isn't worth the
pain, and then we'll end up with just slightly inoperable systems -- yech!

The only things blocking you from parsing one-entity minimal SGML document
without a DTD are:

   a) Declared content (CDATA/RCDATA/EMPTY elements)
   b) The RE-ignoring rules.

All of these constructs, interestingly, have other problems besides forcing
you to have a DTD. If we address the other problems, we get DTD-less parsing
capability for *free*. Sounds like a pretty nice bonus to me. Obviously if
you want to use additional constructs, you must declare them (like external
entities, attribute defaults, and notations). But why force people to
declare more than they inherently must?



Follow-Ups: