
partial DTDs [was: Re: B.10 Empty elements?]



I really don't like this "partial DTD" idea.

(1) as Bill pointed out, you have to deal with distinguishing partial
    and full DTDs.
    That includes needing tools for merging changes made to a partial DTD
    back into a full one.  Ugh.

(2) if you can handle a partial DTD, you can handle a full one.  Having
    less of something rarely makes the code much simpler!
    So we would be requiring all XML programs to handle DTDs, with
    their gruesome (sorry) syntax and arcane restrictions, both syntactic
    (no comments in model groups) and semantic (ambiguity, only one ID, etc.).

So we have to decide whether to
(A) make every XML program a full SGML application, with a DTD and
    everything that can go into it (including a man-year of coding!)

(B) make every XML program have a parser for a reduced DTD grammar,
    such as the one Charles proposed (and that for some insane reason I
    turned into a BNF grammar)

(C) throw away the whole kit and bundle and have no DTD

(D) if there is no SGML DTD, consider the possibility of an SGML
    representation of the structured information and documentation that
    are required to make a DTD usable, whether or not there is in fact
    a formal DTD.

I would very much like to see a way of interchanging natural language
element and attribute descriptions, for example.  Style sheet editors are
hard to use if you think TITLE refers to the book's title, and don't
realise it is supposed to contain Dr., Earl, Lord, etc.  A prototype
version of HoTMetaL had "Author" as the description for the <A> element!
(the person who did that has since been promoted, don't worry)

It would also be useful to support something like HyLex, to give a regular
expression that textual data must match -- but with a POSIX regular
expression, *please*, so I can implement it in an hour using Henry's
libregexp, instead of the week it would take to write an NDFA matcher!!!
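
As a rough illustration of how small such a check can be when a regular
expression library does the work, here is a minimal sketch.  The element
names and patterns are invented for the example, and Python's re module
stands in for a POSIX regex library; the patterns are kept to syntax that
POSIX extended regular expressions also accept.

```python
import re

# Hypothetical per-element patterns; names and patterns are illustrative only.
# Each pattern uses only syntax shared by POSIX EREs and Python's re module.
TEXT_PATTERNS = {
    "ISBN": r"[0-9]{9}[0-9X]",
    "YEAR": r"(19|20)[0-9]{2}",
}

def text_matches(element_name, text):
    """Return True if the text satisfies the element's pattern, or if the
    element has no pattern attached."""
    pattern = TEXT_PATTERNS.get(element_name)
    if pattern is None:
        return True  # unconstrained element: any text is acceptable
    # The whole (whitespace-trimmed) text must match, not just a substring.
    return re.fullmatch(pattern, text.strip()) is not None
```

An element with no entry in the table is simply left unchecked, so the
patterns stay optional extras rather than something every program must
understand.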

But since neither element descriptions nor regular expressions for text
or attributes are part of SGML, there's no need to put them in a DTD.

Another approach, then, might be
    SGML syntax with optional constraints
where the constraints can be expressed simply, perhaps even in an XML subset,
and documents can be parsed (in the computer science sense) without the
constraints, and further checked (more like SGML validation) with them.
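
The two-step idea above (parse first, check later and only if constraints
are available) might look roughly like the following sketch.  Everything
in it is an assumption made for illustration: Python's ElementTree parser
stands in for a DTD-less parser of fully tagged markup, and the constraint
table (element name -> permitted child names) is a made-up format, far
simpler than SGML content models.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraints: element name -> set of permitted child elements.
CONSTRAINTS = {
    "book": {"title", "chapter"},
    "chapter": {"title", "para"},
}

def parse(text):
    """Step 1: build a tree from the markup alone; no constraints consulted."""
    return ET.fromstring(text)

def check(root, constraints):
    """Step 2 (optional): list (parent, child) pairs the constraints forbid."""
    problems = []
    for parent in root.iter():
        allowed = constraints.get(parent.tag)
        if allowed is None:
            continue  # nothing stated about this element, so nothing to check
        for child in parent:
            if child.tag not in allowed:
                problems.append((parent.tag, child.tag))
    return problems

doc = parse(
    "<book><title>T</title>"
    "<chapter><para>x</para><note>y</note></chapter></book>"
)
```

Here parse() succeeds with no constraints at all, while
check(doc, CONSTRAINTS) reports the one child element, note inside
chapter, that the table does not permit.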

Without a DTD, though, you have
* no distinction between element content and mixed content
* no endtagless EMPTY elements
* no CONREF (already voted no on that, I think)
* default attribute values must be a program convention (like #IMPLIED,
  although giving a default in a DTD may still be useful to humans, and in
  keeping software up to date)
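
For the last point, "a program convention" could be as simple as a table
the application carries around itself.  A minimal sketch, with
hypothetical element and attribute names:

```python
# Hypothetical defaults the application supplies itself, since there is no
# DTD to declare them: element name -> {attribute name: default value}.
ATTRIBUTE_DEFAULTS = {
    "LIST": {"TYPE": "bullet", "COMPACT": "no"},
}

def effective_attributes(element_name, specified):
    """Merge the attributes given in the document over the program's defaults."""
    merged = dict(ATTRIBUTE_DEFAULTS.get(element_name, {}))
    merged.update(specified)  # values in the document win over defaults
    return merged
```

Two programs with different tables would see different documents, which
is the cost of leaving defaults out of anything that is interchanged.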

Note: I have avoided the term "XML application" since "SGML application"
      is used to mean both a DTD and a piece of software...
      
Lee