Re: Is it always possible to define a DTD for Well Formed XML Documents? from W. Eliot Kimber on 1997-05-23 (w3c-sgml-wg@w3.org from May 1997)

From: W. Eliot Kimber <eliot@isogen.com>
Date: Thu, 22 May 1997 20:46:51 -0500
To: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>, W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <3.0.32.19970522204555.00779d58@swbell.net>

At 06:49 PM 5/22/97 CDT, Michael Sperberg-McQueen wrote:
>
>Whether these programs produce 'real', or 'good' DTDs in the sense that
>I believe you are subliminally using these terms, I cannot tell you,
>since I doubt seriously that a meaningful definition can be given that
>allows 'real' DTDs to be distinguished reliably (formally, mechanically)
>from DTDs in which every element has a content model of ANY.

Deriving a "good" DTD from an instance is exactly the same problem as
deriving a DTD from a non-SGML source: i.e., classical SGML document
analysis.  Document analysis is a creative task that could only be replaced
by a program that engaged one or more humans in a dialog about the document
type in question.  It would probably be possible to capture the basic
heuristics used by some representative set of analysts and thereby create
an automated process that would ask useful questions, but there is no way
to eliminate the humans from the process except in the most trivial of
cases.
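
For illustration only (this sketch is not part of the original message): the
trivial, fully mechanical case might look something like the Python below.
It walks a well-formed instance and declares every element ANY and every
attribute CDATA #IMPLIED, which is the kind of DTD Michael describes above.
The function name trivial_dtd and the sample memo document are made up for
the example, and it assumes an instance without namespaces.

```python
import xml.etree.ElementTree as ET


def trivial_dtd(xml_text):
    """Derive a DTD that declares every element ANY (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    # Map each element name to the set of attribute names seen on it.
    # Dict insertion order gives us first-appearance (document) order.
    attrs_by_element = {}
    for elem in root.iter():
        attrs_by_element.setdefault(elem.tag, set()).update(elem.attrib)

    lines = []
    for name, attr_names in attrs_by_element.items():
        lines.append(f"<!ELEMENT {name} ANY>")
        for attr in sorted(attr_names):
            # No analysis of attribute types: everything is optional CDATA.
            lines.append(f"<!ATTLIST {name} {attr} CDATA #IMPLIED>")
    return "\n".join(lines)


sample = '<memo id="m1"><to>WG</to><body>See <emph>attached</emph>.</body></memo>'
print(trivial_dtd(sample))
```

The instance is valid against the result, but the DTD records nothing about
which elements belong where or why.  That missing part is what the analyst
supplies.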

It's also the case that document analysis, while a creative act, is in
most cases not a particularly difficult one, so there is not a great deal
of benefit in going to the trouble of automating it.

Cheers,

E.

Received on Thursday, 22 May 1997 21:49:30 UTC