Derived DTDs

At 01:43 PM 10/2/96 -0400, David G. Durand wrote:

>  I agree with the general move to DTD-less processing, but I think that we
>should make a requirement on all XML parsers: that they be _capable_ of
>creating an XML DTD given a DTD-less instance.

I agree that this would be a desirable facility, and furthermore
a desirable ground on which implementors could compete.  But if it's a good
idea the market will do it, and if it isn't, it won't; I don't think
the XML standard will be made better by saying parsers should do this.
Anyone can trivially comply by emitting a vacuous content model, so the
normative effect is essentially zero.
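
To make the "vacuous content model" point concrete, here is a minimal sketch in Python of a generator that would technically satisfy such a requirement while saying nothing useful.  The function name vacuous_dtd is hypothetical, and the sketch assumes a well-formed instance with no namespaces and ignores attributes entirely (a real DTD would also need ATTLIST declarations).

```python
import xml.etree.ElementTree as ET


def vacuous_dtd(xml_text: str) -> str:
    """Declare every element type found in the instance as ANY.

    Hypothetical helper for illustration only: attributes are ignored
    and namespaces are assumed absent.
    """
    root = ET.fromstring(xml_text)
    names = []
    # root.iter() walks the tree in document order, root included.
    for el in root.iter():
        if el.tag not in names:
            names.append(el.tag)
    return "\n".join(f"<!ELEMENT {name} ANY>" for name in names)


if __name__ == "__main__":
    print(vacuous_dtd("<doc><p>hi</p><p>x<em>y</em></p></doc>"))
```

Every element may contain anything, so the result constrains nothing about element content beyond the list of names.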

For what it's worth, and even though FRED is cool, this is one road I've
been way, way down.  People should be warned that while extracting a DTD
from a de facto parse tree is pretty trivial, turning it into a *good*
DTD (e.g. figuring out where to put the parentheses and Kleene operators)
is not just a hard problem; it is monstrously, horribly intractable.  When
I was doing this a few years back I couldn't find any theory in the CS
literature on derived-grammar simplification.  Maybe someone has since cut
through this Gordian knot, but kids, don't try to do this at home.
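
As an illustration of the "pretty trivial" half, here is a rough Python sketch of deriving a DTD from a parse tree.  It is not how FRED or any other tool works; the name naive_dtd is hypothetical, and the sketch assumes no namespaces and ignores attributes.  For each element type it records which children were ever seen and whether any text was seen, then emits a starred choice group.

```python
import xml.etree.ElementTree as ET


def naive_dtd(xml_text: str) -> str:
    """Derive a loose DTD: one starred choice group per element type.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    children = {}  # element name -> child names, in first-seen order
    has_text = {}  # element name -> True if any instance held real text
    for el in root.iter():
        kids = children.setdefault(el.tag, [])
        has_text.setdefault(el.tag, False)
        if el.text and el.text.strip():
            has_text[el.tag] = True
        for child in el:
            if child.tag not in kids:
                kids.append(child.tag)
            # Text following a child still belongs to the parent.
            if child.tail and child.tail.strip():
                has_text[el.tag] = True
    lines = []
    for name, kids in children.items():
        if has_text[name]:
            model = "(" + " | ".join(["#PCDATA"] + kids) + ")*"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = ("<doc><title>T</title><sec><p>a<em>b</em>c</p><p>d</p></sec>"
              "<br/></doc>")
    print(naive_dtd(sample))
```

For the sample, doc comes out as (title | sec | br)*, although the instance actually contains exactly one title, one sec and one br in that order.  Deciding whether the author meant (title, sec, br), (title, sec+, br?) or something else is precisely the parentheses-and-Kleene-operators problem, and the sketch makes no attempt at it.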

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-488-1167