I've been looking through the Web recently, trying to find some
information on SGML in a particular field I'm interested in.

I recently downloaded the HTML 3.2 + additions draft DTD and ran it
through the dtd2yacc converter for Perl. I was rather horrified to find
that the output was almost 11 MB! I then wanted to put that through
yacc, so after reconfiguring my computer to have 120 MB of swap, I left
yacc running overnight. Predictably, it failed: after a whole night I
got about 120 bytes of data back from yacc, because it couldn't handle
the size of the grammar.

Basically, I'm looking for a tool, like yacc, that takes a DTD as input
and converts it into a form that a program can use as a grammar. In
other words, I want to be able to embed a grammar from a DTD quickly
into a program, and to keep it updated as and when new revisions of
HTML come out.
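
To make the request concrete, here is a rough sketch (in Python) of the
kind of output I have in mind: a lookup table built straight from the
DTD's declarations instead of a generated yacc grammar. The name
parse_dtd, the regular expressions and the sample DTD fragment are all
made up for illustration. It only copes with simple parameter entities
and ELEMENT declarations that carry both tag-omission flags; attributes,
inclusions/exclusions and "--" comments inside declarations are ignored.

```python
import re

# Simplified patterns (assumptions, not a full SGML declaration parser).
ENTITY_RE = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>', re.I)
ELEMENT_RE = re.compile(
    r'<!ELEMENT\s+(\([^)]*\)|[\w.-]+)\s+([-O])\s+([-O])\s+(.*?)\s*>',
    re.I | re.S,
)
REF_RE = re.compile(r'%([\w.-]+);?')


def expand(text, entities, max_passes=20):
    """Substitute %name; references until nothing changes (bounded)."""
    for _ in range(max_passes):
        new = REF_RE.sub(lambda m: entities.get(m.group(1), m.group(0)), text)
        if new == text:
            break
        text = new
    return text


def parse_dtd(dtd_text):
    """Return {ELEMENT_NAME: {omit_start, omit_end, content}}."""
    # Only stand-alone <!-- ... --> comments are stripped here.
    text = re.sub(r'<!--.*?-->', '', dtd_text, flags=re.S)

    entities = {}
    for name, value in ENTITY_RE.findall(text):
        entities.setdefault(name, value)  # in SGML the first declaration wins

    grammar = {}
    for names, start, end, model in ELEMENT_RE.findall(expand(text, entities)):
        for name in names.strip('()').split('|'):
            grammar[name.strip().upper()] = {
                'omit_start': start.upper() == 'O',
                'omit_end': end.upper() == 'O',
                'content': ' '.join(model.split()),
            }
    return grammar


# Hypothetical DTD fragment, loosely in the style of the HTML DTDs.
SAMPLE = '''
<!ENTITY % font "TT|B">
<!ENTITY % text "#PCDATA | %font">
<!-- a stand-alone comment -->
<!ELEMENT (%font) - - (%text)*>
<!ELEMENT P - O (%text)*>
<!ELEMENT BR - O EMPTY>
'''

if __name__ == '__main__':
    for element, rule in sorted(parse_dtd(SAMPLE).items()):
        print(element, rule)
```

A table like this is quick to regenerate whenever the DTD changes, but
the content-model strings would still have to be turned into something
a parser can actually walk, and that is the part I'd like a tool for.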

Have you got any ideas on how to do this, or what to use?

