Re: equivalent power in SGML and XML

Martin Bryan <mtbryan@sgml.u-net.com> wrote:

> The set of documents I would like to see "grandfathered in" is that set
> defined in (valid) HTML. If an XML browser cannot read that set of raw HTML
> documents that are valid according to the 2.0 DTD (or later versions) then
> it will not be of much practical use, and will be ignored by the majority of
> potential users.

That may be biting off more than we wish to chew...

HTML 2.0 makes liberal use of end-tag omission, has a couple of cases
of start-tag omission, contains elements with CDATA, RCDATA, and
EMPTY declared content, uses inclusion exceptions, and encourages
omitted attribute name minimization.  It does disallow most other forms
of SHORTTAG minimization by application convention, and in "Strict"
or "Recommended" mode HTML 2.0 does not use CDATA or RCDATA declared
content (though HTML 3.2 does), but still...
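
To make the gap concrete, here is a rough sketch in Python.  It is
only an illustration: the fragment is made up, and Python's standard
xml.etree.ElementTree parser is used as a stand-in (an assumption on
my part) for a parser that supports none of the minimization features
listed above.  The first string relies on end-tag omission for LI and
on an omitted attribute name for COMPACT; the second spells both out.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the HTML 2.0 style: the LI end-tags are
# omitted and COMPACT is written with its attribute name omitted.
MINIMIZED = """<ul compact>
<li>first item
<li>second item
</ul>"""

# The same content with every tag and attribute written out in full.
NORMALIZED = """<ul compact="compact">
<li>first item</li>
<li>second item</li>
</ul>"""


def is_well_formed(text: str) -> bool:
    """Return True if a parser with no minimization support accepts text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("minimized: ", is_well_formed(MINIMIZED))
    print("normalized:", is_well_formed(NORMALIZED))
```

The minimized form is rejected and the normalized form is accepted,
which is roughly the distance between raw HTML 2.0 and what a parser
without these features could read.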

(FWIW, I agree with you that XML should be able to handle HTML --
not necessarily HTML 2.0, but possibly a similar, slightly stricter DTD --
but the general consensus seems to be that the cost of supporting
the necessary features is too high.)

> (We would still have the problem of the 60% of invalid documents

More like 96% invalid from what I've seen...

> , but hopefully this situation will get better once standard WP
> tools start offering automatic conversion to HTML.)

--Joe English