Re: equivalent power in SGML and XML
> I just don't think [deleting OMITTAG] is practical.
> HTML relies on OMITTAG (notably for P, LI, DT, DD).
> I don't see how we can possibly achieve our ease
> of implementation goals if we support OMITTAG in XML.
I can certainly see how we could make HoTMetaL Pro have an option to
save HTML files in XML. It isn't that easy to suggest a general converter
using SGML tools, since most HTML documents are not valid according to
any particular HTML DTD. But I can imagine public domain conversion utilities
that could be fairly painless.
Would that be too much of a compromise?
With no SHORTTAG and no OMITTAG and no EMPTY elements and no MIXED content,
XML is becoming about as far from HTML as HTML is from RTF :-)
It seems to me that limited versions of some of these things could usefully
be allowed:
* allow an omitted end tag immediately before an end tag:
<P><em>stuff</P> (is this worth it? it's easy to parse)
* allow </> (easy to parse, and safer than NET)
* allow mixed content with | but not with ",", so as to avoid the difficulties
associated with whitespace in element context in a mixed content model
being taken as PCDATA. I think it is a mistake to believe that using
pseudo-elements solves very much. Consider:
<P><text>here is my</text>
where in fact the newlines should be inside the <text> elements, not
between <text> and <em>. This is hard to explain.
If comments and processing instructions make this complex, delete them and
use elements instead. Heck, you can use elements instead of
marked sections and attributes and entities and get a much cleaner syntax!
Think of combining the wonderful work done by Tommie and Debbie on Pinnacles
Reflections with the careful TEI WSD spec... entities as they could have
been :-) -- as elements, using ID/IDREF to `insert' them.
* consider a naming convention for EMPTY elements, if they are allowed at
all -- e.g. <E.BR>, <E.HR> etc. -- if SGML is to be modified to allow
end tags on EMPTY elements, this is probably irrelevant.
* allow default attributes to be omitted, so that arch forms can be used
(HTML IMG has 22 attributes according to HoTMetaL Pro 3.0, and almost all
of them have default values; a few are actually #IMPLIED, I think).
Having to put DIR="ltr" on every element would be a pain, for example...
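
To make the "easy to parse" claims for the first two items concrete, here is
a minimal sketch in Python of a pass that writes every end tag out in full.
It is an illustration only, not a proposal for how any real parser works.
The names TAG and normalize are hypothetical, attributes are ignored, names
are compared case-sensitively, and I am assuming that "an omitted end tag
immediately before an end tag" means exactly one level of omission.

```python
import re

# Matches a start tag <name>, an end tag </name>, or the empty end tag </>.
# Attributes are deliberately left out of this sketch.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.]*)?>")


def normalize(text):
    """Return text with every end tag written out in full."""
    out = []
    stack = []  # names of the currently open elements, innermost last
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])  # copy character data unchanged
        pos = m.end()
        is_end = m.group(1) == "/"
        name = m.group(2)
        if not is_end:
            if name is None:
                raise ValueError("empty start tag <> is not supported")
            stack.append(name)
            out.append("<%s>" % name)
            continue
        if not stack:
            raise ValueError("end tag with no open element")
        if name is None:
            # </> closes whatever element is currently open.
            name = stack[-1]
        elif stack[-1] != name:
            # Omitted end tag immediately before an end tag: close the
            # innermost element implicitly, but only if its parent is the
            # element named by this end tag (assumption: one level only).
            if len(stack) < 2 or stack[-2] != name:
                raise ValueError("unexpected end tag </%s>" % name)
            out.append("</%s>" % stack.pop())
        stack.pop()
        out.append("</%s>" % name)
    out.append(text[pos:])
    if stack:
        raise ValueError("unclosed element <%s>" % stack[-1])
    return "".join(out)


print(normalize("<P><em>stuff</P>"))
print(normalize("<P>a <em>b</> c</>"))
```

Both rules need only the stack of open element names; no content model or
DTD lookup is involved, which is the sense in which they are cheap compared
with full OMITTAG.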