
Comments on WD-html-in-xml-19990224

From: John Cowan <cowan@locke.ccil.org>
Date: Fri, 26 Feb 1999 15:07:14 -0500
Message-ID: <36D6FEF2.9E92495E@locke.ccil.org>
To: www-html-editor@w3.org
CC: XML Dev <xml-dev@ic.ac.uk>

1) I believe that the introduction of a media type "text/xhtml" is
a mistake.  Instead, it would be better to attach a media-type
attribute specifying the formal public identifier of the DTD.
This would be allowed on either "text/xml" or "text/html", by
the appropriate IETF process.

Here's an example:

	text/xml; dtd="-//W3C//DTD XHTML 1.0 Strict//EN"

This would distinguish XHTML from HTML, different versions of HTML
from each other, different DTDs of HTML from each other, and would
continue to be useful in future for new XML document types.
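
As a rough sketch of how a receiver might read such a parameter: the
"dtd" parameter is only the proposal made above, not a registered one,
and the function name and the use of Python's standard email parser are
my own choices for illustration.

```python
from email.message import Message


def media_type_and_dtd(content_type):
    """Split a Content-Type value into (media type, dtd parameter or None)."""
    msg = Message()
    msg["Content-Type"] = content_type
    # "dtd" is the hypothetical parameter proposed above, not a registered one.
    # get_param() strips the surrounding quotes and returns None when absent.
    return msg.get_content_type(), msg.get_param("dtd")


print(media_type_and_dtd('text/xml; dtd="-//W3C//DTD XHTML 1.0 Strict//EN"'))
print(media_type_and_dtd("text/html"))
```

A consumer could then dispatch on the public identifier rather than on
a separate media type per document type.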

2) The border attribute of the img element is declared %Length;
in the Transitional DTD, but %Pixels; in the Frameset DTD.
These are both aliases for CDATA, but post-XML validation may
need to distinguish them.

3) There is no need to have fully separate DTDs for Frameset and
Transitional; they can share declarations using conditional sections,
just as in HTML 4.0.

4) There are a variety of useless differences between
the Frameset and Transitional DTDs, mostly involving whitespace.
If the two DTDs are kept separate, these should be ironed out.

5) The LanguageCode parameter entity is defined as "NAME" in HTML 4.0,
but "CDATA" in XHTML.  The strictest equivalent of NAME in XML is
NMTOKEN, which should be used.
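
To illustrate what NMTOKEN would reject that CDATA lets through, here
is a minimal sketch. The pattern is an ASCII-only approximation of the
XML 1.0 Nmtoken production (the real one admits many non-ASCII name
characters), and the function name is hypothetical.

```python
import re

# ASCII-only approximation of XML 1.0 Nmtoken: one or more name characters.
# The full production also allows many non-ASCII characters.
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:-]+")


def is_nmtoken_ascii(value):
    """Return True if value is a non-empty run of ASCII name characters."""
    return NMTOKEN_ASCII.fullmatch(value) is not None


# Typical language codes pass; embedded spaces and empty values do not.
for code in ("en", "en-US", "i-klingon", "en US", ""):
    print(repr(code), is_nmtoken_ascii(code))
```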

6) SGML rules remove up to one line boundary at the start and/or
the end of an element's content.  Equivalently, up to one line
boundary each is removed after a start-tag and before an end-tag.
This rule affects the style, script, and pre elements, especially
pre, and should be stated in the main document, as XML-based
systems will have to emulate it.
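
A simplified sketch of the emulation an XML-based system might apply to
the content of pre, style, or script follows. It implements only the
rule as stated above (at most one line boundary removed at each end);
the full SGML record-end rules are more intricate, and the function
name is hypothetical.

```python
def strip_sgml_line_boundaries(content):
    """Drop at most one line boundary after the start-tag and before the end-tag."""
    newlines = ("\r\n", "\n", "\r")  # try the two-character form first
    for newline in newlines:
        if content.startswith(newline):
            content = content[len(newline):]
            break
    for newline in newlines:
        if content.endswith(newline):
            content = content[:-len(newline)]
            break
    return content


# Only one boundary is removed at each end; the inner ones survive.
print(repr(strip_sgml_line_boundaries("\n\nfirst line\nsecond line\n\n")))
```

An XML parser will already have normalized line ends to a single line
feed, so checking the other two forms is merely defensive.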

7) In the comment preceding the OLStyle parameter entity:
for "arablic" read "arabic".

8) Since XML is case-sensitive, OLStyle can be explicitly
defined as "(1|a|A|i|I)".  That would not work in HTML 4.0 because
a and A, and i and I, would look the same to SGML.  Consequently,
LIStyle can be explicitly defined as "(%ULStyle;|%OLStyle;)"
and the corresponding comment removed.
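
The collision can be seen in a few lines. Upper-casing here merely
stands in for the case folding that makes the tokens look the same to
SGML; the variable names are mine.

```python
ol_styles = ["1", "a", "A", "i", "I"]

# Case-sensitive comparison, as in XML: all five tokens stay distinct.
distinct_in_xml = set(ol_styles)

# Case-folded comparison, standing in for SGML's treatment in HTML 4.0:
# "a"/"A" and "i"/"I" each collapse into a single token.
distinct_when_folded = {token.upper() for token in ol_styles}

print(len(distinct_in_xml), len(distinct_when_folded))
```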

9) The content model of table does not match HTML 4.0, which requires
the TBODY element but allows both start-tag and end-tag to be
omitted.  The precedent established by XHTML for the head and body
elements is that such elements must appear explicitly.  The table
element, however, allows either tbody or tr+ after the optional
thead and tfoot.  This should be changed to just tbody.

If it is decided not to do this (on HTML 3.2 compatibility grounds
or otherwise), the design decision should be documented.

I have not reviewed the XHTML Strict DTD at this time.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
Received on Friday, 26 February 1999 15:08:08 GMT