Proposed additions to XHTML 1.0 Appendix C (was Re: HTML or XHTML - why do you use it?)

On 1/6/03 6:33 PM, "Ian Hickson" <ian@hixie.ch> wrote:

> To make sure XHTML works as both MIME types you have to ensure you do
> everything in appendix C, plus:


Many of these appear to be covered by Appendix C already.

 http://www.w3.org/TR/2002/REC-xhtml1-20020801/#guidelines

E.g.:

>  never use <!-- --> in <script> or <style>

C.4. Embedded Style Sheets and Scripts

"Note that XML parsers are permitted to silently remove the contents of
comments. Therefore, the historical practice of "hiding" scripts and style
sheets within "comments" to make the documents backward compatible is likely
to not work as expected in XML-based user agents."


>  never use namespaces

That's not in Appendix C, and I'm not sure that it should be (in spite of
any opinions I may or may not have about namespaces in general ;-)

Simply specifying a namespace with xmlns is fine, since conforming HTML4
processors are required to _ignore_ unrecognized attributes rather than
rejecting the document (as XML user agents are required to).

Are you talking about using colonized HTML tag names and attributes?

If so, then I would agree with rephrasing your suggestion to:

- never use colonized HTML tag names and attributes

<ins class="proposed">
C.x Namespaces

Since HTML UAs are not expected to recognize colonized tag names and
attributes, authors should avoid using colonized HTML tag names and
attributes.
</ins>

>  never use PIs

PIs are valid SGML.  Unrecognized PIs are merely supposed to be ignored.
There should be no problem here for a _strictly_conforming_ HTML4 UA.

Also note "C.1. Processing Instructions and the XML Declaration" already
provides a warning about this for sensitive HTML4 UAs.


>  use lowercase CSS selectors

C.13. Cascading Style Sheets (CSS) and XHTML

"CSS style sheets for XHTML should use lower case element and attribute
names."


>  explicitly include <tbody> elements

The issue is noted in C.11 in the context of scripting, and the solution is
explicitly stated in C.13 in the context of CSS.

 "Therefore you should always explicitly add a tbody element if it is
referred to in a CSS selector."

I agree that this should be made more explicit, perhaps under its own
heading.  Something like:

<ins class="proposed">
C.x  Implied/Optional elements

Certain elements were implied in HTML4 but are now optional in XHTML
(e.g. the tbody element within table).  This can result in an inconsistent
parse tree [infoset?], and thus the behavior of DOM or CSS applied to the
document may differ when the document is parsed as HTML vs. XHTML.  To
help ensure a consistent parse tree [infoset?] when a document is parsed as
either HTML or XHTML, authors should explicitly include such
implied/optional elements.
</ins>
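
To illustrate why the parse tree difference matters to scripts, here is a
rough sketch (not part of the proposed text) of a row-collecting helper that
has to cope with both shapes of tree. The name collectRows is made up for
this example, and it assumes nodes expose tagName and childNodes as DOM
elements do.

```javascript
// Hypothetical helper: gather the rows of a table whether or not the
// parser inserted a tbody (HTML) or left the tr elements as direct
// children of table (XHTML without an explicit tbody).
function collectRows(table) {
  const rows = [];
  for (let i = 0; i < table.childNodes.length; i++) {
    const child = table.childNodes[i];
    if (!child.tagName) continue; // skip text and comment nodes
    const name = child.tagName.toLowerCase();
    if (name === "tr") {
      // Row sits directly under table.
      rows.push(child);
    } else if (name === "tbody" || name === "thead" || name === "tfoot") {
      // Row sits one level down, inside a row group.
      for (let j = 0; j < child.childNodes.length; j++) {
        const row = child.childNodes[j];
        if (row.tagName && row.tagName.toLowerCase() === "tr") {
          rows.push(row);
        }
      }
    }
  }
  return rows;
}
```

If the author always writes the tbody explicitly, a script can rely on a
single tree shape and the first branch above becomes unnecessary.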


>  style the html element instead of the body element

Agreed, and this is new.  This should be added to section C.13.

<ins class="proposed">
Authors should style the html element instead of the body element.
</ins>


>  compare tagnames by lowercasing them first
>  create elements in lowercase

Agreed with both of these, and this should be explicitly stated in "C.11.
Document Object Model and XHTML".  E.g. in a paragraph after the contained
ordered list:

<ins class="proposed">
Thus, for example, scripts which compare tag names should fold them to
lowercase before the comparison, and scripts should also create elements
in lowercase.
</ins>
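
As a minimal sketch of what that could look like in script (again, not part
of the proposed text): the helper names hasTagName and
createLowercaseElement are invented for this example. It assumes, per C.11,
that an HTML DOM reports names in upper case while an XHTML DOM reports
them as written.

```javascript
// Hypothetical helper: compare an element's tag name without depending
// on the case the DOM happens to report ("TBODY" vs. "tbody").
function hasTagName(element, name) {
  return element.tagName.toLowerCase() === name.toLowerCase();
}

// Hypothetical helper: always hand createElement a lowercase name, so
// the same script creates the intended element under either parsing.
function createLowercaseElement(doc, name) {
  return doc.createElement(name.toLowerCase());
}
```

A script would then write hasTagName(node, "tbody") rather than
node.tagName == "TBODY", and createLowercaseElement(document, "div")
rather than document.createElement("DIV").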


> There are probably many more things that have to be ensured.
> I know I've forgotten some of CSS's caveats.

I'm not sure about "many more".  A number of folks have put considerable
effort into coming up with a thorough list, and while there are likely to
be a few specific things that were forgotten (as you have pointed out), I
think it is unlikely that significantly more exist (or maybe I'm just
being an optimist).


Thanks again, and of course, please _do_ point out any more suggested
Appendix C additions that you think of.


Tantek

Received on Wednesday, 8 January 2003 11:22:08 UTC