Re: HTML or XHTML - why do you use it? from Tantek Çelik on 2003-01-07 (www-html@w3.org from January 2003)

From: Tantek Çelik <tantek@cs.stanford.edu>
Date: Mon, 06 Jan 2003 18:59:29 -0800
To: Ian Hickson <ian@hixie.ch>
CC: "Peter Foti (PeterF)" <PeterF@systolicnetworks.com>, "'Nick Boalch'" <nick@fof.durge.org>, "'www-html@w3.org'" <www-html@w3.org>
Message-ID: <BA3F8272.1EBEB%tantek@cs.stanford.edu>

On 1/6/03 6:33 PM, "Ian Hickson" <ian@hixie.ch> wrote:

> On Mon, 6 Jan 2003, Tantek Çelik wrote:
>> On 1/6/03 2:48 PM, "Ian Hickson" <ian@hixie.ch> wrote:
>>> 
>>> my argument is that the XHTML specification was wrong to allow
>>> [XHTML sent as text/html].
>> 
>> It might be good to send that feedback to the proper feedback email address
>> noted in the specification so that the working group can address it as a
>> potential errata item or change for the next version etc.
> 
> Ok, will do.

Much appreciated.


>>> XHTML documents (or rather, Appendix C compliant XHTML 1.0
>>> documents) are intended to operate in HTML Tag Soup parsers.
>>> Strictly speaking, a compliant implementation of HTML 4.01 would be
>>> well within its rights to totally reject an XHTML document, since
>>> XHTML documents are not valid HTML 4.01.
>> 
>> Ian, I have heard this assertion before, and while I would lean towards
>> believing you (since I presume you would make a thorough analysis before
>> making such a claim), it would help significantly if you could provide
>> references to ALL (that you know of at least) of the precise HTML 4.01 UA
>> compliance requirements which would require a compliant HTML 4.01 UA to
>> reject a valid XHTML 1.0 document that uses the Appendix C guidelines.
> 
> None.
> 
> The HTML 4.01 spec says absolutely nothing about what to do with
> invalid documents. A UA would be compliant to the HTML 4 spec whatever
> it did.
> 
> So anything that makes an XHTML document invalid in HTML would be an
> example, including:
> 
>  The DOCTYPE.
>  The xmlns attribute.
>  The xml:lang attribute.
>  The /> syntax for empty tags.
> 
> 
>> IMHO the HTML WG should look at errata'ing any such HTML 4.01 UA
>> compliance requirements in order that a compliant HTML 4.01 UA can
>> accept valid XHTML 1.0 documents authored with the Appendix C
>> guidelines.
> 
> That's certainly an interesting idea. I can't think of any other
> things off hand, assuming Appendix-C compliance.
> 
> I'll try to compile a list of the changes that would be required.

Again, much appreciated.
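
To make the four items in Ian's list above concrete, here is a rough sketch that scans a document for them. It is a naive text scan under stated assumptions, not a validator: the name `xhtml_only_features` and the regular expressions are hypothetical, and the patterns would also match inside comments or script content.

```python
import re
from typing import List

# One hypothetical pattern per item in the list above (illustrative only).
CHECKS = {
    "XHTML DOCTYPE": re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD XHTML', re.I),
    "xmlns attribute": re.compile(r"\sxmlns\s*="),
    "xml:lang attribute": re.compile(r"\sxml:lang\s*="),
    "/> empty-tag syntax": re.compile(r"/>"),
}


def xhtml_only_features(markup: str) -> List[str]:
    """Return the names of the listed XHTML constructs found in the markup."""
    return [name for name, pattern in CHECKS.items() if pattern.search(markup)]
```

An Appendix C document would normally trigger all four checks, which is the point of the list: each one is something an HTML 4.01 validator would flag.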


>>>    UAs. Since most authors only check their documents using one or
>>>    two UAs, rather than using a validator, this means that authors
>>>    are not checking for validity, and thus most XHTML documents on
>>>    the web now are invalid. Therefore the main advantage of using
>>>    XHTML, that it has to be valid, is lost if the document is then
>>>    sent as text/html.
>>> 
>>> I am presuming that _most_ authors will fail to do so. Given the
>>> state of the Web, I feel this assumption is justified.
>> 
>> I don't doubt your assumption, just your conclusion. The advantage
>> of being able to more strictly validate a document is still there.
> 
> We've always been able to validate HTML.

Of course.  But I think XHTML has more (most of the time helpful)
constraints to ensure more consistent content, and I guess that's what I
meant by _more_ strictly validate a document.

> The key is getting UAs to
> _require_ that the documents be valid (actually, well formed, which is
> what matters the most). There isn't any chance that HTML UAs will
> _ever_ require that of text/html content.

Agreed.


>> I think the key is, that there is a desire to let HTML UAs that
>> don't support XHTML treat the markup as HTML. That is different than
>> asking for all UAs to treat the markup as HTML.
> 
> Sending markup as text/html is a signal to all UAs that the markup
> should be handled as tag soup (officially known as "HTML"), for the
> reasons given in the section labelled as "Why UAs can't handle XHTML
> sent as text/html as XML" in:
> 
>  http://www.hixie.ch/advocacy/xhtml

Oh, I don't disagree with that - what I meant was that:

There is a desire to let HTML UAs that don't support XHTML treat the markup
as HTML: send it as text/html by default, but send it as
application/xhtml+xml to UAs that explicitly claim to support XHTML.

That is different than asking for all UAs to treat the markup (sent as
text/html) as HTML.

Hopefully that helps clarify.
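
A minimal sketch of that negotiation, assuming the server decides from the HTTP Accept request header. The function name `choose_content_type` is hypothetical, and treating only an explicit `application/xhtml+xml` entry (not a wildcard such as `*/*`) as a claim of support is an assumption of this sketch.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick the MIME type for one Appendix C compliant XHTML 1.0 document."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if media_type != "application/xhtml+xml":
            continue  # wildcards are not counted as an explicit claim
        q = 1.0  # default quality value when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # unparsable q: treat as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    # Default: let HTML UAs treat the markup as HTML.
    return "text/html"
```

For example, `choose_content_type("text/html,application/xhtml+xml;q=0.9")` selects `application/xhtml+xml`, while a bare `*/*` falls back to `text/html`. A server that varies its response this way would normally also send `Vary: Accept`.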


>>> Note that it doesn't matter how soon you intend to move to an XML
>>> MIME type; if you ever intend to, you'll hit the problems.
>> 
>> True enough. However, I believe the author can just use lower case
>> element/attribute names (even in the HTML documents and related
>> scripts), and have it just work.
> 
> To make sure XHTML works as both MIME types you have to ensure you do
> everything in appendix C, plus:
> 
>  never use <!-- --> in <script> or <style>
>  never use namespaces
>  never use PIs
>  use lowercase CSS selectors
>  explicitly include <tbody> elements
>  style the html element instead of the body element
>  compare tagnames by lowercasing them first
>  create elements in lowercase
> 
> There are probably many more things that have to be ensured. I know
> I've forgotten some of CSS's caveats.

This is an excellent list, and I think you should propose it as errata to
XHTML 1.0 Appendix C, even though some of those things don't specifically
have to do with the markup (e.g. the CSS and DOM tips).
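
As an illustration of the DOM tip "compare tagnames by lowercasing them first": an HTML DOM reports element names in upper case, while an XML DOM preserves the case as authored, so a script that has to work under both MIME types cannot compare against one fixed spelling. A small sketch, where the helper name `is_tag` is hypothetical and Python stands in for whatever scripting language is in use:

```python
def is_tag(reported_name: str, expected: str) -> bool:
    """Compare an element name case-insensitively by lowercasing both sides."""
    return reported_name.lower() == expected.lower()
```

With this, `is_tag("TBODY", "tbody")` and `is_tag("tbody", "tbody")` both hold, so the same check works whether the document was parsed as text/html or as XML.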

Thanks,

Tantek
Received on Monday, 6 January 2003 21:43:45 UTC