RE: Sniffing XHTML sent as text/html from Jelks Cabaniss on 2000-10-03 (www-html@w3.org from October 2000)

From: Jelks Cabaniss <jelks@jelks.nu>
Date: Mon, 2 Oct 2000 21:08:17 -0400
To: <www-html@w3.org>
Message-ID: <NBBBICMNIPCICMKJECCBOEJODNAA.jelks@jelks.nu>

Dan Connolly mentioned (some while ago):

>  http://www.ietf.org/rfc/rfc2854.txt

where it says (among other things):


    In addition, [XHTML1] defines a profile of use of XHTML which is
    compatible with HTML 4.01 and which may also be labeled as
    text/html.

    ...

    XHTML documents (optionally) start with an XML declaration which
    begins with "<?xml" and are required to have a DOCTYPE declaration
    "<!DOCTYPE html".

    ...

    The use of an explicit charset parameter is strongly recommended.
    While [MIME] specifies "The default character set, which must be
    assumed in the absence of a charset parameter, is US-ASCII."  [HTTP]
    Section 3.7.1, defines that "media subtypes of the 'text' type are
    defined to have a default charset value of 'ISO-8859-1'".  Section
    19.3 of [HTTP] gives additional guidelines.  Using an explicit
    charset parameter will help avoid confusion.

    Using an explicit charset parameter also takes into account that the
    overwhelming majority of deployed browsers are set to use something
    else than 'ISO-8859-1' as the default; the actual default is either a
    corporate character encoding or character encodings widely deployed
    in a certain national or regional community. For further
    considerations, please also see Section 5.2 of [HTML40].


This should perhaps also mention that XHTML documents, being XML, default to
UTF-8 (or UTF-16, if a byte order mark is present) when the XML declaration
carries no encoding or is omitted altogether.  How that is reconciled with
text/* defaulting to ISO-8859-1, I'm not sure.  Perhaps it's a further
indication that text/html may be unsuitable for XHTML.
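
To make the mismatch concrete, here is a rough Python sketch that works out
which encoding a receiver would assume for the same bytes under each set of
rules.  The function names are made up for illustration, and it assumes that an
explicit charset parameter wins under both readings; it is not how any
particular browser or parser actually behaves.

```python
import re
from email.message import Message

HTTP_TEXT_DEFAULT = "iso-8859-1"  # default for text/* per the [HTTP] text quoted above
XML_DEFAULT = "utf-8"             # XML default when there is no declaration and no BOM

# Matches an encoding pseudo-attribute inside a leading XML declaration.
_XML_ENCODING = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def charset_param(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def xml_view(body):
    """Encoding an XML processor would assume from the document bytes alone."""
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"                      # UTF-16 byte order mark
    if body.startswith(b"\xef\xbb\xbf"):
        body = body[3:]                      # skip a UTF-8 byte order mark
    match = _XML_ENCODING.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return XML_DEFAULT


def compare_defaults(content_type, body):
    """Return (text/html reading, XML reading, whether the two disagree)."""
    explicit = charset_param(content_type)   # assumption: explicit charset wins in both
    as_text_html = explicit or HTTP_TEXT_DEFAULT
    as_xml = explicit or xml_view(body)
    return as_text_html, as_xml, as_text_html != as_xml


doc = b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">\n<html/>'
print(compare_defaults("text/html", doc))
print(compare_defaults("text/html; charset=UTF-8", doc))
```

With no charset parameter and no XML declaration, the two readings disagree,
which is exactly the case the RFC text leaves open.  Adding the charset
parameter, as the RFC recommends, makes them agree.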

?


/Jelks

Received on Monday, 2 October 2000 21:12:39 UTC