Re: let authors choose text/html or application/xhtml+xml (detailed review of section 1. Introduction) from Robert Burns on 2007-08-31 (public-html@w3.org from August 2007)

From: Robert Burns <rob@robburns.com>
Date: Fri, 31 Aug 2007 10:12:58 -0500
To: Sam Ruby <rubys@us.ibm.com>
Cc: Dean Edridge <dean@55.co.nz>, Dan Connolly <connolly@w3.org>, "public-html@w3.org WG" <public-html@w3.org>
Message-Id: <9C32B4AE-65FF-4219-AA6E-33FB44486089@robburns.com>

On Aug 31, 2007, at 8:52 AM, Sam Ruby wrote:

>
> Dean Edridge wrote:
>>
>> As soon as the document is given the media type "text/html" it  
>> becomes an HTML document, simple as that.
>
> Unless, of course, said document happens to contain the  
> following bytes in the first 512 octets:
>
>   0x3C 0x72 0x73 0x73

Or if it's a PNG, or if it's a JPEG, and so on. As we're learning,  
browsers are sniffing content all over the place.
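
To make the byte check quoted above concrete, here is a minimal Python sketch. It assumes a client that looks for the four octets 0x3C 0x72 0x73 0x73 (ASCII "<rss") anywhere in the first 512 octets of a payload labelled "text/html". The function name, the constants, and the "application/rss+xml" result are illustrative assumptions, not the rule of any particular browser or of the spec.

```python
# Hypothetical sketch of feed sniffing on a payload served as text/html.
RSS_MAGIC = bytes([0x3C, 0x72, 0x73, 0x73])  # the ASCII bytes "<rss"
SNIFF_WINDOW = 512  # only the first 512 octets are inspected


def sniff_text_html(body: bytes, declared: str = "text/html") -> str:
    """Return the type a sniffing client might act on (assumed rule)."""
    if declared != "text/html":
        # This sketch only second-guesses text/html.
        return declared
    if RSS_MAGIC in body[:SNIFF_WINDOW]:
        # Assumed outcome: the client treats the payload as a feed.
        return "application/rss+xml"
    return declared
```

Under these assumptions, a body beginning `<?xml version="1.0"?><rss version="2.0">` would be handled as a feed despite its "text/html" label, while the same marker appearing after the 512th octet would be ignored.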

>
> I continue to believe that the specification should define a  
> canonical media type of "application/html" for the SGML inspired  
> serialization of HTML5 and then proceed to define appropriate  
> content sniffing rules for "text/html".
>
> Furthermore "application/html" should join "text/plain" and  
> "application/xhtml+xml" as content types that are *never* sniffed.

From the data we're gathering on the wiki, "text/plain" is not a  
content type that is never sniffed [1]. Because of the long-standing  
Apache bug, where 'text/plain' is sent when Apache doesn't know the  
content type, it is one of the more important types to sniff.

>
> Of course, browsers that don't care to maintain a distinction  
> between "text/html" and "application/html" are welcome to do so --  
> as long as their support for "application/html" conforms to the  
> html5 specification.  Other, more conservative, browsers may choose  
> to maintain dual paths for either a brief, or even extended, period  
> of time, gated mainly by how close html5 as spec'ed is to html as  
> practiced.
>
> In order to accommodate those who are unable to configure their  
> servers correctly, one of the sniffing rules for determining if a  
> payload served as "text/html" is actually "application/html" should  
> involve the <meta http-equiv="Content-Type"> tag.

The filename extension could be '.html5' or just '.5'. :-)

>
> And finally, in an attempt to reduce the chances that introducing  
> another mime type is ever needed again, both "application/html" and  
> "application/xhtml+xml" should have an architected means of being  
> extended by anybody for any purpose.

Also related to the HTTP thread and the wiki page, there's been some  
expression of interest in having an HTTP header that identifies  
what sub-types, schemas, or namespaces a document includes beyond what  
its MIME type might imply. Though this supports what you're saying, it  
also suggests that more information is needed for communication between  
server and client than just MIME types (especially for extensible  
types).

Take care,
Rob

[1]: <http://esw.w3.org/topic/HTML/ContentTypeIssues>

Received on Friday, 31 August 2007 15:13:12 UTC