review of content type rules by IETF/HTTP community

The Feed/HTML sniffing review comment reminded me... since
the scope of the HTML 5 spec overlaps with the scope
of the HTTP spec, we should get review by the IETF/HTTP
community (including the W3C TAG).

I just packaged the relevant section
  http://www.w3.org/html/wg/html5/#content-type-sniffing
as an Internet Draft-to-be, with this introduction:


---8<---

The HTTP specification[HTTP], in section 14.17 Content-Type, says "The
Content-Type entity-header field indicates the media type of the
entity-body sent to the recipient".

The HTML 5 specification[HTML5] specifies an algorithm for determining
content types based on widely deployed practices and software.

These specifications conflict in some cases. (@@ extract test cases
from Step 10 of Feed/HTML sniffing (part of detailed review of
"Determining the type of a new resource in a browsing context"))

According to a straightforward architecture for content types in the
Web[META], the HTTP specification should suffice and the HTML 5
specification need not specify another algorithm. But that architecture
assumes that Web publishers (server administrators and content
developers) reliably label content. Given that labelling by Web
publishers is widely unreliable, and that software that works around
these problems is widespread, the choices seem to be:

      * Convince Web publishers to fix incorrectly labelled Web content
        and label it correctly in the future.
      * Update the HTTP specification to match widely deployed
        conventions captured in the HTML 5 draft.

While the second option is unappealing, the first option seems
infeasible.

The IETF community is invited to review the HTML 5 algorithm in
detail.

---8<---
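
To make the kind of conflict concrete, here is a rough sketch of a
client that overrides a text/html label when the body looks like a
feed. This is an illustration only, not the algorithm in the HTML 5
draft: the function name effective_type, the 512-byte window, and the
two root-element checks are all assumptions made for the example.

```python
def effective_type(declared: str, body: bytes) -> str:
    """Hypothetical, much-simplified feed sniffing (not the HTML 5 text)."""
    # Drop parameters such as "; charset=iso-8859-1" from the header value.
    media_type = declared.split(";", 1)[0].strip().lower()

    # In this sketch, only content labelled text/html is second-guessed.
    if media_type != "text/html":
        return media_type

    # Look at the start of the body, skipping a UTF-8 BOM and whitespace.
    head = body[:512].lstrip(b"\xef\xbb\xbf \t\r\n")

    # Skip an XML declaration if one is present.
    if head.startswith(b"<?xml"):
        end = head.find(b"?>")
        if end != -1:
            head = head[end + 2:].lstrip()

    # Assumed heuristics: a feed root element overrides the declared type.
    if head.startswith(b"<rss"):
        return "application/rss+xml"
    if head.startswith(b"<feed"):
        return "application/atom+xml"
    return media_type


if __name__ == "__main__":
    feed = b'<?xml version="1.0"?>\n<rss version="2.0"></rss>'
    # HTTP section 14.17 says the type is text/html; the sniffer disagrees.
    print(effective_type("text/html; charset=iso-8859-1", feed))
```

Under the HTTP text quoted above, the recipient would treat that body
as text/html; a client that sniffs along these lines would treat it as
a feed instead.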

The full text is...

http://dev.w3.org/cvsweb/~checkout~/html5/cts/html5-type-sniffing.html?rev=1.1&content-type=text/html;%20charset=iso-8859-1
Revision: 1.1 of 2007/08/17 20:35:38
http://dev.w3.org/cvsweb/html5/cts/

Note also: I'm looking for a co-author to help route feedback from
the IETF to the W3C HTML WG.

And the formatting needs some work.

I'll stand by for comments for a few days (at least) before I 
submit this for publication as an Internet Draft.

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Friday, 17 August 2007 20:46:11 UTC