Re: ISSUE-125 CCP -- change the "willful violation" note -- rev 1 from Leif Halvard Silli on 2011-01-27 (public-html@w3.org from January 2011)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Thu, 27 Jan 2011 13:25:03 +0100
To: Anne van Kesteren <annevk@opera.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <20110127132503797994.b6bb449e@xn--mlform-iua.no>

Anne van Kesteren, Thu, 27 Jan 2011 10:39:03 +0100:
> On Thu, 27 Jan 2011 08:28:35 +0100, Leif Halvard Silli 
> <xn--mlform-iua@målform.no> wrote:
>> Anne, HTML5's 'encoding sniffing algorithm' [1] uses the 'algorithm for
>> extracting an encoding from a Content-Type' [2] twice:
>> 
>>  1) before parsing: on Content-Type meta data (HTTP). [1]
> 
> It is not used here. It just generically refers to the "Content-Type 
> metadata" and does not define how you extract it.

If you are correct, then where does HTML5 specify how to handle the 
HTTP Content-Type header? 

Note that the section says:

 ]] The Content-Type metadata of a resource must be obtained and 
interpreted in a manner consistent with the requirements of the Media 
Type Sniffing specification. [MIMESNIFF] [[

And that MIMESNIFF "describes an algorithm for determining the 
effective media type of HTTP responses". [1]

> (You are right that 
> the algorithm is used twice, but both times it operates on the 
> text/html stream, not on any external data.)

Same question as above. My view is that the same algorithm is first 
used on the HTTP Content-Type and then, if necessary, on the HTTP-EQUIV 
Content-Type. Yes, it looks to me as if MIMESNIFF blurs the border 
between the HTTP header and "512 octets or more" of the text/html 
stream. But nevertheless, MIMESNIFF means HTTP's Content-Type; it 
does not speak about HTTP-EQUIV. MIMESNIFF also says that: 

]] If the user agent is configured to strictly obey the
   official-type, then let the sniffed-type be the official-type and
   abort these steps. [[

In which case no Content-Type would be consulted other than the one 
obtained from the HTTP header.

Also, in the encoding sniffing algorithm of HTML5, the 512 octets come 
in at step 3, after the 'transport layer'. We must assume that the 
'transport layer' is HTTP, and thus the same as the "official-type" 
in the MIMESNIFF draft.
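
To make the order I am arguing for concrete, here is a rough Python 
sketch. It is only an illustration of my reading, not the algorithm as 
HTML5 or MIMESNIFF specify it: the names extract_charset and 
sniff_encoding are made up, the regular expressions are a much 
simplified stand-in for the real extraction and prescan steps, and the 
512-octet limit is simply taken from the figure discussed above.

```python
import re

# Simplified stand-in for "extracting an encoding from a Content-Type".
# Hypothetical helper; the real algorithm handles more edge cases.
_CHARSET_RE = re.compile(
    r"""charset\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s;"']+))""",
    re.IGNORECASE,
)

# Simplified pattern for <meta http-equiv="Content-Type" content="...">.
# Assumes http-equiv comes before content, which real markup need not do.
_META_RE = re.compile(
    rb"""<meta\s+http-equiv\s*=\s*["']?content-type["']?\s+content\s*=\s*["']([^"']*)["']""",
    re.IGNORECASE,
)


def extract_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    match = _CHARSET_RE.search(content_type)
    if not match:
        return None
    # Pick whichever alternative (double-quoted, single-quoted, bare) matched.
    value = next((g for g in match.groups() if g), None)
    return value.strip().lower() if value else None


def sniff_encoding(http_content_type, body, prescan_limit=512):
    """Return (encoding, source), trying the HTTP header before the meta element."""
    # Step 1: the transport layer (assumed here to be the HTTP header).
    charset = extract_charset(http_content_type)
    if charset:
        return charset, "http"
    # Step 2: the same extraction, applied to the HTTP-EQUIV Content-Type
    # found in the first prescan_limit octets of the stream.
    match = _META_RE.search(body[:prescan_limit])
    if match:
        charset = extract_charset(match.group(1).decode("ascii", "replace"))
        if charset:
            return charset, "meta"
    return None, "none"
```

The point of the sketch is that one and the same extraction routine is 
called twice, first on the HTTP value and only then, if that yields 
nothing, on the value found in the document.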

(I must say that HTML5 could have described these things better. As it 
stands, one must jump back and forth and think ... For instance, the 
encoding sniffing algorithm is described twice: once in the first two 
paragraphs of section 8.2.2.1 and once again in the subsequent outline 
of the algorithm. [2])

[1] http://tools.ietf.org/html/draft-abarth-mime-sniff-06

[2] 
http://www.w3.org/TR/html5/parsing#determining-the-character-encoding

-- 
leif halvard silli

Received on Thursday, 27 January 2011 12:25:40 UTC