- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Thu, 27 Jan 2011 13:25:03 +0100
- To: Anne van Kesteren <annevk@opera.com>
- Cc: Julian Reschke <julian.reschke@gmx.de>, Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, "public-html@w3.org" <public-html@w3.org>
Anne van Kesteren, Thu, 27 Jan 2011 10:39:03 +0100:
> On Thu, 27 Jan 2011 08:28:35 +0100, Leif Halvard Silli
> <xn--mlform-iua@målform.no> wrote:
>> Anne, HTML5's 'encoding sniffing algorithm' [1] uses the 'algorithm for
>> extracting an encoding from a Content-Type' [2] twice:
>>
>> 1) before parsing: on Content-Type meta data (HTTP). [1]
>
> It is not used here. It just generically refers to the "Content-Type
> metadata" and does not define how you extract it.

If you are correct, then where does HTML5 specify how to handle the HTTP Content-Type header? Note that the section says:

]]
The Content-Type metadata of a resource must be obtained and interpreted in a manner consistent with the requirements of the Media Type Sniffing specification. [MIMESNIFF]
[[

And MIMESNIFF "describes an algorithm for determining the effective media type of HTTP responses". [1]

> (You are right that
> the algorithm is used twice, but both times it operates on the
> text/html stream, not on any external data.)

Same question as above. My view is that the same algorithm is first used on the HTTP Content-Type and then, if necessary, on the HTTP-EQUIV Content-Type.

Yes, it looks to me as if MIMESNIFF blurs the border between the HTTP header and "512 octets or more" of the text/html stream. Nevertheless, MIMESNIFF means HTTP's Content-Type - it does not speak about HTTP-EQUIV. MIMESNIFF also says:

]]
If the user agent is configured to strictly obey the official-type, then let the sniffed-type be the official-type and abort these steps.
[[

In that case there would not be any Content-Type other than the one that was obtained from the HTTP header.

Also, in the encoding sniffing algorithm of HTML5, the 512 octets come in step 3 - after the 'transport layer'. We must assume that the 'transport layer' is HTTP, and thus the same as the "official-type" in the MIMESNIFF draft.

(I must say that HTML5 could have described these things better. As is, one must jump back and forth and think ... For instance, the encoding sniffing algorithm is described twice: once in the first two paragraphs of section 8.2.2.1 and once again in the subsequent outline of the algorithm. [2])

[1] http://tools.ietf.org/html/draft-abarth-mime-sniff-06
[2] http://www.w3.org/TR/html5/parsing#determining-the-character-encoding
-- 
leif halvard silli
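
For illustration only, the reading argued for in this message (one charset-extraction routine, applied first to the HTTP Content-Type and then, if that yields nothing, to an HTTP-EQUIV Content-Type found in the first octets of the stream) could be sketched roughly as below. This is a simplified sketch under stated assumptions, not the algorithm from either draft: the names `extract_charset` and `sniff_encoding`, the regular expressions, and the `prescan_bytes` parameter are all hypothetical, and the real specifications handle many more cases.

```python
import re
from typing import Optional, Tuple

# Hypothetical, simplified stand-in for the "algorithm for extracting an
# encoding from a Content-Type"; the spec's version covers more edge cases.
_CHARSET_RE = re.compile(
    r'charset\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s;"\']+))',
    re.IGNORECASE,
)

# Hypothetical, simplified matcher for <meta http-equiv="Content-Type" content="...">.
_META_RE = re.compile(
    rb'<meta\s+http-equiv\s*=\s*["\']?content-type["\']?\s+content\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def extract_charset(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    match = _CHARSET_RE.search(content_type)
    if not match:
        return None
    value = next(g for g in match.groups() if g is not None)
    return value.strip().lower() or None


def sniff_encoding(
    http_content_type: Optional[str],
    body: bytes,
    prescan_bytes: int = 512,  # assumption: mirrors the "512 octets" mentioned above
) -> Tuple[Optional[str], str]:
    """Apply the same extraction first to the HTTP header, then to HTTP-EQUIV."""
    # First use: Content-Type metadata from the transport layer (assumed HTTP).
    if http_content_type:
        encoding = extract_charset(http_content_type)
        if encoding:
            return encoding, "transport layer"
    # Second use: the content attribute of an HTTP-EQUIV meta element,
    # searched only within the first prescan_bytes of the stream.
    match = _META_RE.search(body[:prescan_bytes])
    if match:
        encoding = extract_charset(match.group(1).decode("ascii", "replace"))
        if encoding:
            return encoding, "meta prescan"
    return None, "unknown"
```

In this sketch the HTTP header wins whenever it carries a charset, which matches the ordering described above where the 'transport layer' step precedes the step that looks at the stream.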
Received on Thursday, 27 January 2011 12:25:40 UTC