Re: HTML interpreter vs. HTML user agent

From: Geoffrey Sneddon <foolistbar@googlemail.com>
Date: Thu, 28 May 2009 15:39:53 +0100
Cc: "Sam Ruby" <rubys@intertwingly.net>, "Maciej Stachowiak" <mjs@apple.com>, "Roy T. Fielding" <fielding@gbiv.com>, "Larry Masinter" <masinter@adobe.com>, "HTML WG" <public-html@w3.org>
Message-Id: <6142F7FA-EAD8-424A-9F89-0B3988AF72A8@googlemail.com>
To: Anne van Kesteren <annevk@opera.com>

On 28 May 2009, at 14:57, Anne van Kesteren wrote:

> On Thu, 28 May 2009 15:41:36 +0200, Sam Ruby  
> <rubys@intertwingly.net> wrote:
>> I don't understand the "conformance with HTTP" part of the question.
>> I believe that the current spec'ed behavior constitutes "a willful
>> violation of the HTTP specification, which requires that the
>> Content-Type headers be honored, despite implementation experience
>> showing that this is not practical in many cases."
> Currently they completely violate HTTP. By following the rules laid
> out in HTML5 they could get much closer. (I agree that it is
> probably better for this part of HTML5 to end up with the IETF, but
> I still think it would make sense for feed readers to adhere to the
> rules as well.)
> When sniffing was discussed a while ago I remember that  
> technorati.com and a feed library gsnedders was working on made  
> their code much stricter. They're not browsers.

I can't find any reference to Technorati in the archives, but what
SimplePie does (which is what I (sorta) work on) matches an old draft
of what is in HTML 5 (from memory, the only differences don't affect
our detection of text/html versus feed types). Before that, we did what
Sam is describing and completely ignored the Content-Type header. (That
is not strictly true: we used a regex like ";\s*charset=([^;]+)" to get
a charset from it.) Only a small number of feeds broke (all served as
text/plain, to my knowledge).
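
To illustrate the charset extraction mentioned above, here is a minimal
sketch in Python. It is not SimplePie's actual code (SimplePie is
written in PHP). The function name extract_charset, the
case-insensitive match, and the stripping of whitespace and surrounding
quotes are all assumptions added for illustration; only the regex
itself comes from the message.

```python
import re
from typing import Optional

# The pattern quoted in the message: a ";" then optional whitespace,
# "charset=", then everything up to the next ";" (or end of string).
CHARSET_RE = re.compile(r";\s*charset=([^;]+)", re.IGNORECASE)


def extract_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper: the rest of the header (the media type) is
    ignored entirely, which is the behaviour described in the message.
    """
    match = CHARSET_RE.search(content_type)
    if match is None:
        return None
    # Assumption: trim whitespace and optional double quotes so that
    # charset="utf-8" and charset=utf-8 give the same result.
    return match.group(1).strip().strip('"')
```

For example, extract_charset("text/plain; charset=utf-8") yields
"utf-8", while a header with no charset parameter yields None, leaving
the caller to fall back on whatever default or sniffing it uses.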

Geoffrey Sneddon
Received on Thursday, 28 May 2009 14:40:45 UTC