Re: HTML interpreter vs. HTML user agent

From: Adam Barth <w3c@adambarth.com>
Date: Fri, 29 May 2009 14:05:30 -0700
Message-ID: <7789133a0905291405ic66d526h53f9738c344c5c7a@mail.gmail.com>
To: Sam Ruby <rubys@intertwingly.net>
Cc: Ian Hickson <ian@hixie.ch>, HTML WG <public-html@w3.org>
On Fri, May 29, 2009 at 1:57 PM, Sam Ruby <rubys@intertwingly.net> wrote:
> Adam Barth wrote:
>>
>> On Fri, May 29, 2009 at 1:37 PM, Sam Ruby <rubys@intertwingly.net> wrote:
>>>
>>> Ian Hickson wrote:
>>>>
>>>> Could you elaborate on how this doesn't work with the rules in HTML5?
>>>> (Or
>>>> rather, with the rules in Adam's ID?)
>>>
>>> That resource is served with a text/plain mime type, and therefore should
>>> not be treated as a feed.
>>
>> Perhaps we should change the algorithm to consider these documents to be
>> feeds.
>>
>>> At the present time, both IE8 and Firefox treat that document as a feed.
>>> My assessment is that Firefox will continue to follow IE's lead in this
>>> area, but I will gladly defer to those who actually work on the product.
>>
>> That sounds like another argument for changing the algorithm.
>
> At the present time Chrome will not treat that document as a feed.

How can you tell?  Chrome doesn't currently support feeds.

> My two cents: generally these rules were created not based on first
> principles, but rather based on reverse engineering a small number of
> browsers.

The rules are not based on first principles in the sense that we could
easily live in an alternate reality similar to our own but with a
different sniffing algorithm.  The rules are based on two things:

1) Reverse engineering of existing user agents.
2) Extensive empirical analysis of existing web content.
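To give a concrete flavor of the kind of rule this produces (this is an illustrative sketch, not the text of my draft — the function name, chunk size, and exact byte patterns here are my own simplifications), a feed-sniffing heuristic peeks at the initial bytes of a document and looks for an RSS or Atom root element:

```python
from typing import Optional

def sniff_feed(body: bytes) -> Optional[str]:
    """Guess a feed MIME type from the document's initial bytes,
    or return None if the bytes don't look like a feed.

    Hypothetical sketch of the style of rule under discussion;
    the real algorithm handles comments, whitespace, and more cases.
    """
    # Sniffing algorithms examine only an initial chunk of the body.
    chunk = body[:512].lstrip()

    # Skip a leading XML declaration, if present.
    if chunk.startswith(b"<?xml"):
        end = chunk.find(b"?>")
        if end != -1:
            chunk = chunk[end + 2:].lstrip()

    # Dispatch on the apparent root element.
    if chunk.startswith(b"<rss"):
        return "application/rss+xml"
    if chunk.startswith(b"<feed"):
        return "application/atom+xml"
    if chunk.startswith(b"<rdf:RDF"):
        return "application/rss+xml"  # RSS 1.0 uses an RDF root
    return None
```

The point of contention in this thread is exactly when a user agent should run a check like this: only for certain Content-Type labels, or even for documents served as text/plain.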

For more details about how the algorithm was constructed, I encourage
you to read:

http://www.adambarth.com/papers/2009/barth-caballero-song.pdf

From these observations, we formulated two security principles, but
the paper explains that better than I can reproduce here.

> If it is intended that these rules apply to another class of
> tools (e.g. feed readers), then I will suggest that reverse engineering the
> behavior of the top "n" (n>=3) consumers in that category would seem
> appropriate.

Ian and I have spent much more effort on the other sniffing rules than
on the rules for sniffing feeds.  I'd welcome feedback on how to
improve the feed rules (well, and all the rules, actually).  Your
earlier feedback was quite helpful.

Adam
Received on Friday, 29 May 2009 21:06:23 UTC