Re: Why Microsoft's authoritative=true won't work and is a bad idea from Julian Reschke on 2008-07-07 (ietf-http-wg@w3.org from July to September 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 07 Jul 2008 09:33:09 +0200
To: Ian Hickson <ian@hixie.ch>
CC: Sam Ruby <rubys@us.ibm.com>, HTTP Working Group <ietf-http-wg@w3.org>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <4871C6B5.7000601@gmx.de>

Ian Hickson wrote:
>> I wouldn't consider trusting the server supplied content type an 
>> "extreme."
> 
> Compared to the status quo, it is an extreme. (If you consider the 
> possible implementation space as a multidimensional phase space, and 
> consider the current implementations are points in phase space, they are 
> all relatively close to each other, and close to HTML5. The position that 
> involves no sniffing at all, whether that be HTTP-compliance or this new 
> authoritative=true parameter, is far, far from the browsers.)

It's an "extreme" that is currently allowed in HTML5, remember?

"If the user agent is configured to strictly obey Content-Type headers 
for this resource, then jump to the last step in this set of steps." -- 
<http://www.w3.org/html/wg/html5/#content-type0>
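
A minimal sketch of that escape hatch, in Python. The names effective_type 
and toy_sniffer are made up for illustration, and the heuristic is a 
stand-in, not the HTML5 algorithm: when the UA is configured to obey 
Content-Type strictly, the sniffer is never consulted.

```python
def toy_sniffer(declared_type, body):
    """Hypothetical stand-in heuristic; NOT the HTML5 sniffing rules."""
    # Assumption for illustration: a text/plain body that starts with an
    # XML declaration gets reinterpreted as XML.
    if declared_type == "text/plain" and body.lstrip().startswith(b"<?xml"):
        return "application/xml"
    return declared_type


def effective_type(declared_type, body, strict=False, sniff=toy_sniffer):
    """Return the media type the user agent would act on."""
    if strict:
        # Strict mode: the server-supplied type is final, no sniffing.
        return declared_type
    return sniff(declared_type, body)


feed = b'<?xml version="1.0"?><feed/>'
print(effective_type("text/plain", feed, strict=True))
print(effective_type("text/plain", feed, strict=False))
```

With strict=True the declared text/plain is returned unchanged; with 
strict=False the result depends entirely on whichever heuristic is 
plugged in.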

>> ...I don't think this answers Sam's question. What's the difference 
>> between considering the encoding as input, but not another parameter?
> 
> I've explained multiple times the difference is not in the syntax but in 
> the delta from the status quo to the behaviour required by the two 
> proposals. One is relatively close to where we are now, and by making 
> minor changes to browsers and specs, we can reach an equilibrium. The 
> other is so far away that only large changes will reach interoperability, 
> and such changes aren't stable, since they would happen over a long time 
> period and would result in a large body of legacy content that is 
> mislabelled, thus leading us right back into a content-sniffing world as 
> we are today.

It seems you are satisfied with the equilibrium HTML5 defines. Others 
are not, for instance Microsoft.

Many think that the information supplied by the server must be treated 
as authoritative, thus want to reach a *different* equilibrium. That may 
require more changes, but this doesn't mean it can't be done (despite 
what you say).

> On Sun, 6 Jul 2008, Julian Reschke wrote:
>>>> Another factor to consider is that the http working group is 
>>>> concerned with more user agents than browsers.
>>> I should hope everyone is. However, that doesn't change anything -- 
>>> it's still the same ecosystem, and the same content. We don't want 
>>> tools treating content different than each other, whether they are Web 
>>> browsers or not. ...
>> Now this is something I totally can agree with.
>>
>> In which case I'm not sure why it's the HTML working group working on 
>> this. Seems that W3C and IETF should collaborate on this one.
> 
> I would absolutely love it if the relevant groups would take this stuff 
> and specify it themselves. However, the HTTP group has already indicated 

With "it", what exactly do you mean? The thing these groups will agree 
on, or the thing you prefer personally?

> that they have no intention of defining the content sniffing rules 
> required to be compatible with legacy content. (This is just like the URL 

The IETF HTTPbis working group has no mandate to do so. Thus it would 
need to be rechartered, or a new WG would have to start.

> issue, where the URI group indicated no intention to update the URI specs 
> to be compatible with legacy content.) I've no intention of playing 
> blame-laying games; if the HTTP group doesn't want to do the work, then we will 
> instead. If the HTTP group decides to do the work, I would be very happy 
> to remove this stuff from the HTML5 spec.

There is no "URI group" -- there's a list of people subscribed to the 
URI mailing list. That being said, I haven't seen *any* kind of 
consensus that RFC3986 should be changed. I've seen some discussion 
about whether RFC3987bis should expand on the "LEIRI" topic, and it 
seems Martin Dürst was considering that input.

The difference between the sniffing issue and the URI issue is this: 
what a content-type means is totally relevant outside the HTML context; 
how an HTTP response is to be processed needs to be the same everywhere.

On the other hand, what lexical format HTML5 allows internally is 
primarily a problem for the HTML WG to decide. It just needs to define 
how the internal format maps to URI/IRI.

>> With the current text in HTML5, there's not only no "good answer" but no 
>> answer at all (except by telling users to configure their UAs to respect 
>> mime types).
> 
> This problem has nothing to do with the spec, since the spec currently 
> requires text/plain to be honoured in this case.
> 
> The "bad" answer is for Sam to stuff the top of his text/plain feeds with 
> filler content that doesn't get sniffed, so that the sniffing heuristics 
> in IE and Firefox get tricked into not seeing the feed content. (So, there 
> _is_ an answer, it's just not a good one.)

That may be a workaround that works in this case, but I doubt it's 
universally applicable.

>> Sam's use case could be made compatible by making the response 
>> distinguishable from one sent by a misconfigured server.
> 
> How is that possible?

Using Microsoft's proposal or by using a separate header, for instance.
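
A rough sketch of how a recipient could detect such a marker, assuming it 
is sent as a Content-Type parameter (for instance "text/plain; 
authoritative=true"). The function name is_authoritative is made up for 
illustration, and the exact syntax and matching rules of the proposal are 
assumptions here; only Python's standard email package is used for 
parsing.

```python
from email.message import Message


def is_authoritative(content_type_header):
    """True if the header value carries an authoritative=true parameter.

    Assumption: the marker is an ordinary Content-Type parameter and is
    compared case-insensitively; the real proposal may differ.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    value = msg.get_param("authoritative")  # None when the parameter is absent
    return isinstance(value, str) and value.lower() == "true"


print(is_authoritative("text/plain; authoritative=true"))
print(is_authoritative("text/plain; charset=utf-8"))
```

A UA could skip sniffing whenever this returns True and keep its current 
behaviour otherwise, which is what makes the labelled response 
distinguishable from one sent by a misconfigured server.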

>> At this point it seems to me that you are simply not interested in that 
>> case. Is this correct?
> 
> I would love sniffing to go away altogether. I'm so interested in this 
> particular use case that HTML5 in fact supports it _despite_ this 
> requiring changes from the two biggest browsers. What more can I do?
> 
> However, if said browsers ignore me, then I'm not going to just stick my 
> head in the sand and pretend like all is well -- the spec will change to 
> align with reality. At the end of the day, it's not up to me.

Well, the biggest vendor just put a proposal on the table that would 
make it possible to disable sniffing altogether.

Maybe it would make sense to consider it seriously, instead of 
immediately stating "won't work"?

BR, Julian
Received on Monday, 7 July 2008 07:33:59 UTC