Re: Why Microsoft's authoritative=true won't work and is a bad idea from Ian Hickson on 2008-07-06 (public-html@w3.org from July 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Sun, 6 Jul 2008 22:19:20 +0000 (UTC)
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Sam Ruby <rubys@us.ibm.com>, HTTP Working Group <ietf-http-wg@w3.org>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0807062207580.11215@hixie.dreamhostps.com>

On Sun, 6 Jul 2008, Julian Reschke wrote:
> >
> > The precise set is the set that is compatible with rendering the 
> > legacy content as expected, the minimal subset compatible with what 
> > browsers do. It can also be changed in response to browser feedback 
> > when it is discovered that it isn't quite perfect. It is far easier to 
> > incrementally move towards a set that is trying to be compatible with 
> > what the browsers already do than it is to get the browsers to jump to 
> > an extreme.
> 
> I wouldn't consider trusting the server supplied content type an 
> "extreme."

Compared to the status quo, it is an extreme. (If you consider the 
possible implementation space as a multidimensional phase space, and 
consider the current implementations as points in that space, they are 
all relatively close to each other, and close to HTML5. The position that 
involves no sniffing at all, whether that be HTTP-compliance or this new 
authoritative=true parameter, is far, far from the browsers.)


> > > This leads to the question: what is the essential difference between 
> > > "text/plain" as defined by the spec and therefore is presumed to be 
> > > workable (despite all the evidence to the contrary), and 
> > > "authoritative=true" which is being rejected out of hand as 
> > > unworkable.
> > 
> > text/plain might not be workable. If Opera and Safari find they have 
> > to change as well, then the spec will have to change too.
> 
> ...I don't think this answers Sam's question. What's the difference 
> between considering the encoding as input, but not another parameter?

I've explained multiple times that the difference is not in the syntax but in 
the delta from the status quo to the behaviour required by the two 
proposals. One is relatively close to where we are now, and by making 
minor changes to browsers and specs, we can reach an equilibrium. The 
other is so far away that only large changes will reach interoperability, 
and such changes aren't stable, since they would happen over a long time 
period and would result in a large body of legacy content that is 
mislabelled, thus leading us right back into the content-sniffing world we 
are in today.


On Sun, 6 Jul 2008, Julian Reschke wrote:
> > >
> > > Another factor to consider is that the http working group is 
> > > concerned with more user agents than browsers.
> > 
> > I should hope everyone is. However, that doesn't change anything -- 
> > it's still the same ecosystem, and the same content. We don't want 
> > tools treating content different than each other, whether they are Web 
> > browsers or not. ...
> 
> Now this is something I totally can agree with.
> 
> In which case I'm not sure why it's the HTML working group working on 
> this. Seems that W3C and IETF should collaborate on this one.

I would absolutely love it if the relevant groups would take this stuff 
and specify it themselves. However, the HTTP group has already indicated 
that they have no intention of defining the content sniffing rules 
required to be compatible with legacy content. (This is just like the URL 
issue, where the URI group indicated no intention to update the URI specs 
to be compatible with legacy content.) I've no intention of playing 
blame-laying games; if the HTTP group doesn't want to do the work, then we will 
instead. If the HTTP group decides to do the work, I would be very happy 
to remove this stuff from the HTML5 spec.


On Sun, 6 Jul 2008, Julian Reschke wrote:
> > ... If you would like the document to be processed as plain text, then 
> > there might not be a good answer for you, sorry. Your use case is 
> > incompatible with the use case of the many users who want to see feeds 
> > sent as text/plain handled as feeds. Enough people mislabel their 
> > feeds as text/plain that in practice documents labeled as text/plain 
> > are, in some browsers, sniffed for feeds before being treated as plain 
> > text. ...
> 
> With the current text in HTML5, there's not only no "good answer" but no 
> answer at all (except by telling users to configure their UAs to respect 
> mime types).

This problem has nothing to do with the spec, since the spec currently 
requires text/plain to be honoured in this case.

The "bad" answer is for Sam to stuff the top of his text/plain feeds with 
filler content that doesn't get sniffed, so that the sniffing heuristics 
in IE and Firefox get tricked into not seeing the feed content. (So, there 
_is_ an answer, it's just not a good one.)
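
To illustrate why filler content at the top can defeat this kind of 
sniffing, here is a minimal sketch of a feed sniffer. It is not the actual 
IE or Firefox heuristic: the function name, the 512-byte window, and the 
marker list are all assumptions made for illustration.

```python
# Hypothetical feed sniffer; not the real IE or Firefox algorithm.
SNIFF_WINDOW = 512                  # assumed number of leading bytes inspected
FEED_MARKERS = (b"<rss", b"<feed")  # assumed feed signatures


def effective_type(declared_type: str, body: bytes) -> str:
    """Return the type this hypothetical sniffing browser would act on."""
    if declared_type != "text/plain":
        # In this sketch, only text/plain responses are second-guessed.
        return declared_type
    head = body[:SNIFF_WINDOW]
    if any(marker in head for marker in FEED_MARKERS):
        return "feed"
    return "text/plain"


feed = b'<?xml version="1.0"?>\n<rss version="2.0"><channel></channel></rss>'
filler = b" " * SNIFF_WINDOW + b"\n"

print(effective_type("text/plain", feed))           # sniffed as a feed
print(effective_type("text/plain", filler + feed))  # markers fall outside the window
```

Under these assumptions the padded response is shown as plain text only 
because the feed markers land beyond the inspected window, which is why this 
workaround is fragile: it breaks as soon as a browser inspects more bytes.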


> Sam's use case could be made compatible by making the response 
> distinguishable from one sent by a misconfigured server.

How is that possible?


> At this point it seems to me that you are simply not interested in that 
> case. Is this correct?

I would love sniffing to go away altogether. I'm so interested in this 
particular use case that HTML5 in fact supports it _despite_ this 
requiring changes from the two biggest browsers. What more can I do?

However, if said browsers ignore me, then I'm not going to just stick my 
head in the sand and pretend like all is well -- the spec will change to 
align with reality. At the end of the day, it's not up to me.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Sunday, 6 July 2008 22:20:01 UTC