The robustness principle, as view by user agent implementors (Re: Working Group Last Call: draft-ietf-httpbis-content-disp-02) from Adam Barth on 2010-10-03 (ietf-http-wg@w3.org from October to December 2010)

From: Adam Barth <w3c@adambarth.com>
Date: Sun, 3 Oct 2010 12:14:37 -0700
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <AANLkTimRbrxvX7k0_cmbBS0XEZzjkMm5z5kNZFPePce0@mail.gmail.com>

Thanks for your thoughtful message.  There's a cultural gap here that
sometimes makes it difficult to understand why user agent implementors
have a different perspective from server operators on some issues.  In
this message, I'll try to explain why we have the perspective we do.
Note that I'm not trying to convince you to adopt our perspective; I'm
just hoping this message will lead to more understanding.

For the remainder of this email, I'd like to ask you to drop all value
judgements about what's "good" or "bad", "valid" or "invalid."  These
are social constructs that we layer on top of the underlying bits and
bytes.  I'll try to present the issue in a value-neutral way, and I
hope you read it in that way.

Imagine, for a moment, that there's a certain network message that
servers send to user agents.  For the sake of discussion, let's call it
the Foo header.  Let's assume there are a number of different
semantics implemented by the different user agents.  The set of all
possible Foo messages breaks down as follows:

All messages
    >=
Messages that are meaningful in at least one semantics
    >=
Messages with the same meaning in all semantics
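
A minimal sketch of this hierarchy in Python, where ">=" on sets happens
to be the superset test.  The user agent names, header values, and
meanings are hypothetical, and the sketch assumes that "same meaning in
all semantics" requires every user agent to assign the message a meaning:

```python
# Hypothetical toy model: each user agent's semantics maps a raw Foo header
# value to a meaning; messages it cannot interpret are simply absent.
SEMANTICS = {
    "ua_a": {"foo=1": "one", "foo = 1": "one", "foo=01": "one"},
    "ua_b": {"foo=1": "one", "foo = 1": "one", "foo=01": "octal-one"},
    "ua_c": {"foo=1": "one"},
}
ALL_MESSAGES = {"foo=1", "foo = 1", "foo=01", "garbage"}


def meaningful_somewhere(semantics):
    """Messages that at least one user agent assigns a meaning to."""
    return set().union(*(table.keys() for table in semantics.values()))


def same_meaning_everywhere(semantics):
    """Messages that every user agent interprets, all in the same way."""
    tables = list(semantics.values())
    shared = set.intersection(*(set(table) for table in tables))
    return {m for m in shared if len({table[m] for table in tables}) == 1}


somewhere = meaningful_somewhere(SEMANTICS)
everywhere = same_meaning_everywhere(SEMANTICS)

# The chain from the text: each set contains the next.
assert ALL_MESSAGES >= somewhere >= everywhere
```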

Now, let's assume servers both generate messages and have an intended
semantics in mind.  Of course, we can't observe the server's intended
semantics, but let's imagine each server has some semantics in mind
for its messages.  Furthermore, let's make the simplifying assumption
that the intended semantics for each message is implemented by at
least one user agent (i.e., the server isn't totally insane).

All messages
  >=
Messages generated by at least one server
  >=
Messages generated by at least one server that can be interpreted with
a single semantic theory (this set is not unique)
  >=
Messages generated by at least one server that have the same meaning in
all semantics

For the time being, let's assume that the existing entities (both user
agents and servers) are immutable.  How is a new entrant into this
market incentivized to behave?

A server wishes the message it generates to be interpreted the same
way by all user agents (or as many as feasible).  If the set of
messages with the same meaning in all semantics is large enough, a new
server entrant will be incentivized to generate messages in that
subset.  Conversely, a user agent wishes to understand the intended
semantics of the greatest number of messages.  Therefore, the user agent
is incentivized to process the largest set of messages generated by at
least one server that can be interpreted by a single semantic theory.

Notice that the set of messages selected by the new server entrant is a
subset of the set selected by the new user agent entrant.
There's no reason to believe these sets will be equal.  To the extent
that standards explain to new entrants how they ought to behave,
servers wish to know about the set of messages with the same meaning
in all semantics, but user agents wish to know about the largest set
of messages generated by at least one server that can be interpreted
with a single semantic theory.
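
The two incentives can be sketched in Python under stated assumptions.
All names and data are hypothetical, a "semantic theory" is modelled as
assigning one meaning per message, and the user agent resolves conflicts
by picking the most commonly intended meaning (one possible tie-break,
since the largest set is not unique):

```python
from collections import Counter, defaultdict

# Hypothetical data: how each existing user agent interprets each message.
SEMANTICS = {
    "ua_a": {"m1": "x", "m2": "y", "m3": "z"},
    "ua_b": {"m1": "x", "m2": "y", "m4": "w"},
    "ua_c": {"m1": "x", "m2": "q", "m3": "z", "m5": "v"},
}

# Hypothetical data: (message, intended meaning) pairs sent by existing
# servers.  Every intended meaning is implemented by at least one user agent.
TRAFFIC = [
    ("m1", "x"), ("m1", "x"),
    ("m2", "y"), ("m2", "y"), ("m2", "q"),
    ("m3", "z"), ("m4", "w"), ("m5", "v"),
]


def server_target(semantics):
    """New server: messages every user agent interprets identically."""
    tables = list(semantics.values())
    shared = set.intersection(*(set(table) for table in tables))
    return {m for m in shared if len({table[m] for table in tables}) == 1}


def user_agent_theory(traffic):
    """New user agent: one meaning per observed message, chosen to match
    the intent of as many observed messages as possible."""
    votes = defaultdict(Counter)
    for message, intended in traffic:
        votes[message][intended] += 1
    return {m: counts.most_common(1)[0][0] for m, counts in votes.items()}


safe_for_servers = server_target(SEMANTICS)
theory = user_agent_theory(TRAFFIC)

# The server's set sits inside the set the user agent wants to handle.
assert safe_for_servers <= set(theory)
```

In this toy data the new server would restrict itself to "m1", while the
new user agent wants a meaning for all five messages, so the two sets
differ even though one contains the other.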

(There's a little more to the story to explain why user agents also
want a complete semantic theory, one that defines the semantics of all
messages, but I can explain that in another message.)

Adam


On Sat, Oct 2, 2010 at 5:59 PM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> * Adam Barth wrote:
>>Insulting an important constituency is unlikely to generate consensus
>>or win that constituency over to your point of view.
>
> There is a popular German short story by Peter Bichsel, "Ein Tisch ist
> ein Tisch" (a desk is a desk). He tells of an old man who is trapped in
> his daily routine, everything the same all day. One day he takes a walk
> and for no particular reason he realizes he likes many things about it.
> He thinks that now everything will change, but when he comes home there
> is still the same desk and still the chairs, the same bed, and he hates
> it.
>
> "Always the same desk", he says, "always the same chairs." And he starts
> wondering why a desk is called "desk" and the chairs called "chairs". He
> wonders why the bed isn't called "ball" and is amused by the thought.
> "Now everything is going to change!", and from now on he calls the bed
> "ball". "I am tired and want to go to ball."
>
> Next day he ponders what to call the other things in his room; his desk
> becomes "carpet", his chair "clock", his clock "photo album" and so on.
> So in the morning he lies in his ball and around nine the photo album
> rings. He is greatly amused and continues renaming things and replaces
> the verbs also. He now had a language all for himself.
>
> Eventually he started having trouble translating between his language
> and the language the people around him use, to a point where he became
> afraid to talk to them and started laughing when he heard others talk.
> He stopped talking.
>
> Meaningful communication requires that sender and recipient agree on the
> meaning of what is being exchanged; otherwise they will fail to
> communicate. Saying that the specification of a communication protocol
> is only for "servers" is the equivalent of the old man telling you what
> words he is using, but not telling you how you are to understand them.
> Since the old man and you are using the same words, you would learn
> nothing from that. A specification only for servers would at best make
> any sense at all if there already was a higher-level specification that
> actually defines the protocol, and the other specification is just a
> subset. This is not the case here. This specification defines "If you
> send this, you can expect it to be understood in this way"; "If you
> receive this, you should understand it in this way." It addresses
> senders and receivers alike. The idea that a communication protocol
> specification is only for "servers" is silly.
>
> What you rather want the specification to say is that there is a
> considerable amount of content that does work in major implementations
> in a specific way, but whose meaning is not defined by the
> specification, and that that is a severe problem implementers will be
> interested in addressing and be reasonably able to do so. So far you
> have provided no evidence of that being the case, and my rudimentary
> tests suggest you are mistaken.
>
> If you can come up with something that works in all of IE6, Firefox,
> and an independent third browser of your choosing, and demonstrate this
> issue comes up for let's say 1 in 10,000 downloads with a
> content-disposition header, where falling back to heuristics for the
> filename would have substantial impact on the user experience, then I'd
> be more than happy to talk about that. In my data I see only unquoted
> spaces that do not work across major browsers, and headers that specify
> only the filename (most of which are no problem because the filename
> they specify is redundant with the typical heuristics) that occur at a
> frequency worth mentioning. You are also more than welcome to propose
> alternate metrics alongside your evidence. Vague suggestions that the
> specification is inadequate are not helpful.
> --
> Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
> Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
> 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
>
Received on Sunday, 3 October 2010 19:20:49 UTC