Re: The robustness principle, as view by user agent implementors (Re: Working Group Last Call: draft-ietf-httpbis-content-disp-02) from Julian Reschke on 2010-10-03 (ietf-http-wg@w3.org from October to December 2010)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 03 Oct 2010 22:47:33 +0200
To: Adam Barth <w3c@adambarth.com>
CC: Bjoern Hoehrmann <derhoermi@gmx.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4CA8EBE5.20908@gmx.de>

On 03.10.2010 22:30, Adam Barth wrote:
> ...
> Concretely, my proposal is that the specification should not forbid
> user agents from %-decoding the value of the filename parameter.
> Julian has agreed that neither Internet Explorer nor Chrome is going
> to stop %-decoding the filename parameter anytime soon.  Forbidding
> user agent from processing these messages in this way is fiction.
> ...

Implementing %-unescaping here is a bug.

I agree it's unlikely to disappear from IE and Chrome anytime soon, so a 
warning is probably a good idea.

I do not believe we can do more than that, because

(a) the 4 other UAs tested do the right thing, and

(b) there's no other way to deliberately send filenames containing these 
character sequences.
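
To illustrate (b), here is a minimal sketch in Python. The two functions are hypothetical stand-ins for the two UA behaviours, not anybody's actual code, and the file name is made up. It shows that a name containing a literal "%20" cannot be sent so that both kinds of UA end up with the same result:

```python
from urllib.parse import unquote

def as_is(value: str) -> str:
    """Hypothetical UA that uses the filename parameter value unchanged."""
    return value

def percent_decoded(value: str) -> str:
    """Hypothetical UA that %-decodes the filename parameter value."""
    return unquote(value)

# The server wants the saved file to be called exactly this:
wanted = "report%20final.txt"

# Attempt 1: send the name unchanged. Only the as-is UA gets it right.
sent = wanted
print(as_is(sent) == wanted, percent_decoded(sent) == wanted)

# Attempt 2: escape "%" as "%25" for the decoding UA. Now only that one is right.
sent = wanted.replace("%", "%25")
print(as_is(sent) == wanted, percent_decoded(sent) == wanted)
```

Neither value works for both, so a server would be back to UA sniffing.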

> Now, I'm fine with forbidding servers from generating %-encoded
> values.  In fact, I believe that would be desirable.  However, just
> because we forbid servers from generating the values does not mean
> that we must also forbid user agents from consuming them.

I'd be more than happy to recommend something else, if there were a 
"something else" we could recommend.

> ...
> The problem is your change proposal focuses on uncovering the set of
> messages with the same meaning in all semantics.  While that is of
> interest to servers (generators), that's not of particular interest to
> user agent implementors, as explained before.  Instead, user agent
> implementors are interested in the largest set of messages generated
> by at least one server that can be interpreted with a single semantic
> theory.
> ...

I'm sure they are interested in that. But I bet they are also interested in

- unambiguous parsing,

- consistency,

- code reuse (between header fields),

- and compatibility with other UAs.

C-D is a mess; we can all agree on this. RFC 5987 and this I-D are an 
attempt to make things better. My main goal is to get to a situation 
where servers can emit C-D headers with filenames containing "arbitrary" 
characters from the Unicode character set (minus whitespace, control, 
path delims, whatnot) without having to do UA sniffing.

To get there, we need IE/Chrome/Safari to implement RFC 2231/5987, or 
we need to invent something new that everybody is willing to accept. So 
far I haven't seen a serious alternative proposal.

Once everybody implements RFC 5987, broken support for "filename" just 
doesn't matter anymore, because you can always use the new encoding 
("filename*"). But unfortunately we aren't there yet.
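
For reference, a rough sketch in Python of what a server could emit once that is the case. The function name, the ASCII fallback argument, and the sample file name are assumptions made for illustration; the safe set is my reading of the attr-char production in RFC 5987:

```python
from urllib.parse import quote

# attr-char from RFC 5987, minus ALPHA/DIGIT, which quote() never escapes.
ATTR_CHAR_SAFE = "!#$&+-.^_`|~"

def content_disposition(filename: str, ascii_fallback: str) -> str:
    """Build a header value carrying both "filename" and "filename*".

    ascii_fallback is a caller-supplied plain-ASCII name for UAs that
    only understand the traditional "filename" parameter.
    """
    # Escape backslash and double quote for the quoted-string form.
    fallback = ascii_fallback.replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encode the UTF-8 octets of everything that is not an attr-char.
    encoded = quote(filename, safe=ATTR_CHAR_SAFE)
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"

print(content_disposition("€ rates.txt", "EURO rates.txt"))
# attachment; filename="EURO rates.txt"; filename*=UTF-8''%E2%82%AC%20rates.txt
```

A UA that implements RFC 5987 would pick "filename*"; the others fall back to the plain "filename" value.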

Best regards, Julian

Received on Sunday, 3 October 2010 20:48:08 UTC