Re: TICKET 259: 'treat as invalid' not defined

Date: Sun, 12 Dec 2010 15:36:33 -0800
Message-ID: <AANLkTi=85-HjPqk96ZihZG+Lvfdmso8fLbmfCVu8eSBB@mail.gmail.com>
To: Mark Nottingham <mnot@mnot.net>
Cc: Julian Reschke <julian.reschke@gmx.de>, httpbis <ietf-http-wg@w3.org>
On Sun, Dec 12, 2010 at 3:11 PM, Mark Nottingham <mnot@mnot.net> wrote:
> On 12/12/2010, at 10:03 AM, Adam Barth wrote:
>>>> Does this imply \-decoding?  We don't want to do \-decoding.
>>>
>>> Yes, that's implied by quoted-string.
>>
>> Ok, then that's not acceptable.  We don't want to do \-decoding.
>
> Generally in the IETF we act as individuals making technical arguments, not as representatives of companies that are negotiating product roadmaps.
>
> So, saying "we don't want to do \-decoding" isn't helpful. Can you please give us some technical reasoning for people to consider?

Sure.  I feel like I'm repeating myself, but I'll write it all here for clarity.

From Julian's http://greenbytes.de/tech/tc2231/#attwithasciifnescapedchar test:

Content-Disposition: attachment; filename="f\oo.html"

FF3	fail (apparently does not treat the backslash as escape character,
replaces it with '_' (see Mozilla Bug 588389))
MSIE8	fail (apparently does not treat the backslash as escape
character, replaces it with '_')
Opera	pass
Safari	fail (apparently does not treat the backslash as escape
character, replaces it with '-')
Konq	pass
Chrome	fail (saves "oo.html" (skips the unescaping of "\", and then
mistakes it for a path separator, see Chrome Issue 52577))
Chrome9	fail (saves "oo.html" (skips the unescaping of "\", and then
mistakes it for a path separator, see Chrome Issue 52577))
Android	fail (apparently does not treat the backslash as escape
character, replaces it with '-')

The only browsers in his test that \-decode the filename parameter are
Opera and Konqueror.  The other browsers, representing some 99% of the
market, do not \-decode.
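
To make the difference concrete, here is a minimal sketch of the two
behaviors for the test header above.  The function names are
hypothetical and this is not any browser's actual code;
replace_backslash only imitates the "replaces it with '_'" result
reported in the table.

```python
def unescape_quoted_string(value: str) -> str:
    """Apply quoted-pair decoding: a backslash makes the next character literal."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            # Take the escaped character as-is; a trailing lone backslash is dropped.
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


def replace_backslash(value: str, replacement: str = "_") -> str:
    """Hypothetical stand-in for the 'replaces it with _' behavior reported above."""
    return value.replace("\\", replacement)


raw = r"f\oo.html"  # the characters between the quotes in the test header
print(unescape_quoted_string(raw))  # foo.html
print(replace_backslash(raw))       # f_oo.html
```

With \-decoding the saved name would be "foo.html"; without it the
backslash survives long enough to be replaced or otherwise sanitized.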

Based on this data alone, I'd be quite hesitant to implement \-decoding
in my hypothetical user agent.  Worse, the \ character is actually
quite commonly used in file paths because it is the path separator on
Windows.  It seems entirely likely that some number of servers send
absolute paths rather than file names,

Content-Disposition: attachment; filename="C:\foo\newsheet.html"

or even just relative paths, like "foo\bar.html".
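
As a rough illustration of what is at stake for such values, the
sketch below compares quoted-pair decoding with treating the backslash
as a path separator and keeping only the last component.  Both helpers
are hypothetical and written for this example only; they are
assumptions about the two strategies, not a description of how any
particular user agent is implemented.

```python
import re


def quoted_pair_decode(value: str) -> str:
    # Drop each backslash and keep the character it escapes.
    return re.sub(r"\\(.)", r"\1", value)


def last_path_component(value: str) -> str:
    # Assumption: treat "\" as a Windows path separator and keep the final segment.
    return value.rsplit("\\", 1)[-1]


for raw in (r"C:\foo\newsheet.html", r"foo\bar.html"):
    print(raw, "->", quoted_pair_decode(raw), "|", last_path_component(raw))
```

Under these assumptions, decoding turns the first value into
"C:foonewsheet.html", while the path-separator reading yields
"newsheet.html", which is presumably closer to what such a server
intended.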

Now, I haven't gone out and measured the prevalence of the \ character
in the headers on the web.  If someone would like to run that experiment,
I'd certainly be open to considering that data.  However, in the
absence of such data, I think it's unlikely that browsers will change
their behavior.

Finally, I don't think we should require user agents to implement
behavior we have reason to believe will not be implemented by a number
of major implementations.  Every time we do that, we take one more
step down the path to irrelevance.  I'd like to see more engagement
between the browser vendors and the IETF, not less.

> Also, if you do say "we" please be very clear about its scope -- it's not clear if you're speaking informally about you and some friends, or you're formally representing Google, or the Chrome browser team, or...

I shouldn't have used the term "we."  I was just frustrated at Julian.