Re: TICKET 259: 'treat as invalid' not defined from Julian Reschke on 2010-12-13 (ietf-http-wg@w3.org from October to December 2010)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 13 Dec 2010 09:36:48 +0100
To: Adam Barth <ietf@adambarth.com>
CC: Mark Nottingham <mnot@mnot.net>, httpbis <ietf-http-wg@w3.org>
Message-ID: <4D05DB20.3030009@gmx.de>

On 13.12.2010 00:36, Adam Barth wrote:
> ...
> The only browsers in his test that \-decode the filename parameter are
> Opera and Konqueror.  The other browsers, representing some 99% of the
> market, do not \-decode.
>
> Based on this data alone, I'd be quite hesitant to implement \-decoding
> in my hypothetical user agent.  Worse, the \ character is actually
> quite commonly used in file paths because it is the path separator on
> Windows.  It seems entirely likely that some number of servers send
> absolute paths rather than file names,
>
> Content-Disposition: attachment; filename="C:\foo\newsheet.html"
>
> or even just relative paths, like "foo\bar.html".
>
> Now, I haven't gone out and measured the prevalence of the \ character
> in the headers on the web.  If someone would like to run that experiment,
> I'd certainly be open to considering that data.  However, in the
> absence of such data, I think it's unlikely that browsers will change
> their behavior.
> ...

We are going in circles, and have been for months now.

We discussed this just days ago. See 
<http://lists.w3.org/Archives/Public/ietf-http-wg/2010OctDec/0536.html>, 
where I said:

-- snip --
"Fixing" means "changing things to work as specified".

So the question here is whether it would break things because there are
servers sending unescaped backslashes. As far as I can tell, sending
path separators in the filename indicates a bug in the sender, or an
attempt to trick the user agent into doing something it's not supposed to do.

So the "harm" of actually doing the unescaping would be that for a
filename that needs to be postprocessed anyway, the problematic
character would be filtered in a different way.

Starting with

    filename="a\bc"

the broken implementation sees "a" and "bc" separated by a path
separator, and will post-process this to "abc", "a_bc" or "bc" (where _
could be a different replacement character).

A correct implementation sees "abc".

I don't think there's a problem here.
-- snip --
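
To make the comparison above concrete, here is a minimal sketch in Python.
It is illustrative only and not taken from any browser: the function names
are hypothetical, and the two "non-decoding" helpers are assumptions about
how a user agent that skips \-decoding might post-process the value.

```python
def unquote_quoted_string(value: str) -> str:
    """Decode a quoted-string: strip the quotes and resolve quoted-pairs."""
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    chars = iter(value[1:-1])
    for ch in chars:
        if ch == "\\":
            # quoted-pair: drop the backslash, keep the next character as-is
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


def replace_separator(value: str, replacement: str = "_") -> str:
    """Hypothetical non-decoding UA: backslash is a path separator to replace."""
    return value[1:-1].replace("\\", replacement)


def keep_last_segment(value: str) -> str:
    """Hypothetical non-decoding UA: keep only what follows the last backslash."""
    return value[1:-1].rsplit("\\", 1)[-1]


param = r'"a\bc"'                      # the filename parameter value as sent
decoded = unquote_quoted_string(param)  # "abc"
replaced = replace_separator(param)     # "a_bc"
stripped = replace_separator(param, "") # "abc"
last = keep_last_segment(param)         # "bc"
```

In every variant the backslash itself is gone from the resulting filename;
the results differ only in what, if anything, takes its place.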

You did reply to that in 
<http://lists.w3.org/Archives/Public/ietf-http-wg/2010OctDec/0538.html> with

"None of the user agents do \-decoding.  I don't see any value in them 
starting."

...which I quite frankly didn't find helpful. It appears that we have 
differing opinions on whether it's useful to have consistent handling of 
specific syntactical constructs.

I believe having different parsers for quoted-string depending on the 
context they appear in is both a bad idea in general, and also not 
*needed* here.

> Finally, I don't think we should require user agents to implement
> behavior we have reason to believe will not be implemented by a number
> of major implementations.  Every time we do that, we take one more

On the other hand, we also shouldn't require user agents to break the 
spec that has been around for ~10 years when there's no compelling 
reason to do so.

> step down the path to irrelevance.  I'd like to see more engagement
> between the browser vendors and the IETF, not less.

Yes, so do I. That's why I'd like to hear from the other browser makers 
(not necessarily vendors), and also observe how far we'll get with 
fixing the C-D bugs in Firefox once they are past the FF4 release.

> ...

Best regards, Julian

Received on Monday, 13 December 2010 08:37:28 UTC