# Re: TICKET 259: 'treat as invalid' not defined

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 13 Dec 2010 09:36:48 +0100
Message-ID: <4D05DB20.3030009@gmx.de>
CC: Mark Nottingham <mnot@mnot.net>, httpbis <ietf-http-wg@w3.org>

On 13.12.2010 00:36, Adam Barth wrote:
> ...
> The only browsers in his test that \-decode the filename parameter are
> Opera and Konqueror.  The other browsers, representing some 99% of the
> market, do not \-decode.
>
> Based on this data alone, I'd be quite hesitant to implement \-decoding
> in my hypothetical user agent.  Worse, the \ character is actually
> quite commonly used in file paths because it is the path separator on
> Windows.  It seems entirely likely that some number of servers send
> absolute paths rather than file names,
>
> Content-Disposition: attachment; filename="C:\foo\newsheet.html"
>
> or even just relative paths, like "foo\bar.html".
>
> Now, I haven't gone out and measured the prevalence of the \ character
> in the headers on the web.  If someone would like to run that experiment,
> I'd certainly be open to considering that data.  However, in the
> absence of such data, I think it's unlikely that browsers will change
> their behavior.
> ...

We are going in circles, and have been for months now.

We discussed this just 10 days ago. See
<http://lists.w3.org/Archives/Public/ietf-http-wg/2010OctDec/0536.html>,
where I said:

-- snip --
"Fixing" means "changing things to work as specified".

So the question here is whether it would break things because there are
servers sending unescaped backslashes. As far as I can tell, sending
path separators in the filename indicates a bug in the sender, or an
attempt to trick the user agent into doing something it's not supposed
to do.

So the "harm" of actually doing the unescaping would be that for a
filename that needs to be postprocessed anyway, the problematic
character would be filtered in a different way.

Starting with

filename="a\bc"

the broken implementation sees "a" and "bc" separated by a path
separator, and will post-process this to "abc", "a_bc" or "bc" (where _
could be a different replacement character).

A correct implementation sees "abc".

I don't think there's a problem here.
-- snip --
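
To make the comparison in the snippet above concrete, here is a minimal
sketch in Python. It assumes the quoted-pair rule for quoted-string (a
backslash followed by any character stands for that character) and works on
the text between the double quotes. The function names and the three
post-processing strategies are hypothetical illustrations, not taken from
any particular user agent.

```python
def unescape_quoted_string(value: str) -> str:
    """Decode quoted-pairs in the body of a quoted-string.

    `value` is the text between the surrounding double quotes.
    A backslash followed by a character yields that character;
    a lone trailing backslash is kept as-is (an assumption of this sketch).
    """
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            out.append(value[i + 1])  # drop the backslash, keep the next char
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def postprocess_without_unescaping(value: str, replacement: str = "_") -> dict:
    """Hypothetical outcomes when the backslash is treated as a path separator."""
    return {
        "strip": value.replace("\\", ""),             # remove the separator
        "replace": value.replace("\\", replacement),  # substitute a safe character
        "basename": value.rsplit("\\", 1)[-1],        # keep only the last segment
    }


if __name__ == "__main__":
    print(unescape_quoted_string(r"a\bc"))          # abc
    print(postprocess_without_unescaping(r"a\bc"))  # strip/replace/basename variants
    print(unescape_quoted_string(r"C:\foo\newsheet.html"))
```

In both cases the backslash never reaches the file system; the only
difference is which harmless name comes out at the end.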

You did reply to that in
<http://lists.w3.org/Archives/Public/ietf-http-wg/2010OctDec/0538.html> with

"None of the user agents do \-decoding.  I don't see any value in them
starting."

...which I quite frankly didn't find helpful. It appears that we have
differing opinions on whether it's useful to have consistent handling of
specific syntactical constructs.

I believe having different parsers for quoted-string depending on the
context they appear in is both a bad idea in general, and also not
*needed* here.

> Finally, I don't think we should require user agents to implement
> behavior we have reason to believe will not be implemented by a number
> of major implementations.  Every time we do that, we take one more

On the other hand, we also shouldn't require user agents to break the
spec that has been around for ~10 years when there's no compelling
reason to do so.

> step down the path to irrelevance.  I'd like to see more engagement
> between the browser vendors and the IETF, not less.

Yes, so do I. That's why I'd like to hear from the other browser makers
(not necessarily vendors), and also observe how far we'll get with
fixing the C-D bugs in Firefox once they are past the FF4 release.

> ...

Best regards, Julian

Received on Monday, 13 December 2010 08:37:28 UTC
