
Re: \-decoding filename parameters [general issue now #270]

From: Julian Reschke <julian.reschke@gmx.de>
Date: Fri, 04 Feb 2011 09:17:00 +0100
Message-ID: <4D4BB5FC.2000603@gmx.de>
To: Mark Nottingham <mnot@mnot.net>
CC: Adam Barth <ietf@adambarth.com>, httpbis <ietf-http-wg@w3.org>
On 04.02.2011 06:01, Mark Nottingham wrote:
>
> On 04/02/2011, at 5:52 AM, Julian Reschke wrote:
>>
>> That has bugged me for some time, and I wasn't sure whether I was just too pedantic and the answer is obvious. We should treat this as something we need to fix.
>>
>> That being said: the spec defines quoted-pair in a way that allows other characters to be escaped as well. If we don't change that, we need to say what escaping them means.
>
> Agreed.
>
> I think we need to at least consider changing it, because escaping them doesn't serve any useful purpose, and it's not well-specified or well-implemented.
>
> I've created <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/270> to track it.
>
>> The grammar in HTTP was inspired by RFC 822 and successors, and RFC 5322 says:
>>
>> "Where any quoted-pair appears, it is to be interpreted as the character alone. That is to say, the "\" character that appears as part of a quoted-pair is semantically "invisible"." -- <http://greenbytes.de/tech/webdav/rfc5322.html#rfc.section.3.2.1>
>>
>> This happens to match my intuitive understanding of what escape characters are for, so I'd propose that we adopt that.
>
> How widely supported is it in MIME implementations?
>
> We've already strayed significantly from MIME (e.g., line folding), so I don't think we should be constrained by it (although of course, we shouldn't needlessly stray).

As I said before, I believe that the MIME rule describes common sense in 
escaping, and the fact that it wasn't stated in RFC 2068/2616 was simply 
an oversight.
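
To illustrate the RFC 5322 rule quoted above, here is a minimal Python 
sketch of a recipient that treats the backslash of every quoted-pair as 
"invisible". The function name unquote is made up for this example and is 
not taken from any spec or implementation; the sketch also does not check 
that the characters between the quotes are valid qdtext.

```python
def unquote(quoted: str) -> str:
    """Decode a quoted-string, dropping the backslash of each quoted-pair."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    chars = iter(quoted[1:-1])
    for ch in chars:
        if ch == "\\":
            # quoted-pair: skip the backslash, keep the next character as-is
            ch = next(chars, None)
            if ch is None:
                raise ValueError("dangling backslash at end of quoted-string")
        out.append(ch)
    return "".join(out)
```

Under this reading, an escaped character that did not need escaping (such 
as the "o" in "f\oo.html") decodes to the plain character.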

What we should do is go through the spec, check for occurrences of 
quoted-string, and see what implementations do (and with that I do not 
only mean browsers). Yes, this will be time-consuming.

>>> At the least, then, we should continue to discourage the use of \-escaping for things other than "\" and <">.
>>
>> Recommending not to escape things that do not need escaping is fine. I'm not sure whether this needs to be a SHOULD.
>
> See above; it's not widely supported, and certainly not interoperably. Why wouldn't we steer senders away from it?

I didn't say we shouldn't. I was questioning whether an RFC 2119 SHOULD is 
the right way to do it.
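
For the sender side, a minimal Python sketch of what "escape only what 
needs escaping" could look like. The helper name quote_filename is 
hypothetical, and the sketch assumes the value contains only characters 
that are otherwise allowed in a quoted-string (anything else would need 
RFC 5987 encoding instead).

```python
def quote_filename(value: str) -> str:
    """Wrap value in DQUOTEs, escaping only backslash and DQUOTE."""
    # Escape backslashes first, so that the backslashes added for the
    # DQUOTEs in the second step are not doubled again.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

No other character is ever preceded by a backslash, so recipients that do 
not handle quoted-pair at all only see a difference for values that 
contain "\" or <">.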

>>> If we were to write error-handling advice for it, it seems that we could give *weak* advice to replace with "_" or "-" (based upon <http://greenbytes.de/tech/tc2231/#attwithasciifnescapedchar>). We should probably also consider that question for BIS, but that can wait for now.
>>
>> Disagreed.
>>
>> First of all, this is only error handling if we actually forbid those sequences. We don't do that right now.
>>
>> It would be bad if we ended up with different rules for processing quoted-strings depending on where they occur, just so that some broken implementations can claim that they are not.
>
> It's pretty strong to call an implementation 'broken' because it doesn't implement something that is poorly specified and not useful.
>
> I agree, though, that it would be good to have a single answer for all of bis, rather than special-casing it.
>
>
>>> To me the currently relevant question is whether implementers will eventually support escaping "\" and <">. Right now a few do (see <http://greenbytes.de/tech/tc2231/#attwithasciifnescapedquote>), but many don't. However, since this is a "soft" failure / interop problem (i.e., it affects how a file is named when saved on disk, but doesn't prevent it from being saved or named), I don't see that as a reason to not specify it.
>>
>> Sending a filename with a literal backslash character in it is likely an attempt by the sender to trick the recipient into overwriting files in another directory. The spec already recommends:
>>
>> "When the value contains path separator characters, all but the last segment SHOULD be ignored. This prevents unintentional overwriting of well-known file system location (such as "/etc/passwd")." -- <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-content-disp-04.html#rfc.section.3.3>
>>
>> So it really doesn't matter a lot at what stage the \ disappears.
>
> Your argument assumes that \ is recognised as a path separator; on some platforms, it's not.

I'm not talking about OS behavior but UA behavior. For the purpose of 
Content-Disposition/filename, I would *hope* that UAs treat \ and / the 
same. My tests are <http://greenbytes.de/tech/tc2231/#attabspath> and 
<http://greenbytes.de/tech/tc2231/#attabspathwin> and my results reflect 
the Windows versions of UAs; it would be nice if somebody could check 
whether Safari/Chrome/Opera behave differently on MacOS...
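
What I'm hoping for could be sketched in Python roughly as follows. This 
is an illustration of the "all but the last segment" recommendation under 
the assumption that a UA treats both "/" and "\" as path separators 
regardless of the host OS; the function name last_segment is made up, and 
this is not a description of what any particular UA does.

```python
import re

def last_segment(filename: str) -> str:
    """Return the part of filename after the last "/" or "\\"."""
    # Split on either separator, independent of the platform's own rules.
    return re.split(r"[/\\]", filename)[-1]
```

A value that ends in a separator yields an empty string here, so a real 
UA would still need a fallback name for that case.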

>> Escaped DQUOTEs (<http://greenbytes.de/tech/tc2231/#attwithasciifnescapedquote>) cover a use case that has no other solution (except for 5987-encoding everything). So I believe the right thing to do is to keep this specified, and potentially warn servers about the UAs that can't handle it (similar to the way we already warn about the "%" problem).
>
>
> Agreed.
>
> --
> Mark Nottingham   http://www.mnot.net/

Best regards, Julian
Received on Friday, 4 February 2011 08:17:42 GMT
