Re: \-decoding filename parameters [general issue now #270]

From: Mark Nottingham <mnot@mnot.net>
Date: Fri, 4 Feb 2011 16:01:57 +1100
Cc: Adam Barth <ietf@adambarth.com>, httpbis <ietf-http-wg@w3.org>
Message-Id: <A4AF6137-2C83-400C-9433-A194BD650445@mnot.net>
To: Julian Reschke <julian.reschke@gmx.de>

On 04/02/2011, at 5:52 AM, Julian Reschke wrote:
> 
> That has bugged me for some time, and I wasn't sure whether I was just too pedantic and the answer is obvious. We should treat this as something we need to fix.
> 
> That being said: the spec specifies quoted-pair in a way that allows other characters to be escaped as well. If we don't change that, we need to say what it means to use this.

Agreed. 

I think we need to at least consider changing it, because escaping them doesn't serve any useful purpose, and it's not well-specified or well-implemented. 

I've created <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/270> to track it.

> The grammar in HTTP was inspired by RFC 822 and successors, and RFC 5322 says:
> 
> "Where any quoted-pair appears, it is to be interpreted as the character alone. That is to say, the "\" character that appears as part of a quoted-pair is semantically "invisible"." -- <http://greenbytes.de/tech/webdav/rfc5322.html#rfc.section.3.2.1>
> 
> This happens to match my intuitive understanding of what escape characters are for, so I'd propose that we adopt that.

How widely supported is it in MIME implementations? 

We've already strayed significantly from MIME (e.g., line folding), so I don't think we should be constrained by it (although of course, we shouldn't needlessly stray).
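
For illustration, the RFC 5322 reading quoted above (the "\" of a quoted-pair is semantically invisible) might be sketched as below. This is only a sketch under assumptions: the function name unquote and its error handling are hypothetical, and nothing here is text agreed for the spec.

```python
def unquote(quoted: str) -> str:
    """Decode a quoted-string, treating the backslash of each quoted-pair
    as invisible (the RFC 5322 interpretation). Hypothetical helper."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted-string")
    body = quoted[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # quoted-pair: keep only the character after the backslash
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            out.append(body[i + 1])
            i += 2
        elif ch == '"':
            # Assumption: a bare DQUOTE inside the value is treated as an error
            raise ValueError("unescaped double quote inside quoted-string")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Under this reading, "f\oo.html" and "foo.html" decode to the same value, which is why escaping anything other than "\" and <"> gains the sender nothing.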


>> At the least, then, we should continue to discourage the use of \-escaping for things other than "\" and <">.
> 
> Recommending not to escape things that do not need escaping is fine. I'm not sure whether this needs to be a SHOULD.

See above; it's not widely supported, and certainly not interoperably. Why wouldn't we steer senders away from it?


>> If we were to write error-handling advice for it, it seems that we could give *weak* advice to replace with "_" or "-" (based upon <http://greenbytes.de/tech/tc2231/#attwithasciifnescapedchar>). We should probably also consider that question for BIS, but that can wait for now.
> 
> Disagreed.
> 
> First of all, this is only error handling if we actually forbid those sequences. We don't do that right now.
> 
> It would be bad if we ended up with different rules for processing quoted-string depending on where they occur, just so that some broken implementations can claim that they are not.

It's pretty strong to call an implementation 'broken' because it doesn't implement something that is poorly specified and not useful. 

I agree, though, that it would be good to have a single answer for all of bis, rather than special-casing it.


>> To me the currently relevant question is whether implementers will eventually support escaping "\" and <">. Right now a few do (see <http://greenbytes.de/tech/tc2231/#attwithasciifnescapedquote>), but many don't. However, since this is a "soft" failure / interop problem (i.e., it affects how a file is named when saved on disk, but doesn't prevent it from being saved or named), I don't see that as a reason to not specify it.
> 
> Sending a filename with a literal backslash character in it is likely an attempt by the sender to trick the recipient into overwriting files in another directory. The spec already recommends:
> 
> "When the value contains path separator characters, all but the last segment SHOULD be ignored. This prevents unintentional overwriting of well-known file system location (such as "/etc/passwd")." -- <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-content-disp-04.html#rfc.section.3.3>
> 
> So it really doesn't matter a lot at what stage the \ disappears.

Your argument assumes that \ is recognised as a path separator; on some platforms, it's not. 
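
To make the platform dependence concrete, here is a rough sketch of the "all but the last segment" recommendation with the separator set as a parameter. The function name last_segment and the default of treating both "/" and "\" as separators are assumptions for illustration, not something the draft specifies.

```python
def last_segment(filename: str, separators: str = "/\\") -> str:
    """Return the part of filename after the last path separator.

    Assumption: the caller decides which characters count as separators;
    the default treats both "/" and "\\" as separators.
    """
    # rfind returns -1 when a separator is absent, so cut + 1 is 0 then
    cut = max((filename.rfind(sep) for sep in separators), default=-1)
    return filename[cut + 1:]
```

A recipient that passes separators="/" keeps a literal backslash in the saved name, so the stage at which "\" disappears does matter on such a platform.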


> Escaped DQUOTEs (<http://greenbytes.de/tech/tc2231/#attwithasciifnescapedquote>) cover a use case that has no other solution (except for 5987-encoding everything). So I believe the right thing to do is to keep this specified, and potentially warn servers about the UAs that can't handle it (similar to the way we already warn about the "%" problem).


Agreed.
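
For the sender side, a minimal sketch of producing the parameter while escaping only "\" and <"> could look like this. The function name quote_filename is hypothetical, and restricting the input to printable ASCII is an assumption; anything else would presumably go through RFC 5987 encoding instead.

```python
def quote_filename(name: str) -> str:
    """Serialize name as a filename="..." parameter, escaping only the
    backslash and the double quote. Hypothetical helper."""
    # Assumption: only printable ASCII is handled here
    if any(ord(c) < 0x20 or ord(c) > 0x7E for c in name):
        raise ValueError("non-printable or non-ASCII character in filename")
    # Escape backslashes first so the ones added for quotes are not doubled
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return f'filename="{escaped}"'
```

Recipients that do not undo the escaping will save the file under a name containing stray backslashes, which is the "soft" failure discussed above.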

--
Mark Nottingham   http://www.mnot.net/
Received on Friday, 4 February 2011 05:02:30 GMT
