Re: Working Group Last Call: draft-ietf-httpbis-content-disp-02 from Julian Reschke on 2010-10-03 (ietf-http-wg@w3.org from October to December 2010)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 03 Oct 2010 16:01:48 +0200
To: Adam Barth <w3c@adambarth.com>
CC: Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4CA88CCC.3010107@gmx.de>

On 03.10.2010 02:23, Adam Barth wrote:
> If you'd like some (BSD licensed) examples that a recent browser
> implementation of the Content-Disposition header thinks are important,
> you can look at the test cases here (scroll down to
> "content-disposition"):
>
> http://src.chromium.org/viewvc/chrome/trunk/src/net/base/net_util_unittest.cc?revision=60555&view=markup
>
> Here are some examples of things that look to be important but don't
> seem to be covered in the document:
>
> 1) It looks like the disposition-type is actually optional, even
> though it's required by the grammar.  It seems entirely possible that
> the optimal generative grammar is to require the disposition-type
> whereas the optimal consumption grammar is to default to some
> particular disposition-type if none is provided.

<http://greenbytes.de/tech/tc2231/#attmissingdisposition>

My tests show that the observable behavior is that the header is ignored 
(which is good).

> 2) It looks like the user agent is supposed to URL-decode the filename:
>
> Content-Disposition: inline; filename="abc%20de.pdf"
> =>  abc de.pdf

Well, it's not supposed to from a standards point of view, nor from an 
interop point of view.

> Appendix C.4 seems to indicate that this is implemented by IE and
> Chrome.  From the comments in the file referenced above, it seems this
> is important for the Asian market.  Both according to Appendix C.2 and
> the comments in the file referenced above, it looks like the encoding
> for the unescaped octets is supposed to be based on the current code
> page, which admittedly sucks.

It sucks totally, because it cannot be used in practice for an 
international audience.

The support in Chrome (which I complained about the day it came out, see 
<http://code.google.com/p/chromium/issues/detail?id=118>) might be 
well-intentioned, but will not always work as intended: long before 
Chrome and Safari were available, server implementers had to choose 
between the IE "encoding" and the RFC2231-encoding (supported by Firefox 
and Opera), and at least in some cases, the IE encoding would only be 
sent based on the User-Agent string (believe me, I had to write one of 
these). Remember -- until two years ago, only IE supported this broken 
encoding; thus Chrome made the situation worse by copying IE's broken 
behavior, instead of Firefox's and Opera's.

> Again, the optimal generative grammar is probably to avoid this mess.
> However, it seems entirely likely the optimal consumption grammar is
> to do something with percent-encoded inputs, especially if you care
> about pleasing customers in the Asian market.

If you really care about pleasing Asian customers, please convince more 
UA implementers to support the standardized encoding, which does not 
depend on the client's locale, and also does not conflict with other 
legitimate uses of the filename parameter.
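
To illustrate the difference, here is a minimal Python sketch of the 
locale-independent form (the RFC 2231 style extended parameter, as 
profiled for HTTP in RFC 5987). The function names are hypothetical, and 
the sketch assumes UTF-8 with an empty language tag; it is not taken from 
any browser or server implementation.

```python
from urllib.parse import quote, unquote

# Characters that RFC 5987 "attr-char" allows unescaped, besides ALPHA and DIGIT.
ATTR_CHAR_SAFE = "!#$&+-.^_`|~"


def encode_ext_value(filename: str) -> str:
    """Build the value of a filename* parameter: charset'language'pct-encoded."""
    # Assumption: always UTF-8, no language tag.
    return "UTF-8''" + quote(filename, safe=ATTR_CHAR_SAFE, encoding="utf-8")


def decode_ext_value(value: str) -> str:
    """Decode charset'language'pct-encoded using the charset named in the value."""
    charset, _language, encoded = value.split("'", 2)
    # The charset travels with the value, so the client's locale plays no role.
    return unquote(encoded, encoding=charset, errors="strict")


header = "Content-Disposition: attachment; filename*=" + encode_ext_value("€ rates.pdf")
print(header)
```

Because the charset is part of the value, a plain filename="abc%20de.pdf" 
can keep meaning the literal nine-character name "abc%20de.pdf".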

> 3) Apparently whitespace in the filename is supposed to be converted to spaces:
>
> Content-Disposition: inline; filename="abc<TAB><NEWLINE>de.pdf"
> =>  abc    de.pdf

I would consider this part of post-processing a value for use as a 
filename, not part of the actual parsing. There are more things to worry 
about (stripping path separators, making things relative, avoiding 
trailing and leading whitespace, assigning a sane extension, and so on).

This is covered at the end of Section 3.3 
(<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-content-disp-02.html#rfc.section.3.3>), 
but I would be surprised if there wasn't more that should be mentioned.
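
As a rough sketch of what such post-processing could look like, kept 
separate from header parsing: the function name, the fallback name 
"download", and the choice to map each whitespace character to one space 
are assumptions for illustration, not something the draft prescribes. 
Assigning a sane extension is left out, since it needs the media type.

```python
import re


def sanitize_filename(raw: str, default: str = "download") -> str:
    """Turn an already-parsed filename parameter into a safe local file name."""
    # Keep only the last path component; treat both "/" and "\" as separators.
    name = re.split(r"[\\/]", raw)[-1]
    # Map every whitespace character (TAB, CR, LF, ...) to a single space.
    name = re.sub(r"\s", " ", name)
    # Avoid leading and trailing whitespace.
    name = name.strip()
    # Refuse empty names and names that refer to a directory.
    if name in ("", ".", ".."):
        return default
    return name


print(sanitize_filename("abc\t\nde.pdf"))
print(sanitize_filename("../../etc/passwd"))
```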

Best regards, Julian

Received on Sunday, 3 October 2010 14:02:24 UTC