Re: Content-Disposition next steps from Mark Nottingham on 2010-12-02 (ietf-http-wg@w3.org from October to December 2010)

From: Mark Nottingham <mnot@mnot.net>
Date: Thu, 2 Dec 2010 11:54:40 +1100
To: Adam Barth <ietf@adambarth.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <09DCD570-9B34-4C22-A9A9-3FF8874F0434@mnot.net>

One personal comment -- I know we talked about making it as declarative as possible, but looking at this, I wonder if having a separate set of optional BNF really helps. Thoughts?


On 02/12/2010, at 11:51 AM, Adam Barth wrote:

> On Wed, Dec 1, 2010 at 4:41 PM, Mark Nottingham <mnot@mnot.net> wrote:
>> * We need to assure that it doesn't conflict with the rest of the C-D spec, adjusting either it or the spec as necessary, and documenting where we don't have interop. Based on the discussion so far with Julian and Bjoern, it seems that's under way.
> 
> Yep.
> 
>> * We need to get other UA vendors on board; just having it reflect Chrome's behaviour isn't productive. Many are on-list, but I'll ping those I know to make sure they're aware. Please talk to those you know and make sure they know it's important that we have their input / buy-in here.
> 
> That's wise.
> 
>> * One way or another, I'd like to get the C-D draft submitted for IETF LC by the holidays. If we can get this appendix hammered out by then, we can include it; if not, we can work on it as a separate document.
> 
> There don't seem to be that many technical differences.  Hopefully
> we'll be able to make that deadline.
> 
>> To help move us along, it might be good to get the draft text somewhere where it can be collaboratively edited and viewed. How about on the WG Wiki?
> 
> Done:
> http://trac.tools.ietf.org/wg/httpbis/trac/wiki/ContentDispositionErrorHandling
> 
> Adam
> 
> 
>> On 02/12/2010, at 6:50 AM, Adam Barth wrote:
>> 
>>> On Wed, Dec 1, 2010 at 3:12 AM, Mark Nottingham <mnot@mnot.net> wrote:
>>>> Adam, do you have a proposal?
>>> 
>>> Yeah.  Please find my proposal below.  It's certainly not beautiful,
>>> and it likely needs more polish, but it should be a starting point.
>>> 
>>> I tried to be as "grammatical" as I could, but couldn't quite figure
>>> out how to avoid all the algorithmic aspects.  The proposal is based on
>>> what Chrome does, but cleaned up slightly.  There's some sadness I
>>> couldn't quite figure out how to avoid, but I'm certainly open to
>>> talking about it more.
>>> 
>>> The rules for determining the disposition-type are particularly goofy.
>>> I wanted to do more homework to figure out whether we can make those more
>>> aesthetic, but I ran out of time.
>>> 
>>> One of the ground rules was that my proposal should only differ from
>>> the current draft in error-handling cases.  I believe that's the case,
>>> but I'm not 100% sure.  Please let me know if I've screwed that up.
>>> 
>>> Adam
>>> 
>>> 
>>> == Extracting Parameter Values From Header Fields ==
>>> 
>>> To extract the value for a given parameter-name from an unparsed-string, parse
>>> the unparsed-string using the following grammar:
>>> 
>>>  unparsed-string = *CHAR name *LWS "=" value [ ";" *CHAR ]
>>>  value           = <CHAR, except ";">
>>> 
>>> where the name production is a grammatical production that is a case-insensitive
>>> match for the given parameter-name.  If the unparsed-string can be parsed by
>>> the grammar in multiple ways, choose the one in which name appears as close to
>>> the beginning of the string as possible.  If the unparsed-string cannot be
>>> parsed by the grammar above, return the empty string.
>>> 
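To make the extraction rule above concrete, here is a minimal sketch in Python. The function name extract_param is hypothetical, and the sketch assumes that "value" is meant to be a run of characters other than ";" (the grammar as written reads like a single character), that LWS can be approximated by spaces, tabs, CR and LF, and that the value is returned without trimming.

```python
import re


def extract_param(unparsed: str, name: str) -> str:
    """Return the raw value for `name`, or "" if the grammar does not match."""
    # name *LWS "=" value, where value runs up to (not including) the next ";".
    # Assumption: LWS is approximated by space, tab, CR and LF.
    pattern = re.compile(re.escape(name) + r"[ \t\r\n]*=([^;]*)", re.IGNORECASE)
    # re.search reports the leftmost match, i.e. the parse in which name
    # appears as close to the beginning of the string as possible.
    match = pattern.search(unparsed)
    return match.group(1) if match else ""
```

Because the grammar allows arbitrary characters before name, asking for "name" also matches the tail of "filename"; the sketch keeps that behaviour rather than second-guessing the proposal.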
>>> 
>>> == Decoding the File Name ==
>>> 
>>> To filename-decode an encoded-string, parse the encoded-string using the
>>> following grammar:
>>> 
>>>  encoded-string = word *( 1*delimiter word )
>>>  delimiter      = LWS
>>>  word           = <CHAR, except delimiter>
>>> 
>>> Consider each grammatical element (either a delimiter or a word) in the order
>>> they appear in the encoded-string:
>>> 
>>>  1) If the grammatical element is a delimiter, process the element as follows:
>>> 
>>>       a) If the previous grammatical element was an RFC2047-value, ignore this
>>>          grammatical element.
>>> 
>>>       b) Otherwise, emit a SP character.
>>> 
>>>  2) If the grammatical element is a word, process the element as follows:
>>> 
>>>       a) If the word contains non-ASCII characters, process the element as
>>>          follows:
>>> 
>>>            i)  If the word is a well-formed UTF-8 string, emit the word
>>>                (decoded as UTF-8) and proceed to the next grammatical element.
>>> 
>>>            ii) Otherwise, *sadness*.  Apparently what we're supposed to do
>>>                here is to use the "referrer" charset, if we have one.
>>>                Otherwise, we fall back to the OS codepage.
>>> 
>>>       b) If the word is an RFC2047-value, emit the RFC2047 decoding of the
>>>          word and proceed to the next grammatical element.
>>> 
>>>       c) Let the url-unescaped-word be the word %-unescaped.
>>> 
>>>       d) Emit the url-unescaped-word (decoded as UTF-8) and proceed to the
>>>          next grammatical element.  (There's actually more sadness here if
>>>          the url-unescaped-word isn't valid UTF-8.)
>>> 
>>> The emitted characters are the decoded file name.
>>> 
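The filename-decode steps above can be sketched roughly as follows in Python. Everything named here is hypothetical. The sketch makes several assumptions the proposal leaves open: a run of whitespace counts as one delimiter, the "referrer charset or OS codepage" fallback is stood in for by a caller-supplied fallback_charset (defaulting to windows-1252 purely for illustration), and bytes that are not valid UTF-8 after %-unescaping are replaced with U+FFFD.

```python
import base64
import binascii
import quopri
import re
from urllib.parse import unquote_to_bytes

# RFC 2047 encoded-word: =?charset?B|Q?payload?=
_ENCODED_WORD = re.compile(rb"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def _rfc2047_decode(word: bytes):
    """Return the decoded text if `word` is an RFC 2047 encoded-word, else None."""
    m = _ENCODED_WORD.fullmatch(word)
    if m is None:
        return None
    charset, encoding, payload = m.groups()
    try:
        if encoding.upper() == b"B":
            raw = base64.b64decode(payload, validate=True)
        else:
            raw = quopri.decodestring(payload, header=True)
        return raw.decode(charset.decode("ascii"))
    except (binascii.Error, ValueError, LookupError):
        return None


def filename_decode(encoded: bytes, fallback_charset: str = "windows-1252") -> str:
    out = []
    prev_was_rfc2047 = False
    # Split into alternating delimiters (whitespace runs) and words.
    for element in re.findall(rb"[ \t]+|[^ \t]+", encoded):
        if element[:1] in (b" ", b"\t"):
            # Step 1: drop the delimiter after an RFC2047-value, else emit SP.
            if not prev_was_rfc2047:
                out.append(" ")
            continue
        prev_was_rfc2047 = False
        if any(byte > 0x7F for byte in element):
            # Step 2a: non-ASCII word; try UTF-8, then the assumed fallback.
            try:
                out.append(element.decode("utf-8"))
            except UnicodeDecodeError:
                out.append(element.decode(fallback_charset, errors="replace"))
            continue
        decoded = _rfc2047_decode(element)
        if decoded is not None:
            # Step 2b: RFC2047-value.
            out.append(decoded)
            prev_was_rfc2047 = True
            continue
        # Steps 2c and 2d: %-unescape, then decode as UTF-8.
        out.append(unquote_to_bytes(element).decode("utf-8", errors="replace"))
    return "".join(out)
```

For example, filename_decode(b"a%20b.txt") yields "a b.txt", and a delimiter that follows an encoded-word is dropped rather than turned into a space.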
>>> 
>>> == Determining the File Name ==
>>> 
>>> To determine the file name indicated by a Content-Disposition header field, use
>>> the following algorithm:
>>> 
>>>  1) Let filename-star be the value extracted from the Content-Disposition
>>>     header field for the "filename*" parameter.
>>> 
>>>  2) If filename-star parses as an RFC5987-value, return the RFC5987-value of
>>>     filename-star and abort these steps.
>>> 
>>>  3) Let filename be the value extracted from the Content-Disposition header
>>>     field for the "filename" parameter.
>>> 
>>>  4) If filename is empty, instead let filename be the value extracted from the
>>>     Content-Disposition header field for the "name" parameter.
>>> 
>>>  5) If filename is empty, return the empty string and abort these steps.
>>> 
>>>  6) Return the filename-decoding of filename.
>>> 
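As a rough illustration of the six steps above, here is a self-contained Python sketch. All function names are hypothetical. It assumes a simplified reading of the RFC 5987 ext-value syntax (charset, optional language, then %-encoded value characters), and it takes the filename-decoding routine as a parameter that defaults to the identity function, only as a placeholder for the decoding described in the previous section.

```python
import re
from urllib.parse import unquote_to_bytes

# Simplified RFC 5987 value-chars: pct-encoded octets or attr-char.
_VALUE_CHARS = re.compile(r"(?:%[0-9A-Fa-f]{2}|[A-Za-z0-9!#$&+.^_`|~-])*")


def extract_param(unparsed, name):
    """Leftmost 'name *LWS = value' match; "" if there is none."""
    m = re.search(re.escape(name) + r"[ \t\r\n]*=([^;]*)", unparsed, re.IGNORECASE)
    return m.group(1) if m else ""


def parse_rfc5987(value):
    """Return the decoded ext-value, or None if `value` does not parse."""
    parts = value.split("'", 2)  # charset ' [language] ' value-chars
    if len(parts) != 3 or not parts[0] or not _VALUE_CHARS.fullmatch(parts[2]):
        return None
    charset, _language, encoded = parts
    try:
        return unquote_to_bytes(encoded).decode(charset)
    except (LookupError, ValueError):
        return None  # unknown charset or undecodable octets


def determine_filename(header, filename_decode=lambda s: s):
    # Steps 1 and 2: prefer a well-formed filename* parameter.
    filename_star = extract_param(header, "filename*")
    decoded = parse_rfc5987(filename_star)
    if decoded is not None:
        return decoded
    # Steps 3 and 4: fall back to filename, then to name.
    filename = extract_param(header, "filename")
    if filename == "":
        filename = extract_param(header, "name")
    # Step 5: nothing usable.
    if filename == "":
        return ""
    # Step 6: placeholder hook for the filename-decoding algorithm.
    return filename_decode(filename)
```

A header carrying both parameters, such as "attachment; filename*=UTF-8''na%C3%AFve.txt; filename=naive.txt", resolves to the filename* value; a malformed filename* falls through to filename.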
>>> 
>>> == Determining the Disposition ==
>>> 
>>> To determine the disposition-type, parse the Content-Disposition header field
>>> using the following grammar:
>>> 
>>>  unparsed-string = *LWS nominal-type *CHAR
>>>  nominal-type    = "inline" / "filename" / "name" / ";"
>>> 
>>> If the Content-Disposition header field fails to parse, then the
>>> disposition type is "attachment".  Otherwise, the disposition-type is "inline".
>>> 
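The disposition rule above is small enough to sketch directly. This is a hedged illustration in Python; the function name is hypothetical, the literals are assumed to match case-insensitively (as ABNF literals normally do), and LWS is approximated by spaces, tabs, CR and LF.

```python
import re

# *LWS nominal-type, where nominal-type = "inline" / "filename" / "name" / ";"
_NOMINAL_TYPE = re.compile(r"[ \t\r\n]*(?:inline|filename|name|;)", re.IGNORECASE)


def determine_disposition(header: str) -> str:
    """Return "inline" if the header matches the grammar, else "attachment"."""
    # re.match anchors at the start; the trailing *CHAR accepts anything,
    # so only the prefix needs checking.
    return "inline" if _NOMINAL_TYPE.match(header) else "attachment"
```

Under this reading, a header that starts with a parameter (for example "filename=x.txt") or a bare ";" is treated as inline, and anything else, including an empty header or an unknown type, is treated as attachment.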
>>> 
>>> == Processing the Content-Disposition Header Field ==
>>> 
>>> To process the Content-Disposition header field, use the following algorithm:
>>> 
>>>  1) Determine the disposition-type.
>>> 
>>>  2) If the disposition-type is "inline", then ...
>>> 
>>>  3) If the disposition-type is "attachment", then let filename be the file
>>>     name indicated by the header field.  ...
>> 
>> --
>> Mark Nottingham   http://www.mnot.net/
>> 
>> 
>> 
>> 

--
Mark Nottingham   http://www.mnot.net/
Received on Thursday, 2 December 2010 00:55:13 UTC