Re: TICKET 259: 'treat as invalid' not defined

From: Adam Barth <ietf@adambarth.com>
Date: Sat, 11 Dec 2010 15:03:59 -0800
Message-ID: <AANLkTinYTiM-Y7k1NCrtiLF53UvXWa0__4ma75F4E-RN@mail.gmail.com>
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Mark Nottingham <mnot@mnot.net>, httpbis <ietf-http-wg@w3.org>
Your web site seems to be down, so I'm working from memory:

http://greenbytes.de/tech/tc2231/

HTTP/1.1 200 OK
Date: Sat, 11 Dec 2010 22:54:51 GMT
Server: Apache/2.2.14 (Ubuntu)
Last-Modified: Thu, 09 Dec 2010 13:51:48 GMT
ETag: "11002c-0-496fa89fea500"
Accept-Ranges: bytes
Content-Length: 0
Content-Type: text/html

On Sat, Dec 11, 2010 at 2:46 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 11.12.2010 20:42, Adam Barth wrote:
>>> D.2.  Parsing for Disposition Type and Parameters
>>>
>>>   Using the simplified grammar below:
>>>
>>>     field-value = disp-type *( ";" param )
>>>     disp-type   = token
>>>     param       = token "=" value
>>>
>>>   ...parse the field value into a disp-type (disposition type) and a
>>>   sequence of parameters (pairs of name (token) and value).  Lower-case
>>>   all disposition types and parameter names.
>>>
>>>   If the field value does not conform to the grammar (such as when not
>>>   exactly one disposition type is specified), ignore the whole header
>>>   field.
>>
>> This doesn't cover cases like the following:
>>
>> Content-Disposition: attachment; inline; filename=foo.exe
>
> Yes, this proposal is strictly about parsing valid headers for now.

Ok.  We're explicitly interested in handling invalid headers.

>> We want to treat those as an attachment.  Another grammar we could use
>> might be the following:
>>
>>      field-value = item *( ";" item )
>>      item        = disp-type / param
>>      disp-type   = <OCTET, except ";" and "=">
>>      param       = param-name "=" param-value
>>      param-name  = <OCTET, except "=">
>>      param-value = <OCTET, except ";">
>>
>> We could then say that the first disp-type and the first param are the
>> ones that matter.  (I'm not sure this grammar handles <"> correctly,
>> but I'm sure we can sort that out.)
>
> If you did that, you'd be inconsistent with IE8:
> <http://localhost:8080/tc2231/#attandinline>.

Indeed.  Agreement between all the browsers isn't required to make progress.
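
A rough sketch of how the looser grammar quoted above could be applied, in
Python.  The function name, the whitespace trimming, and the lower-casing of
names are assumptions for illustration, not anything agreed in this thread,
and (as noted above) double quotes are not handled at all:

```python
def parse_lenient(field_value):
    """Split on ';'. An item containing '=' is a param, otherwise a disp-type.

    Returns (disp_type or None, list of (name, value) pairs in order).
    Hypothetical helper; quoted strings are NOT handled.
    """
    disp_type = None
    params = []
    for item in field_value.split(";"):
        item = item.strip()  # assumption: surrounding whitespace is ignored
        if not item:
            continue
        if "=" in item:
            # param-value may itself contain '=', so split on the first one.
            name, _, value = item.partition("=")
            params.append((name.strip().lower(), value.strip()))
        elif disp_type is None:
            # Only the first disp-type matters; later ones are dropped.
            disp_type = item.lower()
    return disp_type, params
```

With this sketch, "attachment; inline; filename=foo.exe" comes out as an
attachment with a single filename parameter.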

>>> D.3.  Checking Cardinality Constraints
>>>
>>>   If the parameter sequence contains multiple instances of the same
>>>   parameter name, ignore the whole header field.
>>
>> We'd prefer to use the first one rather than ignore the header field.
>
> <http://localhost:8080/tc2231/#attwith2filenames>
>
> Most UAs do indeed pick the first one, but it would be useful to understand
> whether this is purely academic or not. Can you provide any evidence of
> this happening in practice?

I don't have any data to present at this time.  However, we still want
to define how to handle these cases.  If it turns out not to affect
any web sites, that's fine.
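
To make the two options concrete, here is a small Python sketch of both
policies applied to an already-parsed list of (name, value) pairs.  Both
function names are made up for this example:

```python
def strict_params(pairs):
    """D.3 as drafted: a repeated parameter name invalidates the header.

    Returns None to signal "ignore the whole header field".
    """
    names = [name for name, _ in pairs]
    if len(names) != len(set(names)):
        return None
    return dict(pairs)


def first_wins_params(pairs):
    """Alternative: keep the first value seen for each parameter name."""
    result = {}
    for name, value in pairs:
        result.setdefault(name, value)  # later duplicates are ignored
    return result
```

For two filename parameters, the first function drops the header and the
second keeps the first filename.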

>>> D.4.  Post-Process Parameter Values
>>>
>>>   For each parameter, post-process the associated value part according
>>>   to the grammar:
>>>
>>>   o  According to Section 3.2.1 of [RFC5987] for parameters using the
>>>      RFC 5987 syntax (such as "filename*").  If this fails, just ignore
>>>      this parameter.
>>>
>>>   o  According to the grammar for quoted-string (Section 2.2 of
>>>      [RFC2616]) for values starting with a double quote character (").
>>
>> Does this imply \-decoding?  We don't want to do \-decoding.
>
> Yes, that's implied by quoted-string.

Ok, then that's not acceptable.  We don't want to do \-decoding.
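
The difference is easy to see in a short Python sketch.  It assumes the
value has already been isolated and still carries its surrounding double
quotes; how to find the closing quote without \-decoding is left open here.
Both function names are hypothetical:

```python
def unquote_rfc2616(value):
    """quoted-string per RFC 2616: a backslash escapes the next character."""
    inner = value[1:-1]  # assumption: value starts and ends with '"'
    out = []
    i = 0
    while i < len(inner):
        if inner[i] == "\\" and i + 1 < len(inner):
            i += 1  # skip the backslash, keep the character after it
        out.append(inner[i])
        i += 1
    return "".join(out)


def unquote_no_backslash(value):
    """No \\-decoding: strip the quotes and keep backslashes verbatim."""
    return value[1:-1]
```

A value such as "foo\bar.exe" loses its backslash under the first function
and keeps it under the second.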

>>>   o  Verbatim otherwise.
>>
>> We'd like to do %-decoding both for the quoted and unquoted cases.
>
> I realize that (we have a separate issue for that, I believe).
>
>>>   Note that this step starts with an octet sequence obtained from the
>>>   HTTP message, and results in a sequence of Unicode characters.
>>
>> Somewhere we want to say what character set we're using.
>
> Indeed. Will fix.
>
>>> D.5.  Extracting the Disposition Type
>>>
>>>   The parsing step (Appendix D.2) has returned the disposition type (to
>>>   be matched case-insensitively), which can be "attachment", "inline",
>>>   or an extension type.  If the type is unknown, treat it like
>>>   "attachment" (see Section 3.2).
>>
>> What if there's no disposition type?
>>
>> Content-Disposition: filename=foo.exe
>> Content-Disposition: foo=bar
>>
>> If I remember correctly, we're supposed to treat the former as inline
>> and the latter as attachment.
>
> Dunno what you mean by "we're supposed to".
>
> It SHOULD be handled like
>
> Content-Disposition: filename=foo.exe, foo=bar
>
> which is invalid. This needs a test case.

I agree that it's invalid.  However, we want to define how to handle
invalid headers.

Adam
Received on Saturday, 11 December 2010 23:05:05 GMT
