Re: TICKET 259: 'treat as invalid' not defined

On 11.12.2010 20:42, Adam Barth wrote:
> Thanks for making a counter-proposal.  A few notes below.

(where the initial version has been around for over a month)

>> D.2.  Parsing for Disposition Type and Parameters
>>
>>    Using the simplified grammar below:
>>
>>      field-value = disp-type *( ";" param )
>>      disp-type   = token
>>      param       = token "=" value
>>
>>    ...parse the field value into a disp-type (disposition type) and a
>>    sequence of parameters (pairs of name (token) and value).  Lower-case
>>    all disposition types and parameter names.
>>
>>    If the field value does not conform to the grammar (such as when not
>>    exactly one disposition type is specified), ignore the whole header
>>    field.
>
> This doesn't cover cases like the following:
>
> Content-Disposition: attachment; inline; filename=foo.exe

Yes, this proposal is strictly about parsing valid headers for now.
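
To make the strict reading of D.2 concrete, here is a minimal Python sketch. It assumes that "value" in the simplified grammar means token or quoted-string as defined in RFC 2616, Section 2.2; the function name parse_strict and the choice to return None for "ignore the whole header field" are made up for illustration and are not part of the proposal.

```python
import re

# token per RFC 2616, Section 2.2: any CHAR except CTLs and separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# quoted-string: DQUOTE, then qdtext or quoted-pair, then DQUOTE.
QUOTED = r'"(?:[^"\\]|\\.)*"'
DISP_TYPE = re.compile(r"\s*(%s)" % TOKEN)
PARAM = re.compile(r"\s*;\s*(%s)\s*=\s*(%s|%s)" % (TOKEN, TOKEN, QUOTED))


def parse_strict(field_value):
    """Return (disp_type, [(name, raw_value), ...]), or None if the
    field value does not conform to the simplified grammar."""
    m = DISP_TYPE.match(field_value)
    if m is None:
        return None
    disp_type = m.group(1).lower()
    pos = m.end()
    params = []
    while True:
        m = PARAM.match(field_value, pos)
        if m is None:
            break
        # Parameter names are lower-cased; values are kept raw for D.4.
        params.append((m.group(1).lower(), m.group(2)))
        pos = m.end()
    if field_value[pos:].strip():
        return None  # leftover input: non-conforming, ignore the header
    return disp_type, params
```

With this sketch, "attachment; inline; filename=foo.exe" and "filename=foo.exe" both come back as None, which is the "ignore the whole header field" outcome.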

> We want to treat those as an attachment.  Another grammar we could use
> might be the following:
>
>       field-value = item *( ";" item )
>       item        = disp-type / param
>       disp-type   = <OCTET, except ";" and "=">
>       param       = param-name "=" param-value
>       param-name  = <OCTET, except "=">
>       param-value = <OCTET, except ";">
>
> We could then say that first disp-type and the first param are the
> ones that matter.  (I'm not sure this grammar handles <"> correctly,
> but I'm sure we can sort that out.)

If you did that, you'd be inconsistent with IE8: 
<http://localhost:8080/tc2231/#attandinline>.
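
For comparison, a rough Python sketch of the looser grammar suggested above. It reads "the first param" as the first instance of each parameter name, which is an assumption, and it splits naively on ";" so it does not handle <"> (the open point already noted). The name parse_lenient is hypothetical.

```python
def parse_lenient(field_value):
    """Split on ";"; items containing "=" are params, others are
    disposition types. The first disp-type and the first value seen
    for each parameter name win."""
    disp_type = None
    params = {}
    for item in field_value.split(";"):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            name, _, value = item.partition("=")
            # setdefault keeps the first value for a repeated name.
            params.setdefault(name.strip().lower(), value.strip())
        elif disp_type is None:
            disp_type = item.lower()
    return disp_type, params
```

Under this reading, "attachment; inline; filename=foo.exe" yields the type "attachment" with filename "foo.exe", and a header without any disposition type yields None for the type.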

>> D.3.  Checking Cardinality Constraints
>>
>>    If the parameter sequence contains multiple instances of the same
>>    parameter name, ignore the whole header field.
>
> We'd prefer to use the first one rather than ignore the header field.

<http://localhost:8080/tc2231/#attwith2filenames>

Most UAs do indeed pick the first one, but it would be useful to 
understand whether this is purely academic or not. Can you provide any 
evidence of this happening in practice?
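
The two behaviours under discussion differ only in what happens on a repeated name. A small Python sketch, assuming the parser hands over a list of (name, value) pairs with names already lower-cased; the function name and the on_duplicate switch are invented for illustration:

```python
def check_cardinality(params, on_duplicate="ignore"):
    """params: list of (name, value) pairs in header order.

    on_duplicate="ignore": any repeated name invalidates the whole
                           header field (returns None), as in D.3.
    on_duplicate="first":  keep the first instance of each name.
    """
    seen = {}
    for name, value in params:
        if name in seen:
            if on_duplicate == "ignore":
                return None
            continue  # "first" policy: drop later duplicates
        seen[name] = value
    return seen
```

For two filename parameters, the "ignore" policy returns None while the "first" policy returns only the first filename.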

>> D.4.  Post-Process Parameter Values
>>
>>    For each parameter, post-process the associated value part according
>>    to the grammar:
>>
>>    o  According to Section 3.2.1 of [RFC5987] for parameters using the
>>       RFC 5987 syntax (such as "filename*").  If this fails, just ignore
>>       this parameter.
>>
>>    o  According to the grammar for quoted-string (Section 2.2 of
>>       [RFC2616]) for values starting with a double quote character (").
>
> Does this imply \-decoding?  We don't want to do \-decoding.

Yes, that's implied by quoted-string.
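
Concretely, quoted-string processing means stripping the surrounding double quotes and replacing each quoted-pair (backslash plus one character) with the character itself. A minimal Python sketch, assuming the value has already been validated against the quoted-string grammar; the function name is hypothetical:

```python
import re


def unquote_quoted_string(value):
    """Post-process a parameter value per RFC 2616 quoted-string.

    Values that do not start and end with a double quote are
    returned verbatim.
    """
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        # quoted-pair: a backslash followed by any single character
        # stands for that character.
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    return value
```

Skipping the re.sub step would give the "no \-decoding" behaviour asked for above, so the difference between the two positions is that one line.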

>>    o  Verbatim otherwise.
>
> We'd like to do %-decoding both for the quoted and unquoted cases.

I realize that (we have a separate issue for that, I believe).

>>    Note that this step starts with an octet sequence obtained from the
>>    HTTP message, and results in a sequence of Unicode characters.
>
> Somewhere we want to say what character set we're using.

Indeed. Will fix.
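
For the RFC 5987 case the character set travels with the value itself (charset, optional language, then percent-encoded octets, separated by single quotes). A hedged Python sketch of that one branch; accepting only UTF-8 and ISO-8859-1 and returning None for "ignore this parameter" are assumptions of the sketch, and it says nothing about which character set applies to quoted-string or token values:

```python
from urllib.parse import unquote


def decode_ext_value(raw):
    """Decode an RFC 5987 ext-value such as UTF-8''%e2%82%ac%20rates.

    Returns the decoded string, or None if decoding fails (the
    parameter is then ignored).
    """
    parts = raw.split("'", 2)
    if len(parts) != 3:
        return None
    charset, _language, encoded = parts
    # Assumption: only these two charsets are accepted here.
    if charset.lower() not in ("utf-8", "iso-8859-1"):
        return None
    try:
        return unquote(encoded, encoding=charset, errors="strict")
    except UnicodeDecodeError:
        return None
```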

>> D.5.  Extracting the Disposition Type
>>
>>    The parsing step (Appendix D.2) has returned the disposition type (to
>>    be matched case-insensitively), which can be "attachment", "inline",
>>    or an extension type.  If the type is unknown, treat it like
>>    "attachment" (see Section 3.2).
>
> What if there's no disposition type?
>
> Content-Disposition: filename=foo.exe
> Content-Disposition: foo=bar
>
> If I remember correctly, we're supposed to treat the former as inline
> and the latter as attachment.

Dunno what you mean by "we're supposed to".

It SHOULD be handled like

Content-Disposition: filename=foo.exe, foo=bar

which is invalid. This needs a test case.
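
For completeness, the D.5 step itself is small once parsing has produced a disposition type. A sketch in Python, with a hypothetical function name; it only covers the case where a type was actually parsed, not the missing-type headers above:

```python
def effective_disposition(disp_type):
    """Map a parsed disposition type to the handling described in D.5:
    matching is case-insensitive, and unknown (extension) types are
    treated like "attachment"."""
    disp_type = disp_type.lower()
    if disp_type in ("attachment", "inline"):
        return disp_type
    return "attachment"
```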

> ...

Best regards, Julian

Received on Saturday, 11 December 2010 22:47:42 UTC