Re: Content-Disposition next steps

Thanks for your feedback, Bjoern.  Responses inline.  (Note I've
trimmed a bunch of non-technical content to try to focus on the
technical issues.)

On Wed, Dec 1, 2010 at 2:32 PM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> * Adam Barth wrote:
>>== Determining the Disposition ==
>>To determine the disposition-type, parse the Content-Disposition
>>header field using
>>the following grammar:
>>
>>  unparsed-string  = *LWS nominal-type *CHAR
>>  nominal-type = "inline" / "filename" / "name" / ";"
>>
>>If the Content-Disposition header field parser fails to parse, then the
>>disposition type is "attachment".  Otherwise, the disposition-type is "inline".
>
> It is incorrect to specify the *LWS here as the draft uses RFC 2616
> implied LWS rules.

Forgive me.  It's difficult for me to keep track of which documents
use implied LWS.  Please read this text as if it were in a grammar
without implied LWS.

> This also does not do what you intend as *CHAR
> does not match what you think it matches.

Thanks.  I've replaced all instances of CHAR with OCTET.

> It also mishandles a common
> case, the empty string as value, which fails to parse but is handled
> pretty much universally as "inline", contrary to your proposal.

Fixed.

> Not to mention that it's silly to treat `x=y; filename=example.txt` as if
> it had an unrecognized disposition type and should thus be handled as
> "attachment", which, say, Internet Explorer and Opera don't do, when
> you treat plain `filename=example.txt` as having no disposition type.

Perhaps Julian would be willing to add this case to his test suite?
Silliness isn't one of the criteria I've applied.
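
To make the cases above easier to compare, here is a rough Python
sketch of the disposition rule as quoted, with OCTET in place of CHAR.
It is an illustration only, not text from the draft.  The function name
is hypothetical, LWS is simplified to spaces and tabs (no line folding),
and treating the empty value as "inline" is an assumption about what
the fix looks like.

```python
import re

# Assumption: LWS is reduced to SP / HT; RFC 2616 line folding is ignored.
# Literals are matched case-insensitively, as in RFC 2616 grammars.
_NOMINAL_TYPE = re.compile(rb"[ \t]*(?:inline|filename|name|;)", re.IGNORECASE)


def disposition_type(header_value: bytes) -> str:
    """Return "inline" or "attachment" for a raw Content-Disposition value."""
    # Assumption: the empty value is special-cased as "inline".
    if header_value == b"":
        return "inline"
    # The trailing *OCTET accepts anything, so only the prefix matters.
    if _NOMINAL_TYPE.match(header_value):
        return "inline"
    # Anything the grammar cannot parse is treated as an attachment.
    return "attachment"
```

Under these assumptions, `filename=example.txt` comes out as "inline"
while `x=y; filename=example.txt` comes out as "attachment", which is
the asymmetry Bjoern describes.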

> I can't really make heads or tails of the rest of your proposal, for
> instance, if you go by the processing rules already in the draft, you
> would not need to discuss quote marks, but you seem to have your own
> rules for processing parameters and parameter values, in which case
> you would need to discuss quote marks, but your proposal does not.

It's possible I've screwed up handling quote marks.  Do you have a
specific test case you're worried about?  I was surprised as well that
I didn't need to mention quote marks.

>>== Extracting Parameter Values From Header Fields ==
>>
>>To extract the value for a given parameter-name from an unparsed-string, parse
>>the unparsed-string using the following grammar:
>>
>>  unparsed-string = *CHAR name *LWS "=" value [ ";" *CHAR ]
>>  value           = <CHAR, except ";">
>>
>>where the name production is a grammatical production that is a case-insensitive
>>match for the given parameter-name.  If the unparsed-string can be parsed by
>>the grammar in multiple ways, choose the one in which name appears as close to
>>the beginning of the string as possible.  If the unparsed-string cannot be
>>parsed by the grammar above, return the empty string.
>
> This does not work as you intend as CHAR is US-ASCII 0x00 through 0x7F
> so you would never get to filename parameter values that are not UTF-8.

Fixed.

> Not to mention that this is utterly silly, if you have "x<name>=..."
> this would be handled as if the value had a `name` parameter with the
> empty string as value, as opposed to the semantically correct result,
> which would be "there is no `name` parameter".

Perhaps this is another good test case to add to the suite.  I'm
willing to believe this behavior isn't necessary, but I'd like to look
at some more evidence before changing it.

> And as you lack higher
> level control logic to actually separate parameters, this results in
> `example="filename=example.txt"` having the filename `example.txt"`,
> as opposed to the correct result, namely that there is no filename.

Sounds like another good test case.
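
As a starting point for such test cases, the extraction rule quoted
above might be sketched in Python as below.  This is a sketch under
stated assumptions, not the draft's algorithm: the function name is
hypothetical, CHAR is replaced by OCTET, LWS is simplified to spaces
and tabs, and "value" is read as a run of zero or more octets other
than ";".

```python
import re


def extract_parameter(unparsed: bytes, name: bytes) -> bytes:
    """Return the raw value for `name`, or b"" if the grammar does not match."""
    # name *LWS "=" value, where value stops at the first ";" (or the end).
    # Assumption: LWS is reduced to SP / HT.
    pattern = re.compile(re.escape(name) + rb"[ \t]*=([^;]*)", re.IGNORECASE)
    # search() returns the leftmost match, i.e. the parse in which name
    # appears as close to the beginning of the string as possible.
    match = pattern.search(unparsed)
    return match.group(1) if match else b""


# No parameter separation and no quote handling: the name is found
# inside another parameter's quoted value.
print(extract_parameter(b'example="filename=example.txt"', b"filename"))
# prints b'example.txt"'
```

Because nothing here splits the header into parameters first, the
sketch reproduces the `example.txt"` result from Bjoern's example.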

Adam

Received on Wednesday, 1 December 2010 22:54:59 UTC