- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Fri, 10 Dec 2010 10:59:20 +0100
- To: Mark Nottingham <mnot@mnot.net>
- CC: Adam Barth <ietf@adambarth.com>, httpbis <ietf-http-wg@w3.org>
On 12.11.2010 08:53, Julian Reschke wrote:
> On 12.11.2010 05:58, Mark Nottingham wrote:
>> I'm confused. I thought that we were going to talk about error
>> handling in an appendix, but it appears you're starting to talk about
>> it here.
>
> 1) Yes, it should be an appendix.
>
> 2) Well, it's parsing advice. It appears that some readers have trouble
> understanding how to derive a parsing strategy from the way we
> currently write specs, so this is an attempt to describe just that.
> ...
Here's an updated proposal (see also
<http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/259/i259.diff>):
-- snip --
Appendix D. Parsing
This document does not require any specific handling of invalid
header field values. With this in mind, the text below describes a
simple strategy for parsing the header field and detecting problems
in general, or in specific parameters.
D.1. Combine Multiple Instances of Content-Disposition
If the HTTP message contains multiple instances of the Content-
Disposition header field, combine all field values into a single one
as specified in Section 4.2 of [RFC2616].
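As a rough illustration of this step, the sketch below joins the field values with a comma, which is how Section 4.2 of [RFC2616] describes combining repeated header fields. The function name combine_field_values and the list-of-strings input are assumptions made for the example, not part of the proposal.

```python
def combine_field_values(values):
    """Join repeated Content-Disposition field values into a single value.

    `values` is assumed to hold the field values in message order.
    """
    # RFC 2616, Section 4.2: append each subsequent value, separated by a comma.
    return ", ".join(value.strip() for value in values)


print(combine_field_values(["attachment", "inline"]))
```

Note that a combined value such as "attachment, inline" no longer matches the simplified grammar in D.2, so the next step would ignore it.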
D.2. Parsing for Disposition Type and Parameters
Using the simplified grammar below:
field-value = disp-type *( ";" param )
disp-type = token
param = token "=" value
...parse the field value into a disp-type (disposition type) and a
sequence of parameters (pairs of name (token) and value). Lower-case
all disposition types and parameter names.
If the field value does not conform to the grammar (for example, when
it does not contain exactly one disposition type), ignore the whole
header field.
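A minimal sketch of this step might look as follows. It assumes that "value" is either a token or a quoted-string as defined in [RFC2616], that optional whitespace is allowed around ";" and "=", and that the field value has already been turned into a Python string. The names parse_field_value, TOKEN and QUOTED are hypothetical.

```python
import re

# token and quoted-string roughly as in RFC 2616, Section 2.2 (assumed here).
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'

DISP_TYPE_RE = re.compile(r"\s*(%s)\s*" % TOKEN)
PARAM_RE = re.compile(r";\s*(%s)\s*=\s*(%s|%s)\s*" % (TOKEN, TOKEN, QUOTED))


def parse_field_value(field_value):
    """Return (disp_type, [(name, raw_value), ...]), or None if the
    field value does not conform to the simplified grammar."""
    match = DISP_TYPE_RE.match(field_value)
    if match is None:
        return None
    disp_type = match.group(1).lower()
    pos = match.end()
    params = []
    while pos < len(field_value):
        match = PARAM_RE.match(field_value, pos)
        if match is None:
            return None  # non-conforming: ignore the whole header field
        # Parameter names are lower-cased; values are kept raw for D.4.
        params.append((match.group(1).lower(), match.group(2)))
        pos = match.end()
    return disp_type, params
```

Returning None stands in for "ignore the whole header field"; values are left unprocessed because post-processing is a separate step.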
D.3. Checking Cardinality Constraints
If the parameter sequence contains multiple instances of the same
parameter name, ignore the whole header field.
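The check itself is small. The sketch below assumes the parameters are available as a list of (name, value) pairs with names already lower-cased, as produced by the previous step; has_duplicate_names is a hypothetical helper name.

```python
def has_duplicate_names(params):
    """Return True if any parameter name occurs more than once."""
    names = [name for name, _value in params]
    return len(names) != len(set(names))


# A header field with two "filename" parameters would be ignored entirely.
print(has_duplicate_names([("filename", "a.txt"), ("filename", "b.txt")]))
```

"filename" and "filename*" are different parameter names, so a field carrying both passes this check.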
D.4. Post-Process Parameter Values
For each parameter, post-process the associated value part according
to the grammar:
o According to Section 3.2.1 of [RFC5987] for parameters using the
RFC 5987 syntax (such as "filename*"). If this fails, just ignore
this parameter.
o According to the grammar for quoted-string (Section 2.2 of
[RFC2616]) for values starting with a double quote character (").
o Verbatim otherwise.
Note that this step starts with an octet sequence obtained from the
HTTP message, and results in a sequence of Unicode characters.
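The three cases above could be sketched roughly as below. Several things here are assumptions: the raw values are Python strings rather than octets, only the UTF-8 and ISO-8859-1 charsets are accepted for the RFC 5987 syntax, and the helper names (decode_ext_value, unquote_string, post_process) are made up for the example.

```python
import re
from urllib.parse import unquote_to_bytes

# value-chars from RFC 5987: pct-encoded or attr-char.
VALUE_CHARS_RE = re.compile(r"(?:%[0-9A-Fa-f]{2}|[A-Za-z0-9!#$&+\-.^_`|~])*")


def decode_ext_value(raw):
    """Decode charset'language'value-chars; return None on failure."""
    parts = raw.split("'", 2)
    if len(parts) != 3:
        return None
    charset, _language, encoded = parts
    if charset.lower() not in ("utf-8", "iso-8859-1"):
        return None
    if not VALUE_CHARS_RE.fullmatch(encoded):
        return None
    try:
        return unquote_to_bytes(encoded).decode(charset)
    except UnicodeDecodeError:
        return None


def unquote_string(raw):
    """Strip the surrounding quotes and resolve quoted-pairs (backslash escapes)."""
    return re.sub(r"\\(.)", r"\1", raw[1:-1])


def post_process(params):
    """Map (name, raw_value) pairs to a dict of post-processed values."""
    result = {}
    for name, raw in params:
        if name.endswith("*"):
            value = decode_ext_value(raw)
            if value is None:
                continue  # ignore just this parameter
        elif raw.startswith('"'):
            value = unquote_string(raw)
        else:
            value = raw  # verbatim
        result[name] = value
    return result
```

A failed RFC 5987 decode drops only that parameter, so a plain "filename" sent alongside a broken "filename*" is still usable.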
D.5. Extracting the Disposition Type
The parsing step (Appendix D.2) has returned the disposition type (to
be matched case-insensitively), which can be "attachment", "inline",
or an extension type. If the type is unknown, treat it like
"attachment" (see Section 3.2).
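In code, the fallback for unknown types could look like the sketch below. The set KNOWN_TYPES and the function name effective_disposition are hypothetical; a real UA might recognise further extension types.

```python
KNOWN_TYPES = {"attachment", "inline"}


def effective_disposition(disp_type):
    """Match case-insensitively; treat unknown types like "attachment"."""
    disp_type = disp_type.lower()
    return disp_type if disp_type in KNOWN_TYPES else "attachment"


print(effective_disposition("INLINE"))
print(effective_disposition("x-preview"))
```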
D.6. Determining the File Name
The parsing and post-processing steps resulted in a set of parameters
(name/value pairs). The suggested file name is the value of the
"filename*" parameter (when present), otherwise the value of the
"filename" parameter.
If neither is given, the UA can determine a name based on the
associated URI; for instance based on the last path segment.
Otherwise, the UA ought to post-process the suggested filename
according to Section 3.3. [[anchor10: We could say here that UAs may
reject filenames for security reasons, such as those with a path
separator character.]]
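The selection order described in this section might be sketched as follows. It assumes the post-processed parameters are held in a dict keyed by lower-cased name, and it uses the percent-decoded last path segment of the URI as the fallback. The function name suggested_filename is hypothetical, and the post-processing referred to in Section 3.3 is deliberately left out.

```python
from urllib.parse import unquote, urlsplit


def suggested_filename(params, uri):
    """Prefer "filename*", then "filename", then the last URI path segment."""
    for name in ("filename*", "filename"):
        if name in params:
            return params[name]
    # Neither parameter is given: fall back to the last path segment.
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    return unquote(last_segment) or None


print(suggested_filename({}, "http://example.com/files/report%20final.pdf"))
```

The result is only a suggestion; any filtering of the name would happen in a later step that this sketch does not cover.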
-- snip --
I'm still nervous about going even this far, considering how much
additional text we'd need in Parts 1..7 to do this for all headers.
Best regards, Julian
Received on Friday, 10 December 2010 10:00:02 UTC