Re: TICKET 259: 'treat as invalid' not defined

On 12.11.2010 08:53, Julian Reschke wrote:
> On 12.11.2010 05:58, Mark Nottingham wrote:
>> I'm confused. I thought that we were going to talk about error
>> handling in an appendix, but it appears you're starting to talk about
>> it here.
>
> 1) Yes, it should be an appendix.
>
> 2) Well, it's parsing advice. It appears that some readers have trouble
> understanding how to derive a parsing strategy from the way we
> currently write specs, so this is an attempt to describe just that.
> ...

Here's an updated proposal (see also 
<http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/259/i259.diff>):

-- snip --

Appendix D.  Parsing

    This document does not require any specific handling of invalid
    header field values.  With this in mind, the text below describes a
    simple strategy for parsing the header field and detecting problems
    in general, or in specific parameters.

D.1.  Combine Multiple Instances of Content-Disposition

    If the HTTP message contains multiple instances of the Content-
    Disposition header field, combine all field values into a single one
    as specified in Section 4.2 of [RFC2616].
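
As a rough illustration of D.1, combining could look like the sketch below. The function name is hypothetical, and it assumes the field values are already available as a list of strings in message order.

```python
def combine_field_values(values):
    """Join several instances of one header field into a single value.

    RFC 2616, Section 4.2 describes appending each subsequent field
    value to the first, separated by a comma, preserving order.
    """
    return ", ".join(v.strip() for v in values)


combined = combine_field_values(["attachment", "inline"])
```

Note that a combined value such as "attachment, inline" contains a comma after the disposition type, so it would not match the simplified grammar in D.2.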

D.2.  Parsing for Disposition Type and Parameters

    Using the simplified grammar below:

      field-value = disp-type *( ";" param )
      disp-type   = token
      param       = token "=" value

    ...parse the field value into a disp-type (disposition type) and a
    sequence of parameters (pairs of name (token) and value).  Lower-case
    all disposition types and parameter names.

    If the field value does not conform to the grammar (such as when not
    exactly one disposition type is specified), ignore the whole header
    field.
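
A minimal sketch of the D.2 step follows. It makes two assumptions that the simplified grammar leaves open: that "value" is either a token or a quoted-string as in RFC 2616, and that optional whitespace is allowed around ";" and "=". The name parse_field_value is made up for this example.

```python
import re

# token characters per RFC 2616, Section 2.2 (no CTLs, no separators)
TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"
# quoted-string, allowing backslash escapes (quoted-pair)
QUOTED = r'"(?:[^"\\]|\\.)*"'

TYPE_RE = re.compile(r"\s*(%s)" % TOKEN)
PARAM_RE = re.compile(r"\s*;\s*(%s)\s*=\s*(%s|%s)" % (TOKEN, TOKEN, QUOTED))


def parse_field_value(field_value):
    """Return (disp_type, [(name, raw_value), ...]), or None when the
    field value does not conform to the simplified grammar."""
    m = TYPE_RE.match(field_value)
    if m is None:
        return None
    disp_type = m.group(1).lower()
    pos = m.end()
    params = []
    while True:
        m = PARAM_RE.match(field_value, pos)
        if m is None:
            break
        # Lower-case the parameter name; keep the value raw for D.4.
        params.append((m.group(1).lower(), m.group(2)))
        pos = m.end()
    # Anything left over (other than whitespace) is a grammar violation.
    if field_value[pos:].strip():
        return None
    return disp_type, params
```

Returning None stands in for "ignore the whole header field"; a caller would then behave as if no Content-Disposition had been sent.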

D.3.  Checking Cardinality Constraints

    If the parameter sequence contains multiple instances of the same
    parameter name, ignore the whole header field.
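
The cardinality check in D.3 could be sketched as below, assuming the parameters arrive as a list of (name, value) pairs with names already lower-cased. The function name is hypothetical.

```python
def has_duplicate_names(params):
    """Return True if any parameter name occurs more than once."""
    names = [name for name, _value in params]
    return len(names) != len(set(names))
```

Under this sketch "filename" and "filename*" count as different names, so a header carrying both is not rejected here.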

D.4.  Post-Process Parameter Values

    For each parameter, post-process the associated value part according
    to the grammar:

    o  According to Section 3.2.1 of [RFC5987] for parameters using the
       RFC 5987 syntax (such as "filename*").  If this fails, just ignore
       this parameter.

    o  According to the grammar for quoted-string (Section 2.2 of
       [RFC2616]) for values starting with a double quote character (").

    o  Verbatim otherwise.

    Note that this step starts with an octet sequence obtained from the
    HTTP message, and results in a sequence of Unicode characters.
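
A possible shape for the D.4 post-processing is sketched below. It assumes the raw value was obtained by a D.2-style parse, so a value starting with a double quote is a complete quoted-string, and that the header octets were already mapped to a Python string. To stay short it only accepts the UTF-8 and ISO-8859-1 charsets and uses a loose pattern for the language tag. All names are made up for the example.

```python
import re
from urllib.parse import unquote_to_bytes

# charset "'" [ language ] "'" value-chars, simplified from RFC 5987
EXT_VALUE_RE = re.compile(
    r"(utf-8|iso-8859-1)'([a-z0-9-]*)'"
    r"((?:%[0-9a-f]{2}|[a-z0-9!#$&+\-.^_`|~])*)",
    re.IGNORECASE,
)


def unquote_quoted_string(raw):
    """Strip the surrounding quotes and resolve backslash escapes."""
    inner = raw[1:-1]
    return re.sub(r"\\(.)", r"\1", inner)


def post_process(name, raw):
    """Return the decoded value, or None if the parameter is to be ignored."""
    if name.endswith("*"):
        m = EXT_VALUE_RE.fullmatch(raw)
        if m is None:
            return None
        try:
            # Percent-decode to octets, then decode with the named charset.
            return unquote_to_bytes(m.group(3)).decode(m.group(1))
        except UnicodeDecodeError:
            return None
    if raw.startswith('"'):
        return unquote_quoted_string(raw)
    return raw  # verbatim otherwise
```

A None result here only drops the one parameter; the rest of the header field remains usable.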

D.5.  Extracting the Disposition Type

    The parsing step (Appendix D.2) has returned the disposition type (to
    be matched case-insensitively), which can be "attachment", "inline",
    or an extension type.  If the type is unknown, treat it like
    "attachment" (see Section 3.2).

D.6.  Determining the File Name

    The parsing and post-processing steps resulted in a set of parameters
    (name/value pairs).  The suggested file name is the value of the
    "filename*" parameter (when present), otherwise the value of the
    "filename" parameter.

    If neither is given, the UA can determine a name based on the
    associated URI; for instance based on the last path segment.

    Otherwise, the UA ought to post-process the suggested filename
    according to Section 3.3. [[anchor10: We could say here that
    UAs may reject filenames for security reasons, such as those with a
    path separator character.]]
-- snip --
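
The file name selection in D.6 could be sketched as below. It assumes the parameters have been collected into a dict of post-processed values, with any ignored parameters already removed, and it uses the percent-decoded last path segment as the URI-based fallback. The function name is hypothetical, and the Section 3.3 post-processing is not shown.

```python
from urllib.parse import unquote, urlsplit


def suggested_filename(params, uri):
    """Prefer "filename*", then "filename", then the last path segment."""
    for key in ("filename*", "filename"):
        if key in params:
            return params[key]
    # Fallback: derive a name from the URI's last path segment.
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    return unquote(last_segment) or None
```

When the path ends in "/", the last segment is empty and the sketch returns None, leaving the choice of name to the UA.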

I'm still nervous about going even this far, imagining how much additional text 
we'd need in Parts 1..7 to do this for all headers.

Best regards, Julian

Received on Friday, 10 December 2010 10:00:02 UTC