Re: draft-ietf-httpbis-header-structure: handling multiple field values

On Tue, May 12, 2020 at 1:47 PM Julian Reschke <julian.reschke@gmx.de>
wrote:

> On 12.05.2020 19:39, Ian Clelland wrote:
> > This is mentioned in
> > https://httpwg.org/http-extensions/draft-ietf-httpbis-header-structure.html#rfc.section.4.2 --
> > "parsers MUST combine all lines in the same section (header or trailer)
> > that case-insensitively match the field name into one comma-separated
> > field-value", (with the warning given that strings split across multiple
> > field values will have "unpredictable results") -- So I don't think
> > you're allowed to parse them separately. If both exist in the same
> > message, they must be combined before parsing.
> > ...
>
> Indeed. Looking at this again, I realize that a paragraph below then
> confused me:
>
> "Strings split across multiple field lines will have unpredictable
> results, because comma(s) and whitespace inserted upon combination will
> become part of the string output by the parser. Since concatenation
> might be done by an upstream intermediary, the results are not under the
> control of the serializer or the parser."
>
> I read this to mean that errors might be detected early or not, but
> maybe this is just a warning that the actual string used for
> concatenation can vary?
>
> If that's the intent, I'd call that a spec bug. A string value split
> across multiple field instances is very clearly a violation of what HTTP
> says about list-shaped header fields, and not allowing a recipient to
> detect that seems incorrect to me.
>

Definitely a spec bug -- not sure which spec, though.
RFC 7230 (Section 3.2.2) reads:

> A sender MUST NOT generate multiple header fields with the same field name
> in a message unless either the entire field value for that header field is
> defined as a comma-separated list [i.e., #(values)] or the header field is
> a well-known exception (as noted below).


Perhaps it should also say that being defined as a comma-separated list is
not enough on its own: where the field value can contain commas with other
semantic meanings (inside a quoted string, for example), the split must
*also* fall between list elements.
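
To make "the split must be between list elements" concrete, here is a rough
sketch of a check a sender could run before emitting multiple field lines.
The function names are made up for illustration and come from neither spec,
and the sketch assumes the only construct that can hide a comma is a
double-quoted string with backslash escapes:

```python
from typing import Iterable


def has_unterminated_string(field_line: str) -> bool:
    """Return True if the line, taken on its own, has unbalanced quotes."""
    in_string = False
    escaped = False
    for ch in field_line:
        if in_string:
            if escaped:
                escaped = False      # the escaped character is consumed as-is
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
    return in_string


def split_is_between_elements(field_lines: Iterable[str]) -> bool:
    """Assumption: a split is safe only if no line breaks a quoted string."""
    return not any(has_unterminated_string(line) for line in field_lines)


print(split_is_between_elements(['"a"', '"b, c"']))      # split between elements
print(split_is_between_elements(['"hello', 'world"']))   # split inside a string
```

Note that this only works on the individual field lines; once they have been
combined, the same information is no longer available.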

It goes on to say:

> A recipient MAY combine multiple header fields with the same field name
> into one "field-name: field-value" pair, without changing the semantics of
> the message, by appending each subsequent field value to the combined field
> value in order, separated by a comma.


and maybe the phrase "without changing the semantics of the message" means
that the recipient is only free to join the fields if doing so doesn't
change the semantics (indirectly implying that the field shouldn't have been
split up within a quoted string in the first place), but it doesn't really
read that way.
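
For illustration, a minimal sketch of the combination step as described in
the quoted text (append each subsequent field value, separated by a comma).
The function name and the `separator` parameter are my own assumptions; the
parameter is only there to show that different combiners may insert
different whitespace:

```python
def combine_field_lines(field_lines, separator=", "):
    """Join same-named field values in order, separated by a comma."""
    return separator.join(line.strip() for line in field_lines)


# Split between list elements: the combined value is the same list.
print(combine_field_lines(['"a"', '"b"']))

# Split inside a quoted string: the combined value is still well-formed,
# but the inserted comma and space are now part of the string.
print(combine_field_lines(['"hello', 'world"']))

# A combiner that uses a bare comma produces a different string.
print(combine_field_lines(['"hello', 'world"'], separator=","))
```

In the second and third cases a parser that only sees the combined value has
no way to tell that the string was ever split, which is why detection would
have to happen before the lines are joined.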

Ian

Received on Tuesday, 12 May 2020 18:45:27 UTC