
Re: NEW ISSUE: repeating non-list-type-headers

From: Jamie Lokier <jamie@shareable.org>
Date: Tue, 20 Nov 2007 21:18:19 +0000
To: Jamie Lokier <jamie@shareable.org>
Cc: Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>, David Morris <dwm@xpasc.com>
Message-ID: <20071120211819.GB20642@softmodem.org>

Jamie Lokier wrote:
> 
> Julian Reschke wrote:
> > Now this seems to be kind of backwards, wouldn't it be *much* clearer if 
> > it said:
> > 
> >    Multiple message-header fields with the same field-name MUST NOT be
> >    present in a message unless the entire field-value for that
> >    header field is defined as a comma-separated list [i.e., #(values)].
> 
> It would be clearer, but it would clash with reality.  All web servers
> and web clients use Set-Cookie, which is prohibited by that.

After reading the rest of this thread, I see that you didn't
change the meaning (I was mistaken); you simply clarified it
(notwithstanding the subtleties of double negatives and permission
vs. not-denial etc.).  So I withdraw any objection on that basis.

However, I still have a point, which is mainly a response to your
other query, and I offer an alternative clarification which spells it
out more.

> > That being said, do we have a recommendation for recipients when that 
> > requirement is violated? I would assume that servers SHOULD return a 400 
> > (Bad Request), but what about clients?

Recipients don't always have the necessary information to decide which
headers have comma-separated syntax.  The meaning of some headers may
depend on which resource is requested and on other factors outside the
scope of the general-purpose HTTP part of an implementation.

Only recipients which _interpret_ a particular header are likely to
have this information for that header.  In that case, perhaps it's
reasonable to say _those_ SHOULD return 400 Bad Request.

However, I think that's a bit demanding.  There are quite a few client
and server implementations which parse HTTP headers into a key->value
dictionary at an early stage, folding duplicates together, and pass
that on to application code; only the application code has knowledge
of the meaning of some headers.  It works fine even on the big nasty
internet.  (Set-Cookie is handled separately.)

That architecture seems reasonable to me, so I propose replacing
SHOULD with MAY, as in "... MAY return a 400 (Bad Request)".
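
For concreteness, here is a minimal sketch of the architecture I mean,
assuming the header block has already been split into (name, value)
pairs.  The name fold_headers and the choice of lower-cased keys are
mine, for illustration only; this is not taken from any particular
implementation.

```python
def fold_headers(pairs):
    """Fold duplicate headers into a name -> value dict at an early stage.

    Duplicates are joined with ", " in arrival order.  Set-Cookie is
    collected separately, because its values cannot safely be joined
    with commas.
    """
    folded = {}
    cookies = []
    for name, value in pairs:
        key = name.strip().lower()   # field-names compare case-insensitively
        value = value.strip()
        if key == "set-cookie":
            cookies.append(value)
        elif key in folded:
            folded[key] = folded[key] + ", " + value
        else:
            folded[key] = value
    return folded, cookies
```

Note that nothing in this layer knows which headers have list syntax;
only the application code reading the dictionary could decide whether
a folded value is an error.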


Dave Morris wrote:
> In the end, quite simple ... if the recipient doesn't understand the
> message, it should report an error and reject the message.  [...]

I agree, and add that aspects of the message which the recipient
doesn't care about should stay ignored.

> There really isn't that much point in folding headers and in fact this
> possibility makes parsing more difficult.

But it's required now; it really occurs in the wild with some headers.

> What a revised spec should do is
> focus on interoperability and describe requirements which insure
> interoperability...

I agree, and think the old spec is a bit weak in some
found-in-practice interop areas.

> a. The order of the values of repeated header must be preserved

Yes.

> b. The order of repeated headers known to have list values MAY be
>    folded OR unfolded at the convenience of the processing entity.

True when interpreting headers, but please don't write a proxy which
forwards those folded/unfolded headers - especially not a
"transparent" proxy.  A few buggy clients/servers do process the two
differently, and occasionally one needs to be explicit as a workaround
for some problem, and proxies "normalising" things does not help.
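
A rough sketch of what I mean, assuming a proxy that keeps the
received headers as an ordered list of (name, value) pairs.  Both
function names are hypothetical.  The point is only that the merged
form is a private view for interpretation, while forwarding works
from the untouched list.

```python
def header_lines_to_forward(raw_pairs):
    """Emit one line per received header, in the original order.

    Nothing is folded or unfolded here: two Via headers in, two out.
    """
    return [f"{name}: {value}" for name, value in raw_pairs]


def merged_for_interpretation(raw_pairs, field_name):
    """Return the comma-joined value of one header, or None if absent.

    This is for the proxy's own decisions only; the result is never
    written back into the forwarded message.
    """
    wanted = field_name.lower()
    values = [value.strip() for name, value in raw_pairs
              if name.strip().lower() == wanted]
    return ", ".join(values) if values else None
```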

> Where it doesn't matter, the specification should not impose restrictions
> since there is no power of enforcement.

I echo that, when it comes to things like what to do when sent some
kinds of technically malformed message.  However, restrictions which
say what to send (and what not) are good for interoperability, as are
requirements that insist everyone parse different but equivalent
things the same way.

How about this?  It's a bit long, but I think it's clear, reflects
common practice as well as suggesting good practice, includes Julian's
suggestion to reject (but only when appropriate), and is equivalent to
Dave's suggestion "folded or unfolded at the convenience" without
putting it that way.



Proposed text:

Duplicate headers
=================

1. Duplicate headers means multiple headers with the same
   field-name.  Case differences and LWS before the colon MUST be
   ignored in the comparison.

2. Messages MUST NOT have duplicate headers, except as permitted:

      + Headers whose field-value syntax is a comma-separated list.

      + More generally, when explicitly permitted by other
        specifications and applications, whose syntax is such that
        concatenating syntactically valid values with "," (with and
        without surrounding LWS) does not change the interpretation.

      + Headers received and forwarded unmodified by a proxy (except
        leading and trailing LWS and multi-line formatting changes,
        and field-name case changes).

      + Set-Cookie in a response message, due to historical accident.

3. An implementation SHOULD NOT reject a message for containing
   duplicate headers unknown to the implementation.

4. At the point where specific headers are interpreted during message
   processing, if duplicates are present and not permitted as
   described above, the message SHOULD be rejected as malformed.

5. An implementation MAY reject the message earlier using a list of
   headers for which duplicates are not permitted (e.g., at least
   those defined in this specification whose syntax is not a
   comma-separated list).

6. The meaning of duplicate headers whose field-value syntax is a
   comma-separated list, provided the individual values satisfy that
   syntax, is equivalent to concatenating the elements of each list,
   preserving the order.  The transformation of section 7 gives the
   same result.  Implementations MUST respect this equivalence.

7. When interpreting any header, implementations MAY merge duplicates
   by concatenating the values with "," between them (optionally with
   LWS), preserving the order.  This is permitted for all headers and
   independent of syntax.  In practice, some implementations do merge
   all duplicate headers in this way internally, except for
   Set-Cookie, and the technique does satisfy this specification.
   However, see sections 4 and 5 for preferred behaviours.

8. When a proxy forwards particular headers without modification
   (except leading and trailing LWS and multi-line formatting changes,
   and field-name case changes), duplicate headers MUST be forwarded
   separately in their original order.  A proxy may still apply
   sections 4, 5, 6 and 7 separately to header interpretation, and it
   may replace duplicate headers with the concatenated form for those
   headers whose value is modified prior to forwarding.
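
To make sections 1, 3, 4, 5 and 7 a little more concrete, here is a
sketch of how a recipient might apply them.  It is only an
illustration under assumptions: the function names are made up, and
SINGLE_VALUE is a deliberately tiny example list, not a complete
table of headers from any specification.

```python
# Example headers for which duplicates are known not to be permitted
# (section 5).  Illustrative and incomplete.
SINGLE_VALUE = {"content-length", "content-type", "host"}


def canonical_name(raw_name):
    """Section 1: ignore case and LWS before the colon."""
    return raw_name.rstrip(" \t").lower()


def group_headers(pairs):
    """Group values by canonical field-name, preserving arrival order."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(canonical_name(name), []).append(value.strip())
    return grouped


def early_reject(grouped):
    """Section 5: names that justify rejecting the message early.

    Only headers on the known list are considered, so duplicates of
    unknown headers never cause rejection (section 3).
    """
    return sorted(name for name, values in grouped.items()
                  if len(values) > 1 and name in SINGLE_VALUE)


def interpret(grouped, name, allows_duplicates):
    """Sections 4 and 7: called by the code that understands `name`.

    Returns the comma-joined value, None if the header is absent, or
    raises ValueError if duplicates are present but not permitted.
    """
    values = grouped.get(name)
    if values is None:
        return None
    if len(values) > 1 and not allows_duplicates:
        raise ValueError("duplicate header not permitted: " + name)
    return ", ".join(values)
```

The caller of interpret() supplies allows_duplicates because, as
argued above, only the code interpreting a given header reliably
knows its syntax.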


-- Jamie
Received on Tuesday, 20 November 2007 21:17:53 GMT
