Re: JSON headers from Willy Tarreau on 2016-07-10 (ietf-http-wg@w3.org from July to September 2016)

From: Willy Tarreau <w@1wt.eu>
Date: Sun, 10 Jul 2016 20:26:21 +0200
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Julian Reschke <julian.reschke@gmx.de>, Phil Hunt <phil.hunt@oracle.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20160710182621.GA7848@1wt.eu>

On Sun, Jul 10, 2016 at 10:10:26AM +0000, Poul-Henning Kamp wrote:
> --------
> In message <94e4a5c2-3465-fef3-6221-d9f4fcccb5fa@gmx.de>, Julian Reschke writes:
> 
> >But right now the spec *is* written to use the list construct, and I 
> >believe that's a good thing, as it's IMHO better to consider multiple 
> >instances as legal, and require the definition of the header field to 
> >deal with it.
> 
> I think it is a bad thing.
> 
> It prevents streaming processing of headers, since you never know
> when you have the full picture for a particular header, until you've
> received them all and seen that there are no more instances.
> 
> It also means that either you have to rewrite the headers, or
> all your code needs to do the brute-force collection scan and handle
> an array of headers for further processing.  Both of which are wasteful
> in terms of CPU and memory.
> 
> I see no advantages that come even close to compensating for those
> disadvantages, but if I have overlooked something, please enlighten me...

I have a different view on this matter. We've seen over the years
that people were introducing headers for a specific usage and were pretty
sure these ought to be unique. But in the end they were not. The best
example of this is X-Forwarded-For. Each reverse-proxy adds it. Most
implementations never used to consider it as a list as nothing was clear
regarding this, and initial designs were simply "setting" the header.
Some naive designs just used to append it after all headers. Some would
replace the CR LF CR LF with "CR LF xff CR LF CR LF". Each of them was
pretty sure it was doing the right thing for a supposedly unique header.

But they were wrong. In fact they were doing the right thing for a list
and not the right thing for a single header. And that's fortunate because
now we all know that seeing chained XFF is something very common when you
start to stack a load balancer and a cache for example.
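
To make the chaining concrete, here is a minimal sketch in Python of two
stacked intermediaries that each do the naive "append before the final
CR LF CR LF" trick described above, followed by a reader that treats the
result as one list. The function names, the sample request and the
addresses are assumptions made for illustration; this is not taken from
any real proxy.

```python
def naive_append_xff(raw_request: bytes, client_ip: str) -> bytes:
    """Insert an X-Forwarded-For line just before the blank line that ends the headers."""
    tail = b"\r\nX-Forwarded-For: " + client_ip.encode("ascii") + b"\r\n\r\n"
    # Replace only the first CR LF CR LF, i.e. the end of the header block.
    return raw_request.replace(b"\r\n\r\n", tail, 1)


def xff_chain(raw_request: bytes) -> list:
    """Collect every X-Forwarded-For value, in order, across all occurrences."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    chain = []
    for header_line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = header_line.partition(":")
        if sep and name.strip().lower() == "x-forwarded-for":
            # One occurrence may itself carry a comma-separated list.
            chain.extend(v.strip() for v in value.split(",") if v.strip())
    return chain


request = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
request = naive_append_xff(request, "192.0.2.1")     # hypothetical load balancer
request = naive_append_xff(request, "198.51.100.7")  # hypothetical cache
print(xff_chain(request))  # ['192.0.2.1', '198.51.100.7']
```

Neither hop knows about the other, yet the reader recovers the full
chain, which is the list behaviour those implementations got by accident.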

All this to say we cannot decide in advance how headers will be produced
nor what they will mean. But that's not a problem. The problem is with
how the headers are interpreted. That's what we need to work on. I'd
instead suggest that all new headers are lists, that they will support
multiple occurrences, and that for all those where it's not obvious what
multiple occurrences will mean, we explicitly mention that only the first
one should be used. We can then have a predictable behaviour because the
first header found allows us to take a routing decision on the fly,
unless we know something specific regarding this one. And this way,
appending new values (a la XFF) will have no unexpected side effect on
those not willing to consume them.
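
The "first occurrence wins" rule can be sketched as follows, again in
Python and only as an illustration. The header name X-Route, the pool
names and the helper names are hypothetical. The point is that the
consumer stops reading as soon as it has seen the first occurrence, so
later values appended by other hops do not affect it.

```python
def first_value(header_stream, name):
    """Return the first value of `name`, reading no further than needed."""
    wanted = name.lower()
    for line in header_stream:
        if line == "":  # an empty line ends the header block
            break
        field, sep, raw = line.partition(":")
        if sep and field.strip().lower() == wanted:
            return raw.strip()  # decide now; later occurrences are ignored
    return None


consumed = []  # records how far into the stream the consumer had to read


def incoming():
    """Simulate header lines arriving one at a time."""
    for line in ["Host: example.org", "X-Route: pool-a",
                 "Accept: */*", "X-Route: pool-b", ""]:
        consumed.append(line)
        yield line


decision = first_value(incoming(), "x-route")
print(decision)       # pool-a
print(len(consumed))  # 2
```

Only two lines were read before the decision was taken; the appended
"pool-b" value was never looked at.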

Willy

Received on Sunday, 10 July 2016 18:26:53 UTC