Re: JSON headers from Andy Green on 2016-07-11 (ietf-http-wg@w3.org from July to September 2016)

From: Andy Green <andy@warmcat.com>
Date: Mon, 11 Jul 2016 12:37:19 +0800
To: "Martin J." Dürst <duerst@it.aoyama.ac.jp>, Poul-Henning Kamp <phk@phk.freebsd.dk>, Julian Reschke <julian.reschke@gmx.de>
Cc: Yanick Rochon <yanick.rochon@gmail.com>, Phil Hunt <phil.hunt@oracle.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <1468211839.6746.67.camel@warmcat.com>

On Mon, 2016-07-11 at 13:05 +0900, Martin J. Dürst wrote:
> Hello Poul-Henning,
> 
> On 2016/07/11 07:20, Poul-Henning Kamp wrote:
> 
> > If we instead, as I propose, require that JSON headers *never* be
> > split, then it becomes both possible and rather obviously smarter
> > to define this header as a JSON object, keyed by the media type:
> > 
> >  Accept: {      \
> >   "text/plain": <JSON for "q=0.5">, \
> >   "text/html": <JSON for no parameter>, \
> >   "text/x-dvi": <JSON for "q=0.8">, \
> >   "text/x-c": <JSON for no parameter> \
> >  }
> > 
> > A sender wishing to modify the priority just sets the
> > corresponding JSON object using the native language's
> > JSON facility:
> > 
> >  req.accept["text/plain"] = <JSON for "q=0">
> 
> My understanding is that you are extremely concerned about the speed
> at which headers can be processed. My guess would be that
> deserializing, changing, and reserialising JSON headers takes more
> time than detecting/processing duplicate headers. But I of course
> might be wrong.

I'm a bit bemused why the world needs JSON headers instead of the cool
stuff for header coding in http/2, but I can give one point of view
related to duplicate headers and efficiency.

In libwebsockets we use bytewise state machines for everything,
including http/1.x header parsing.  Normally the library tries to stay
out of the way of the application code and provide events and
information to it as it becomes available, without the need for
buffering on the library side.

But in the case of http/1.x headers, we can't give the application any
definitive report on a header's payload until we have received the
whole lot, since a later header of the same name may append to an
earlier one at any point.  Further, it means we have to keep the whole
payload of every recognized header around in case it is appended to
later.
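
To make the problem concrete, here is a minimal sketch in Python (not
libwebsockets code; the name collect_headers and the sample header
values are made up for illustration).  It merges repeated headers with
a comma, so no value is known to be final until the blank line that
ends the header block:

```python
def collect_headers(lines):
    """Merge repeated header fields into one value per name.

    Every payload has to stay in memory until the end, because any
    later line may still append to an earlier header.
    """
    merged = {}
    for line in lines:
        if line == "":  # blank line ends the header block
            break
        name, _, value = line.partition(":")
        key = name.strip().lower()  # header names are case-insensitive
        value = value.strip()
        if key in merged:
            merged[key] += ", " + value  # append to the earlier payload
        else:
            merged[key] = value
    return merged


headers = collect_headers([
    "Accept: text/plain; q=0.5",
    "Host: example.com",
    "Accept: text/html",  # arrives late, changes the Accept result
    "",
])
print(headers["accept"])  # text/plain; q=0.5, text/html
```

The second Accept line only shows up after Host, which is why nothing
can be passed up early.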

Actually it'd be nice, and efficient, if we could assemble one header
payload in the library, pass it up to the application to copy or act
on, and then reuse the buffer, as we go through the incoming, possibly
fragmented, header content.

Parsing the JSON out of it is very cheap, and fits well into the
general bytewise parser for the header stream.  But having to defer a
definitive result until all the headers have arrived, as http/1.x
requires now, is painful if you are serious about memory efficiency.

In the actual scenario being asked about, suppose you can eliminate the
need to keep the final result for every header pending until all the
headers have arrived (in JSON, by saying each header may appear only
once, with multiple values carried in an array inside it).  Then you
can reissue the headers onward one by one as they are parsed, without
storing them all, which is radically more efficient if what you're
doing with them allows it.
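
As a rough sketch of what that would allow (again Python rather than
the C in libwebsockets; stream_headers, on_header and the sample JSON
payload are hypothetical, and it simply assumes the sender obeys the
"each header only once" rule), one buffer is reused for every header
and each result is handed up as soon as its line completes:

```python
def stream_headers(chunks, on_header):
    """Parse possibly fragmented header bytes one byte at a time.

    on_header(name, value) is called as soon as a header line is
    complete; only the header currently being assembled is buffered.
    """
    buf = bytearray()  # single buffer, reused for every header
    for chunk in chunks:
        for byte in chunk:
            if byte != 0x0A:  # anything but LF is part of the line
                buf.append(byte)
                continue
            line = bytes(buf).rstrip(b"\r")
            buf.clear()  # buffer is free again for the next header
            if not line:
                return  # blank line ends the header block
            name, _, value = line.partition(b":")
            on_header(name.strip().lower().decode("ascii"),
                      value.strip().decode("utf-8"))


forwarded = []
stream_headers(
    [b'Accept: {"text/plain"', b": 0.5}\r\nHost: exam", b"ple.com\r\n\r\n"],
    lambda name, value: forwarded.append((name, value)),
)
print(forwarded)
```

Here the callback just collects the pairs, but it could equally write
each header out to the next hop and forget it.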

-Andy

> Could you give some more background on why speed-wise, de/serializing
> is okay for you, but duplicate detection isn't?
> 
> > But this time we can shut them all with one single line of text:
> > 
> >  "Duplicate keys in JSON objects SHALL cause and be treated
> >  as connection failure."
> 
> How are you going to tell your favorite JSON library to behave that
> way?
> 
> Regards,   Martin.
> 
>

Received on Monday, 11 July 2016 04:37:52 UTC