- From: Alcides Viamontes E <alcidesv@shimmercat.com>
- Date: Mon, 1 Aug 2016 16:10:30 +0200
- To: HTTP Working Group <ietf-http-wg@w3.org>
- Cc: Cory Benfield <cory@lukasa.co.uk>
- Message-ID: <CAAMqGzZJMN_duiG4t4VbpNA_1pE+yxEamG0hgn38aSeztaNEsA@mail.gmail.com>
Hi!

TL;DR: I also think that trying to fit HTTP headers into anything other than their current representation is a bad idea. But creating a semi-formal compilation of rules and behaviours for core HTTP headers would be worth it.

Long rant:

Recently we revamped how ShimmerCat handles HTTP headers, and we ended up creating a separate library with a "Headers Document Object Model". The bare minimum set of headers we needed to understand and manipulate to offer basic functionality is 20, and for each of them we needed to take the following into account:

* Representation round-trip: how header ASCII values are parsed into "things that the program can easily manipulate" (see next), and the other way around, how those are converted back to ASCII values. This is slightly different for HTTP/1.1 and HTTP/2, because of connection-specific headers, the "Cookie: " header, and the rather non-trivial dance with "Host: " and ":authority:".
* What in-memory representation makes sense for the program: "Date: " should be a date, "Cookie: " is a dictionary, "Set-Cookie: " is a set indexed by cookie name, path and perhaps other attributes (exercising the RFC with WordPress teaches you one or two things), "Forwarded: " is actually a list, "Link: " headers from the point of view of a server doing HTTP/2 Push are all different beasts each getting their own thing, and so on. This is of course very program-specific and probably not generally interesting, but it is easier to talk about data structures than about ASCII text when defining:
* How header values combine: there shouldn't be more than one "Date: " in a given response, even if both a proxy server and an application may try to stamp a "Date: ". However, a server may "add" cookies to an application response, and the "Forwarded" header needs to be composed in a sequence. Similar decisions are needed with CORS headers, Link headers, Cache-Control, ETag and so on.
* Headers are extensible, so one needs default policies for header values where there are no RFC dispositions.

I would find the task of fitting all the idiosyncrasies and behaviours of HTTP headers into a common bytes representation (serialisation) daunting without some kind of updated compendium of what they do and how they behave. For example, it would be nice to have a doc similar to RFC 4229, with formalised candidate data structures and algorithms for how intermediaries in different roles should handle the headers.

Furthermore, some HTTP headers are more important or common than others (is anybody using the "From:" header?), or they are relevant to different roles, so maybe we need to group headers in some sensible way, so that we can say, e.g., "my CMS emits core HTTP content headers" or "my server is caching-compliant because it correctly interprets HTTP core caching headers" or "my server/application correctly implements the security measures implied by the core security HTTP headers", whatever any of these may turn out to mean.

On Mon, Aug 1, 2016 at 2:30 PM, Cory Benfield <cory@lukasa.co.uk> wrote:

>
> > On 1 Aug 2016, at 11:50, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> >
> > No matter what we decide, we cannot change how JSON defined their
> > dicts, and consequently whatever we do needs to be mapped into JSON,
> > python, $lang's data models somehow.
>
> JSON, sure, but don’t let Python hold you back. All supported versions of
> Python have an OrderedDict in their standard library. And any Python tool
> dealing with HTTP has inevitably had to invent something like a
> CaseInsensitiveOrderedMultiDict in order to deal with HTTP headers, so any
> tool that’s likely to deal with this kind of thing is already swimming in
> dictionary representations that we can use for ordering fields in header
> values.
>
> So just to clarify: the lack of ordering in a JSON object is a reasonable
> problem with using JSON, but that doesn’t mean we can’t use ordered
> representations in other serialisation formats. Programming languages have
> all the abstractions required to do this, and it’s just not that hard to
> write an Ordered Mapping in $LANG that wraps $LANG’s built-in Mapping type.
> (Hell, some Python interpreters have *all* dicts ordered, such that they
> define OrderedDict by simply writing “OrderedDict = dict”).
>
> Cory
>

./Alcides
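
To make the "how header values combine" point concrete, here is a rough Python sketch of per-header merge policies with a default for headers that have no known rule. Everything in it is hypothetical and for illustration only: the policy functions, the `POLICIES` table and `combine` are invented names, not ShimmerCat's library, and cookies are keyed by name alone, which is a simplification of the name/path/attributes indexing mentioned above.

```python
from typing import Callable, Dict, List, Tuple

Header = Tuple[str, str]
Policy = Callable[[List[str], List[str]], List[str]]


def keep_first(existing: List[str], incoming: List[str]) -> List[str]:
    """Single-valued header (e.g. Date): the earliest stamp wins."""
    return existing[:1] if existing else incoming[:1]


def append_all(existing: List[str], incoming: List[str]) -> List[str]:
    """Sequence header (e.g. Forwarded): keep everything, in order."""
    return existing + incoming


def merge_set_cookie(existing: List[str], incoming: List[str]) -> List[str]:
    """Set-Cookie: a later cookie replaces an earlier one with the same name.

    Assumption: cookies are keyed by name only; a fuller version would
    also look at Path and Domain.
    """
    by_name: Dict[str, str] = {}
    for value in existing + incoming:
        name = value.split("=", 1)[0].strip()
        by_name[name] = value
    return list(by_name.values())


# Hypothetical policy table, keyed by lower-cased header name.
POLICIES: Dict[str, Policy] = {
    "date": keep_first,
    "forwarded": append_all,
    "set-cookie": merge_set_cookie,
}

# Assumed default for extension headers with no known rule.
DEFAULT_POLICY: Policy = append_all


def group(headers: List[Header]) -> Dict[str, List[str]]:
    """Group header values by case-insensitive name, preserving order."""
    grouped: Dict[str, List[str]] = {}
    for name, value in headers:
        grouped.setdefault(name.lower(), []).append(value)
    return grouped


def combine(app_headers: List[Header],
            proxy_headers: List[Header]) -> Dict[str, List[str]]:
    """Combine application headers with headers added by a proxy."""
    app, proxy = group(app_headers), group(proxy_headers)
    names = list(app) + [n for n in proxy if n not in app]
    return {
        name: POLICIES.get(name, DEFAULT_POLICY)(
            app.get(name, []), proxy.get(name, [])
        )
        for name in names
    }
```

Calling `combine(app_headers, proxy_headers)` with two lists of `(name, value)` pairs returns a mapping from lower-cased header name to the combined list of values. The point of the table is only that the merge rule is a property of each header, which is the kind of thing a compendium could write down once.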
Received on Monday, 1 August 2016 17:01:03 UTC