Request Routing Information [was: Do we kill the "Host:" header in HTTP/2 ?] from Mark Nottingham on 2013-02-05 (ietf-http-wg@w3.org from January to March 2013)

From: Mark Nottingham <mnot@mnot.net>
Date: Tue, 5 Feb 2013 16:16:47 +1100
To: James M Snell <jasnell@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <2D6247D1-6284-4646-A53F-86DE66327AA5@mnot.net>
Thanks for making concrete proposals, James -- that's helpful.

We had a brief conversation at the F2F about requiring "special" headers (e.g., :scheme :method :host :path) to be at the beginning of the set of headers.

That's effectively a different serialisation of the information here (ignoring the separation of the port). Each approach has advantages and disadvantages, but what might help us move forward here is first figuring out *what* information needs to be separated out, before we talk about the specific format of the bits on the wire.

A few points to consider (trying to move the conversation forward, more than stating a position):

* HTTP/1.1 has two ways of serialising what we call the Effective Request URI in HTTPbis, and I don't think it's too controversial to say that this is bad, and in /2 we should just have one way to do it.

* One of the HTTP/1.1 forms omits the scheme in use. Discussion so far seems to imply that people want the scheme to be explicit in /2. Anyone have any argument as to why not?

* If we do make the scheme explicit, I'd note that HTTPbis allows use of schemes other than HTTP / HTTPS, so we'd need to accommodate that. I.e., a single bit is out.

* Most people seem to see the value in separating the authority portion of the URI into a separate header, because that's routed upon (and it could also benefit from delta-based compression). Anyone disagree?

* Separating the query string from the path would save the origin server a bit of parsing. I see arguments on both sides; who wants to make them?

* Request routing is generally done on the host/port tuple; i.e., the port doesn't have informational value *in the HTTP message* when it's separate from the host. So, I'm not sure about the value proposition of separating it out here; can you illustrate?

* We'll need to do all of this for the response status code as well. Maybe not the phrase; we touched on this briefly at the F2F, and I put forth the opinion that since it's human-readable, and our message format isn't really any more, it doesn't have much utility to actually include in the message. Anyone think it's useful enough to justify the bits?

* We also talked about :version at the F2F, both in requests and responses. I don't think it's necessary, as it's effectively hop-by-hop information, and the connection negotiation + magic takes care of that. Discuss.
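
To make the first few points above concrete, here is a minimal Python sketch of how the two HTTP/1.1 request forms collapse into one set of separated fields. It is only an illustration, not a proposal for the wire format: the function names are hypothetical, the field names :scheme, :host and :path are the ones mentioned at the F2F, and the handling is simplified (it ignores CONNECT and "*" request targets).

```python
from urllib.parse import urlsplit


def routing_fields(uri: str) -> dict:
    """Split an absolute URI into the separated fields under discussion."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query  # query string stays attached to the path here
    return {":scheme": parts.scheme, ":host": parts.netloc, ":path": path}


def from_http11(request_target: str, host_header: str, connection_scheme: str) -> dict:
    """Map either HTTP/1.1 serialisation onto the same separated fields.

    Assumption: only the absolute form and the path-plus-Host form are handled.
    """
    if "://" in request_target:
        # Absolute form: scheme and authority are carried in the request target.
        return routing_fields(request_target)
    # Path form: the scheme is implied by the connection, the authority by Host.
    return {
        ":scheme": connection_scheme,
        ":host": host_header,
        ":path": request_target,
    }


def effective_request_uri(fields: dict) -> str:
    """Reassemble the Effective Request URI from the separated fields."""
    return fields[":scheme"] + "://" + fields[":host"] + fields[":path"]
```

With this sketch, a request for "/foo?a=b" with "Host: example.com" on a plain HTTP connection and a request for "http://example.com/foo?a=b" produce identical fields, which is the single serialisation argued for above. Whether the query string should also be split out of :path is left open, as in the list.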

Cheers,



On 02/02/2013, at 4:22 AM, James M Snell <jasnell@gmail.com> wrote:

> Based on the feedback, we can change this to...
> 
> +------------------------------+
> |S|len(method)|method|len(host)|
> +-+----+----+-+-------+--------+
> | host |port|len(path)|  path  |
> +------------------------------+
> 
> The only change here really is the port field as a uvarint. The path would contain the full query-string and path detail...
> 
> Example: GET https://example.com:443/foo?a=b
> 
>   [83,G,E,T,0B,e,x,a,m,p,l,e,.,c,o,m,BB,03,08,/,f,o,o,?,a,=,b]
> 
> If some other scheme needs to be specified, a separate :scheme header would be specified identifying the scheme. 
> 
> Method names are encoded as text because numeric identifier or abbreviation schemes are just going to add complexity for no demonstrated benefit.
> 
> - James
> 
> 
> 
> On Thu, Jan 31, 2013 at 11:09 PM, James M Snell <jasnell@gmail.com> wrote:
> One proposal on this particular topic... 
> 
> We can combine the :scheme, :method, :host and :path header fields into a single :req Header with a compact binary encoding and require that this single header always appear first in request header blocks.
> 
> +------------------------------+
> |S|len(method)|method|len(host)|
> +-+-------+----+---------+-----+
> |   host  | len(path) |  path  |
> +------------------------------+
>  
> S = Single bit, when set, scheme = https, when not set, scheme = http
> len(method) = 7-bit length of method name
> method = name of method
> len(host) = length of host, encoded as a uvarint
> host = host
> len(path) = length of path, encoded as a uvarint
> path = path
> 
> Examples:
> 
> GET request for http://example.com/foo would be:
> 
>   [03,G,E,T,0B,e,x,a,m,p,l,e,.,c,o,m,04,/,f,o,o]
> 
> POST request for https://example.com/foo would be:
> 
>   [84,P,O,S,T,0B,e,x,a,m,p,l,e,.,c,o,m,04,/,f,o,o]
> 
> Unregistered method 'FOO' request for https://example.com/foo would be:
> 
>   [83,F,O,O,0B,e,x,a,m,p,l,e,.,c,o,m,04,/,f,o,o]
> 
> If the client chooses not to send a host (because it's not needed or whatnot) they simply set len(host) to 0...
> 
>   [03,G,E,T,00,04,/,f,o,o]
> 
> If we do end up going with delta encoding for compression, we can require that the :req Header always be passed using the eref operation (ephemeral reference, meaning that the header is never stored in the compression state). No Huffman coding would be applied to the header, making it very quick and cheap to process.
> 
> Thoughts?
> 
> - James
> 
> 
> 
> On Thu, Jan 31, 2013 at 8:04 AM, Ted Hardie <ted.ietf@gmail.com> wrote:
> On Thu, Jan 31, 2013 at 5:15 AM, Roy T. Fielding wrote:
> 
> >> We had no idea how early we were in the popularity curve of HTTP or
> >> how dominant it would become, but it was clear even then that the
> >> protocol would be very, very common on the network.  In retrospect, it
> >> is clear that we shouldn't have looked at the current installed
> >> base--we should have looked at what we expected eventual use would be.
> >> That makes "the earlier the better" clear.
> >
> > I think your memory is a bit hazy there ... HTTP passed all
> > application protocols other than email in 1995, and by that
> > time (Mar 1996) was roughly double email traffic, IIRC (this was
> > before email-based spam became common).  That's why the WG
> > meeting contained a lot of people who had nothing to do
> > with developing the Web protocols---there was panic in the air.
> >
> 
> I think we can both agree it only got more popular from there.  It may
> well have surpassed email by that discussion, but the hockey stick had
> a lot of run to go.
> 
> > I find it amusing that you think we could have proceeded in any
> > other way without relegating the IETF work to the garbage bin.
> >
> 
> It would have taken agreement from a lot of people, but the web
> community of the time could have decided then to rev the major
> protocol version for that change.  That suggestion was made.  The IETF
> could not have mandated that.  The IETF *never* gets to dictate
> protocol changes--it didn't have police then and it doesn't now, but
> saying that the buy-in for that change didn't happen doesn't equate to
> saying it could not.
> 
> regards,
> 
> Ted
> >> For HTTP 2.0, where we can make non-backward compatible changes, I
> >> personally think the right thing to do is to drop the Host: header
> >> (that version shift is what we were waiting for 17 years ago, after
> >> all).  If there are things folks are getting from side-effects of the
> >> Host header (e.g. proxy targeting), we put them into the bin of
> >> potential requirements for HTTP 2.0 *and create mechanisms to meet
> >> those needs*.
> >>
> >> I think Adrien's proposal for extensions to the host header makes
> >> clear that the need isn't perfectly met by the host header in any
> >> case, so mapping out the real aim and meeting that seems like the best
> >> notion to me.
> >
> > Oddly enough, waka separates the scheme+host routing information from
> > the rest of the URI because that works better with multi-argument
> > methods and message-based encryption. *shrug*
> >
> > ....Roy
> >
> 
> 
> 
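
For anyone wanting to experiment with the revised :req layout quoted above (the version with the port as a uvarint), here is a rough Python sketch of an encoder and decoder. It is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, and "uvarint" is assumed to mean the little-endian base-128 encoding with a continuation bit, which is consistent with the BB,03 bytes shown for port 443 in the quoted example.

```python
from typing import Tuple


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("uvarint is unsigned")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_uvarint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a uvarint starting at buf[pos]; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode_req(secure: bool, method: str, host: str, port: int, path: str) -> bytes:
    """Build S | len(method) | method | len(host) | host | port | len(path) | path."""
    m = method.encode("ascii")
    h = host.encode("ascii")
    p = path.encode("ascii")
    if len(m) > 0x7F:
        raise ValueError("method name does not fit the 7-bit length field")
    first = (0x80 if secure else 0x00) | len(m)  # S bit shares a byte with len(method)
    return bytes([first]) + m + uvarint(len(h)) + h + uvarint(port) + uvarint(len(p)) + p


def decode_req(buf: bytes) -> Tuple[bool, str, str, int, str]:
    """Inverse of encode_req; returns (secure, method, host, port, path)."""
    secure = bool(buf[0] & 0x80)
    mlen = buf[0] & 0x7F
    pos = 1
    method = buf[pos:pos + mlen].decode("ascii")
    pos += mlen
    hlen, pos = read_uvarint(buf, pos)
    host = buf[pos:pos + hlen].decode("ascii")
    pos += hlen
    port, pos = read_uvarint(buf, pos)
    plen, pos = read_uvarint(buf, pos)
    path = buf[pos:pos + plen].decode("ascii")
    return secure, method, host, port, path
```

Calling encode_req(True, "GET", "example.com", 443, "/foo?a=b") yields the same 27 bytes as the quoted example, starting with 83 and carrying the port as BB,03. Note that the single S bit can only distinguish http from https, which is why a separate :scheme header is still needed for anything else.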

--
Mark Nottingham   http://www.mnot.net/
Received on Tuesday, 5 February 2013 05:17:48 UTC