Subject: Re: draft-kinnear-httpbis-http2-transport questions

From: Andy Green <andy@warmcat.com>
Date: Sat, 23 Mar 2019 09:17:43 +0800
To: Eric Kinnear <ekinnear@apple.com>
Cc: ietf-http-wg@w3.org, Kari Hurtta <hurtta-ietf@elmme-mailer.org>
Message-ID: <1ebbbecc-8348-485d-a7cb-289ca2bd3829@warmcat.com>

On 23/03/2019 06:39, Eric Kinnear wrote:
> Hi Andy, Kari,
> 
> Thanks for the comments and discussion here!
> 
> As Kari mentioned, this does specify two new :protocol names and is a 
> fairly straightforward extension to RFC8441.
> 
> In terms of negotiation of subprotocols/next protocol there are a few 
> options that come to mind:
> 
> (1) Treat this like TCP and leave it up to the data running over this 
> arbitrary bytestream to negotiate what it wants to do. Note that 
> hostname/port/etc. are all just as available here as to TLS. That’s some 
> of what’s currently in the document, but I suspect we’ll want more than 
> that, for many of the reasons outlined in your emails.
> However, folks using existing protocols transported over H/2, which is 
> one of the main use cases here, are probably okay with just (1). This 
> also enables exactly the same API surface/mechanisms that are currently 
> used for that protocol over TLS, except now it’s transparently using a 
> single transport connection underneath.

TLS is a good example: modern TLS does not take this completely opaque 
transport approach either; see SNI and ALPN / NPN.  The TLS designers also 
found it necessary to have a standardized out-of-band description of what 
the connection will carry.  In practice even h2 itself requires ALPN so 
the two sides can agree on "h2" as the protocol the opaque tunnel will 
carry.

Since ALPN is only available at the network connection level, but h2 can 
mux independent streams (each upgraded to its own protocol) inside that 
connection, I think you also have to replicate ALPN / ws-subprotocol 
style metadata at the stream layer, one way or another.
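
As a minimal sketch of the connection-level negotiation mentioned above, this is how a client could offer "h2" via ALPN using Python's standard ssl module. The function name make_h2_client_context is made up for illustration.

```python
import ssl


def make_h2_client_context() -> ssl.SSLContext:
    """Build a client TLS context that offers h2 (then http/1.1) via ALPN."""
    ctx = ssl.create_default_context()
    # The client lists what it can speak, in preference order; the server
    # picks one during the TLS handshake.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx
```

After a handshake, SSLSocket.selected_alpn_protocol() reports the server's choice. That choice is made once per TLS connection, so it cannot describe what each individual h2 stream carries.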

> (2) The path is present and can be useful in designating different 
> “endpoints” for different types of services. This is also in the 
> document, and aims to look very similar to how WebSocket handles this in 
> 8441. As Andy notes, this doesn’t quite provide a scenario where the 
> client offers something and the server picks it. If you have multiple 
> protocols and just need different “endpoints”, you can likely use this 
> in combination with (1) above without too much trouble.

You can rely on implied stuff like URL conventions as much as you want 
if you control both sides and just need to do it once, with no 
interoperability.  But for cases beyond that, you need to spell out 
normatively how it should work.

Isn't it simpler, in terms of defining it, to call this protocol metadata 
out into its own thing rather than overload the URL path?  In some 
implementations the protocol handler will be intimately integrated with 
the whole URL space, but in others it will not natively know what a URL 
is, and will definitely not know or care about at least the leftmost part 
of it, which the server above it has already "dereferenced".
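
To illustrate the difference, here is a rough sketch of the two lookups in Python. Both are assumptions for illustration only: the "subprotocol" query parameter is a private convention of the kind discussed further down this thread, and the header name "x-tunnel-protocol" is hypothetical and not defined by the draft or by 8441.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def protocol_from_path(path: str) -> Optional[str]:
    """Private convention: dig the protocol name out of the :path query."""
    values = parse_qs(urlsplit(path).query).get("subprotocol")
    return values[0] if values else None


def protocol_from_header(headers: Mapping[str, str]) -> Optional[str]:
    """Dedicated metadata: the handler never needs to parse a URL."""
    # "x-tunnel-protocol" is a hypothetical header name.
    return headers.get("x-tunnel-protocol")
```

With the first form, anything that inspects the request has to know each deployment's path convention; with the second, the name sits in one agreed place regardless of how the server lays out its URL space.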

> (3) Incorporate a header that contains a list of next protocols and lets 
> the responder choose. This can be very useful, but it’s not strongly 
> defined in the current version of the document. I’d be open to adding 
> it, and offhand it seems like it would look pretty much like WebSocket’s 
> version in 8441.
> 
> (4) A different mechanism that would be helpful here? Ideally, it would 
> be common to anyone and any protocol wanting to use the extended CONNECT 
> handshake.

ws has the concept of a degenerate case where the protocol isn't 
given... you get a "default protocol" for that vhost, whatever that means 
to the server (and it may mean something different to the client...).  I 
really recommend not copying that, and instead enforcing roll-your-own 
protocol naming along the lines of "com.apple.myproduct.myprotocol.version".
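
A minimal sketch of what that could look like on the server side, assuming ws-style negotiation (client offers a list, server picks one) with no default case. The function names, the com.example.* protocol names, and the use of 400 for refusal are all assumptions for illustration; the draft does not define any of them.

```python
from typing import Optional, Sequence, Tuple


def select_protocol(offered: Sequence[str],
                    supported: Sequence[str]) -> Optional[str]:
    """Return the server's most-preferred name that the client offered."""
    offered_set = set(offered)
    for name in supported:  # supported is in server preference order
        if name in offered_set:
            return name
    return None


def handle_upgrade(offered: Sequence[str],
                   supported: Sequence[str]) -> Tuple[int, Optional[str]]:
    """Refuse unnamed requests instead of falling back to a default."""
    if not offered:
        return (400, None)  # no protocol named: no "default protocol"
    chosen = select_protocol(offered, supported)
    if chosen is None:
        return (400, None)  # nothing in common
    return (200, chosen)    # echoed back to the client in the response
```

For example, a server that prefers v2 but still supports v1 would answer a client offering both with v2, and a client that names nothing is refused outright.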

> Operatively, the main point of the extended CONNECT part is to say 
> “here’s this great thing for WebSocket, let’s generalize it to the base 

I agree, but the deployment situation for 8441 is dismal.  And 8441 has 
the advantage that ws semantics are ultra-widely deployed on the client 
side.  So I also think, as mentioned, that some examples selling why you 
would deploy this would be helpful in there.  For example, nothing says 
why, for your use-case, taking the hit of 8441-encapsulated ws framing is 
out of the question.  If it isn't, you can just use 8441.

> of what other :protocol values will need”.
> In doing that, we also define a new :protocol that covers a number of 
> use cases, but most certainly not everything that the extended CONNECT 
> method could be useful for.
> Future :protocol values should be able to build on top of this, since it 
> defines a minimal set (i.e. without WebSocket specific items) of 
> functionality that is needed for generic use of that handshake.

Yes... I agitated a long while ago for this to directly be part of h2, 
or at least for something in the base spec that was enough to carry ws in 
a way compatible with the ws JS API.  So I agree it has been a missing 
piece... 8441 largely ticked the box in terms of getting the 
(considerable) mux advantages, but it has some deliberate inefficiencies 
and a ws focus, since it came from taking the temperature of what 
browser vendors were actually willing to discuss / implement, not what 
was ideal.

-Andy

> Thanks,
> Eric
> 
> 
> 
>> On Mar 22, 2019, at 10:24 PM, Andy Green <andy@warmcat.com> wrote:
>>
>>
>>
>> On March 23, 2019 1:32:53 AM GMT+08:00, Kari Hurtta 
>> <hurtta-ietf@elmme-mailer.org> wrote:
>>> Andy Green <andy@warmcat.com>: (Fri Mar 22 08:28:49 2019)
>>>>
>>>> What it creates is a single registered upgrade type "bytestream" that
>>>> is anonymous and has no description of what protocol or versioning or
>>>> subframing is inside it.
>>>>
>>>> From ws perspective, it's the same as a ws protocol that doesn't
>>>> provide subprotocol names... the vhost can only offer one of those
>>>> and if you made that mistake, this doesn't have the escape hatch of
>>>> subprotocol negotiation that ws has.  If different vendors feel that
>>>> different things should go in the anonymous transport, more than one
>>>> vendor's products couldn't talk to the same vhost.
>>>
>>> It is not per vhost, it is per URL which includes also path.
>>
>> From a ws perspective the subprotocols buy you something different 
>> than path.
>>
>> Ws also lets the upgrade use the path, since the path is there... and 
>> that's useful since the path may bind to url space on the vhost for 
>> purposes like basic auth requirement, which can also then apply to the 
>> http part of the ws upgrade cleanly.  In ws what's on the path is left 
>> wholly for a server + specific ws protocol to define or ignore.
>>
>> Subprotocol is a negotiation...the client lists what he can handle and 
>> the server picks the best one he can handle, once at upgrade time, and 
>> informs the client in the response. So you can deal with multiple 
>> versions in the field, rolling upgrades, and even eg, population of 
>> devices with different vendor protocols and quirks once.
>>
>> Subprotocol is standardized outside of some arbitrary vendor url 
>> schema.  So packet inspection can know what is supposed to be in the 
>> upgraded stream.
>>
>> Subprotocols allow the server to have one implementation which may be 
>> used by different vhosts, these may then be administratively enabled 
>> and disabled per-vhost at the server.
>>
>> Clients can use the negotiation to support different server 
>> capabilities in one implementation cleanly.
>>
>> There are other pressures on path to match applications and url space 
>> layout in the vhost, so it's not a free choice just for upgrades.
>>
>>> Same vhost can provide different protocol on different path.
>>>
>>> https://tools.ietf.org/html/rfc8441#section-4
>>>
>>> |   o  On requests that contain the :protocol pseudo-header field, the
>>> |      :scheme and :path pseudo-header fields of the target URI (see
>>> |      Section 5) MUST also be included.
>>>
>>> https://tools.ietf.org/html/draft-kinnear-httpbis-http2-transport-01#section-3.1
>>>
>>> |   Endpoints using this mechanism to establish byte stream or datagram
>>> |   tunnels over HTTP/2 streams follow the CONNECT handshake procedure
>>> |   defined in [RFC6455].  However, instead of supplying "websocket" for
>>> |   the :protocol psuedo-header field to indicate a WebSocket connection,
>>> |   they specify "bytestream" or "datagram" to indicate a byte stream or
>>> |   datagram connection, respectively.
>>> |
>>> |   The :scheme and :path psuedo-headers are required by [RFC6455].  The
>>> |   scheme of the target URI MUST be set to "https" for both byte stream
>>> |   and datagram tunnels.  The path is used in the same manner as for the
>>> |   WebSocket protocol, and MAY be set to "/" (an empty path component)
>>> |   if not desired for use.
>>>
>>> You are saying that path is not enough to identify desired endpoint ?
>>
>> No, you could put, eg, ?subprotocol=xxx on the path... it's not the 
>> same as ws style subprotocol negotiation but it will select a 
>> protocol.  You can put &quirks= &version= too.
>>
>> However...
>>
>> - that draft does not define conventions for doing that
>>
>> - the draft specifies only two opaque, abstract transport :protocol names
>>
>> Actually as it is if a vhost gets this request, he can't be sure of 
>> the client assumptions about what he'll read into the url and then no 
>> idea what's going to be coming from the opaque protocol name.
>> Anything that thought it might inspect the packet must learn 
>> everybody's random path conventions...
>>
>> They can get around it by making the :protocol specify what it is in 
>> there, but that must go in the registry each time... it'd work for 
>> 'ethernet-encapsulation' or so though.
>>
>> Or they can normatively tell how to specify the subprotocol another way.
>>
>>> Is subprotocols really used on Websocket ?
>>
>> Yes.
>>
>>> I suspect that path is changed instead.
>>
>> There are lots of ways to do it.  From what I have seen from many 
>> people using ws the first time, they are happy to ignore subprotocol 
>> and path and get on with their implementation.  But then they want to 
>> support more than one protocol on the same vhost.  Ws upgrades have no 
>> normative binding to server url space, they exist outside it unless 
>> the server implementation enforces something about path... they are 
>> normatively bound by subprotocol name.
>>
>> -Andy
>>
>>> / Kari Hurtta
>
Received on Saturday, 23 March 2019 01:18:17 UTC