Re: HTTP/2 and Websockets from Yutaka Hirano on 2014-10-01 (ietf-http-wg@w3.org from October to December 2014)

From: Yutaka Hirano <yhirano@google.com>
Date: Wed, 1 Oct 2014 16:53:48 +0900
To: Robert Collins <robertc@robertcollins.net>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABihn6FwK7uZpg6v7u9byTw6xNsTjFsWfv_QyPMReTtumaXJWA@mail.gmail.com>
Thank you for the comments!

If you find small problems (like 4.2 and 4.2.1), creating issues on GitHub
(https://github.com/yutakahirano/ws-over-http2) also works.


> I don't understand: DATA is subject to flow control, but I don't see
> any prose about buffering (or not buffering) of DATA frames.

As buffering is not explicitly prohibited, I think intermediaries may buffer
data. From the WebSocket point of view, we want intermediaries to flush the
buffered data at the end of each message.
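
To make "flush at the end of each message" concrete, here is a minimal Python
sketch of an intermediary-side buffer that holds fragments and releases them
once the final fragment of a message arrives. The class name, the feed()
method and the fin flag (modelled on the FIN bit of RFC 6455 frames) are
illustrative assumptions, not something the draft defines.

```python
class MessageBoundaryBuffer:
    """Hypothetical buffer: accumulates fragments, flushes on message end."""

    def __init__(self):
        self._pending = []

    def feed(self, payload: bytes, fin: bool) -> list:
        """Buffer one fragment and return the fragments to forward now.

        Returns an empty list while the message is incomplete, and all
        buffered fragments (in arrival order) once fin is True.
        """
        self._pending.append(payload)
        if not fin:
            return []
        flushed, self._pending = self._pending, []
        return flushed


buf = MessageBoundaryBuffer()
first = buf.feed(b"Hel", fin=False)  # held back, message not finished
second = buf.feed(b"lo", fin=True)   # message complete, everything is flushed
```

An intermediary is still free to forward fragments earlier; the only point of
the sketch is that nothing stays buffered past a message boundary.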


> Unless we
> define a new frame type and explicitly prohibit buffering (but we need
> to keep it flow controlled), I don't see what would be different. And
> more importantly, intermediaries might still choose to buffer a new ws
> frame type anyway: we know that what the spec requires and what people
> do will differ :).

As we send SETTINGS_WEBSOCKET_CAPABLE to notify that the HTTP/2 connection
can be used for WebSocket and we use a new frame type, I hope
intermediaries can deal with it. IIUC an HTTP/2 extension can create a new
frame type, but cannot modify the meaning of existing frame types (e.g.
DATA). Even if it were allowed, I would create a new frame type so as not to
confuse implementers and [non-conforming] intermediaries.

> All the implementor discussion I've seen during the
> HTTP/2 discussions has focused on how intermediaries want to be
> scalable: and buffering is anti-scaling. So - is it a pragmatic
> concern, or do we expect DATA stream buffering to take place [outside
> of protocol gateways converting to HTTP/1.1 where non upload can
> require buffering - and note that such a gateway can't carry ws anyway
> unless it's aware of it, and if it's aware of it, it can make sure it
> does not buffer].
> >>
> >> Lastly, it seems (to me) that it would be desirable to allow
> >> PUSH_PROMISE setup of websockets connections
> >
> > Can you tell me why it is desirable?
> Same reason for PUSH_PROMISE of any stream, to avoid setup latency. If
> the server is pushing down some javascript libraries that will want a
> ws stream back to the same server, it could setup the stream in-line.

Ah, I see, but I don't have the answer.


> Ok, so my questions...
> Firstly, I found the 'frame is overloaded, only specify when its
> unambiguous' really hard for reading. I'd like to propose that we
> fully qualify the frame type everywhere. 'h2 SETTINGS frame', 'ws
> RST_STREAM frame' etc: doing otherwise requires a lot more mental
> state, at least for me :).
>
> 3.2 seems like an antifeature: ALPN is for defining the base protocol
> spoken; if you define a new ALPN name for things that speak websockets
> over HTTP/2, but it still has all the same framing and flow control
> etc, I think it just adds complexity.
> There are two alternatives here:
>  - ws[s] over TCP, RFC 6455, which using ALPN to select directly could
> make sense [though nothing out there does it AFAIK?]
>  - ws[s] over HTTP/2, which should be using ALPN to select *HTTP/2*,
> since that's the protocol we need.
> I say this because if adding ws to HTTP/2 makes a new ALPN, then when
> we add a third thing (nt) we'll get:
> h2
> h2ws
> h2nt
> h2wsnt
> And so on - it's problematic :). And SETTING is sufficient to let us know.

The problem is, we have the native WebSocket (h1ws). There are four
possibilities:
 - The server prefers h2 over h1 and h2ws over h1ws
 - The server prefers h2 over h1 and h1ws over h2ws
 - The server prefers h1 over h2 and h2ws over h1ws
 - The server prefers h1 over h2 and h1ws over h2ws
The ALPN identifiers are required to handle this situation without
introducing an extra RTT.

If nt has no h1nt counterpart, we don't have to introduce an h2nt ALPN
protocol: just sending a SETTINGS frame and starting a handshake is enough.
Even if nt does have h1nt, we don't have to introduce h2wsnt. h2xx is
needed only when the first stream in the h2 connection is intended to use
the xx protocol. We can't have a "wsnt" stream.
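
A rough Python sketch of how the preference orders above can be resolved
inside the TLS handshake: the server walks its own preference list and picks
the first protocol the client offered. The tokens "h2", "h1", "h2ws" and
"h1ws" are only the shorthand used in this thread, not registered ALPN
identifiers, and select_alpn is a hypothetical helper. The sketch also assumes
that a client whose first request is a WebSocket offers only the two ws
tokens.

```python
def select_alpn(server_preference, client_offer):
    """Return the server's most preferred protocol that the client offered.

    Returns None when the two lists have nothing in common.
    """
    offered = set(client_offer)
    for protocol in server_preference:
        if protocol in offered:
            return protocol
    return None


# A server that prefers h2 over h1 for plain HTTP, but h1ws over h2ws.
server_preference = ["h2", "h1ws", "h2ws", "h1"]

# Assumption: the client's first request decides which tokens it offers.
ws_choice = select_alpn(server_preference, ["h2ws", "h1ws"])
http_choice = select_alpn(server_preference, ["h2", "h1"])
```

Here ws_choice is "h1ws" and http_choice is "h2", and both are decided in the
same handshake, so no extra round trip is spent on discovering the preference.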


> Consider this: an h2 capable browser with ws over h2 support is asked
> to connect to ws.example.com with ws. If it:
>  - tries for h2
>  - examines its peer setting for WEBSOCKET_CAPABLE
>  - if present tries a HEADERS frame
>  - if absent falls back to RFC6455
> Then that's pretty sane IMO. If we get this deployed, it will work most
> of the time, and when it doesn't, the fallback is there.

That depends on the question: how common is it for a server to prefer h2 over
h1 and h1ws over h2ws?
I thought it was not so uncommon, so I wrote the negotiation using ALPN. If
we reach consensus that it is very rare, we can use a simpler solution.
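
For comparison, the four-step client flow quoted above could be sketched in
Python as follows. The Transport enum, choose_transport and the
dictionary-of-settings shape are assumptions made for illustration; in
particular, the draft's numeric identifier for SETTINGS_WEBSOCKET_CAPABLE is
not reproduced here, so the setting is looked up by name.

```python
from enum import Enum


class Transport(Enum):
    WS_OVER_H2 = "ws-over-h2"  # open the WebSocket with an h2 HEADERS frame
    RFC6455 = "rfc6455"        # fall back to the native WebSocket handshake


def choose_transport(negotiated_protocol, peer_settings):
    """Pick the WebSocket transport after the connection attempt.

    negotiated_protocol: the protocol the connection ended up speaking.
    peer_settings: settings advertised by the peer, keyed by name
                   (hypothetical representation).
    """
    if negotiated_protocol == "h2" and peer_settings.get("SETTINGS_WEBSOCKET_CAPABLE"):
        return Transport.WS_OVER_H2
    return Transport.RFC6455


capable = choose_transport("h2", {"SETTINGS_WEBSOCKET_CAPABLE": 1})
baseline = choose_transport("h2", {})
legacy = choose_transport("http/1.1", {})
```

Note that the fallback branch can only be taken after the peer's SETTINGS
frame has been seen, which is where the additional latency comes from when a
server prefers h1ws.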


>
> 3.3
> This section seems confused about the semantics of SETTING - see my
> new post about clearing that up (whether I need the clearing up, or
> the spec does :)).

Hmm, I thought that when one endpoint sends a SETTINGS frame, each
intermediary must relay the frame if (and only if) it understands the value.
Is that right?


> Anyhow, settings is about advertising, not about negotiating - you
> can't negotiate a path through an intermediary graph using it, because
> there is no way for the intermediary to signal upstream that just one
> client supports it and another doesn't. So the intermediary needs to
> signal its support unconditionally. Likewise, the server can't use the
> peer's WEBSOCKET_CAPABLE setting to replace the Sec-headers, because
> a) a websocket capable browser might still have bad js executing and
> trying to attack things, and b) an intermediary may be connecting
> multiple different clients to the same server.

I didn't intend to replace Sec-headers with WEBSOCKET_CAPABLE. The :scheme
header replaces Sec-headers.


>
> Here is an attempt at new prose:
> 3.3.  WebSocket over HTTP/2 capability
>    Servers and intermediaries that can process WebSocket requests over
> HTTP/2 MUST advertise the SETTINGS_WEBSOCKET_CAPABLE setting in their
> SETTINGS frame.
>    Websocket over HTTP/2 MUST NOT be attempted to peers that have not
> set SETTINGS_WEBSOCKET_CAPABLE
>    Clients do not need to advertise this capability as their making a
> valid websockets request signals they are capable.
> 3.4 secure connection
>    If the h2 connection is not secure, wss connections MUST NOT be
> initiated over it.
>    If the h2 connection is secure, both ws and wss connections MAY be
> initiated over it.
> 3.5 Intermediaries
>    Intermediaries that have advertised SETTINGS_WEBSOCKET_CAPABLE may
> receive websocket requests which are for origins that do not advertise
> SETTINGS_WEBSOCKET_CAPABLE (or may not even support HTTP/2). For
> example, it is nearly certain that a forward proxy that speaks HTTP/2
> will receive requests for origins that have not yet upgraded to
> HTTP/2.
>   In such a situation Intermediaries MUST either: initiate a RFC6455
> websocket connection with the origin, and translate frames between the
> two sides in conformance with both RFC6455 and this RFC. Or they may
> return 501 (Not Implemented) to indicate that they cannot forward the
> request.
>   To illustrate: consider a Client(C), a websocket aware Proxy(P) and
> a Server(S).
>   P will include SETTINGS_WEBSOCKET_CAPABLE in its SETTINGS frame from P
> to C.
>   C sees that P is capable, and initiates a websocket connection over
> HTTP/2 to Origin S.
>   P initiates an HTTP/2 connection with S
>   S is a baseline HTTP/2 server and does not include
> SETTINGS_WEBSOCKET_CAPABLE in its SETTINGS frame from S to P.
>   At this point P can either error with 501, signalling that this
> particular request cannot be carried, or it can fall back to RFC6455
> on behalf of the client.
> 4.1
> You say we can skip "Upgrade, Connection, Sec-WebSocket-Key, and
> Sec-WebSocket-Version", because we don't need to do verification. I
> think this prose is missing an explanation of why we don't need to do
> verification.
> There are two failure modes RFC6455 talks about:
> A - connections to existing SMTP etc servers
> B - submitting data from FORM posts to ws servers
> The former is guarded against by looking for a ws specific handshake
> from the server.
> The latter is guarded against by looking for a ws specific header from
> the client *which Javascript APIs do not permit javascript code to
> set*.
> Your draft defines ws over existing HTTP/2 connections and also new
> connections to HTTP/2 endpoints. If we limit ourselves to just ws[s]
> over existing HTTP/2 connections, then maybe we can say:
> A) is protected against by RFC6455, and any new connection made for ws
> should follow that spec with one exception: If the server negotiates
> as a valid HTTP/2 endpoint, then the SETTINGS_WEBSOCKET_CAPABLE
> setting from the server is inspected to determine if ws over HTTP/2
> can be used - and that supplants the server side calculation that was
> used to prove websocket readiness in RFC6455.
> B)  the presence of Sec-Websocket-Key and Sec-WebSocket-Version is
> used to ensure that a WS endpoint doesn't get form data posted to it.
> I see no replacement for that in your draft: we need to keep it,
> because it's protecting against javascript programming models. (Unless
> I've missed something?)

As we have :scheme and HEADERS is decoupled from the body, I think we don't
need Sec-headers. They are needed because non-conforming endpoints
mistakenly recognize HTTP-like structures generated by script, but that
will not happen with h2.
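
For context, the server-side calculation that Sec-WebSocket-Key feeds into in
RFC 6455 is small. A Python sketch using the GUID and the sample nonce from
RFC 6455; the function name is illustrative.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def sec_websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample nonce from RFC 6455.
accept = sec_websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")
```

A server that does not understand WebSocket cannot produce this value by
accident, which is the property RFC 6455 relies on.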


> 4.2
> '"101" or "101 Switching Protocols"' - AIUI in HTTP/2 the reason text
> is gone. The status pseudo header is numeric only.

Thanks, I will update.


> 4.2.1 - the ALTSVC draft suggests doing this gracefully - e.g.
> opening up the new connection then dropping the old one. We should
> include a reason for not following that advice.

I will update.


>
> 5.
> There's nothing specified here - neither what frame types we need to
> add, nor discussion on the [in]applicability of HTTP/2 DATA. I'd like
> to try to use HTTP/2 data I think - the discussion about frame type
> compatibility makes me think that we'll be more compatible with
> RFC6455 if we just tunnel over the h2 DATA frame: remember that
> RFC6455 targeted TCP as a transport, and a series of h2 DATA frames
> is most analogous to that. In particular, if we use dedicated control
> frames, we could hit out of order behaviour with control frames
> forward before DATA frames, because DATA frames are flow controlled:
> it will be more complex to specify, and I don't see a benefit.

Ah, there were some proposals there. See
http://tools.ietf.org/html/draft-hirano-httpbis-websocket-over-http2-00 and
https://github.com/yutakahirano/ws-over-http2/blob/master/ws-over-http2-message-mapping.md
for example. As the HTTP/2 spec changed, I deleted them.
Everyone is excited about the data framing, but what to represent is much
more important than how to represent it, I think.


> 7
> I'm not aware of any equivalent to the masking in HTTP/2, and there is
> no discussion of BEAST in the HTTP/2 spec: if we're delegating to
> HTTP/2 to solve those issues, I think we need to talk about that now
> :)

Yeah, I am delegating exactly that to HTTP/2. I'm certain that if we are in
danger, HTTP/2 is, too. I'm not certain whether we are in danger.
Most WebSocket-related security bugs arose when [non-conforming]
intermediaries / endpoints that don't understand WebSocket were made to
interpret the data as HTTP. So I don't want to give them any chance of
confusion. That is another reason why I don't like using h2 DATA frames.
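
As a reminder of what "masking" refers to here: in RFC 6455, client-to-server
payloads are XORed with a 4-byte masking key so that a script cannot choose
the exact bytes that appear on the wire. A minimal Python sketch, using the
masked "Hello" example from RFC 6455; apply_mask is an illustrative name.

```python
def apply_mask(payload: bytes, masking_key: bytes) -> bytes:
    """XOR each payload byte with the masking key, cycling over its 4 bytes.

    The same function both masks and unmasks, since XOR is its own inverse.
    """
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


key = bytes([0x37, 0xFA, 0x21, 0x3D])
masked = apply_mask(b"Hello", key)
unmasked = apply_mask(masked, key)
```

HTTP/2 itself defines no equivalent step, which is the gap the question above
is pointing at.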


On Tue, Sep 30, 2014 at 6:46 AM, Robert Collins <robertc@robertcollins.net>
wrote:

> On 29 September 2014 18:26, Yutaka Hirano <yhirano@google.com> wrote:
> > Hi,
> >
> > I am proposing a spec draft:
> > http://tools.ietf.org/html/draft-hirano-httpbis-websocket-over-http2-01 .
> > Since many modifications were made to the HTTP/2 spec, some description
> > may be obsolete. Please let me know if you find any flaw.
>
> Cool - I have some questions, below.
>
> >> Secondly, we need to define websocket frame mappings. The least work,
> >> and I suspect the easiest for implementors, would be to put all the
> >> websocket frames into HTTP/2's data frames, without worrying about
> >> frame alignment: just treat the fully open stream as a series of bytes
> >> in the same way TCP is treated by the websocket spec.
> >> I suspect however that a better result would be achieved by defining
> >> custom HTTP/2 frames, since websockets already have the basic support
> >> for multiplexing (large application writes can be fragmented into
> >> smaller frames as needed), we shouldn't run into HOL blocking issues.
> >
> > Yeah, we can't simply use DATA frames because intermediaries may buffer
> > data. The HTTP/2 spec had "MSG_DONE" once and I wanted to use it, but it was
> > removed from the spec. Currently I think introducing a new frame type is the
> > right way.
>
> I don't understand: DATA is subject to flow control, but I don't see
> any prose about buffering (or not buffering) of DATA frames. Unless we
> define a new frame type and explicitly prohibit buffering (but we need
> to keep it flow controlled), I don't see what would be different. And
> more importantly, intermediaries might still choose to buffer a new ws
> frame type anyway: we know that what the spec requires and what people
> do will differ :). All the implementor discussion I've seen during the
> HTTP/2 discussions has focused on how intermediaries want to be
> scalable: and buffering is anti-scaling. So - is it a pragmatic
> concern, or do we expect DATA stream buffering to take place [outside
> of protocol gateways converting to HTTP/1.1 where non upload can
> require buffering - and note that such a gateway can't carry ws anyway
> unless it's aware of it, and if it's aware of it, it can make sure it
> does not buffer].
>
> >>
> >> Lastly, it seems (to me) that it would be desirable to allow
> >> PUSH_PROMISE setup of websockets connections
> >
> > Can you tell me why it is desirable?
>
> Same reason for PUSH_PROMISE of any stream, to avoid setup latency. If
> the server is pushing down some javascript libraries that will want a
> ws stream back to the same server, it could setup the stream in-line.
>
> Ok, so my questions...
>
> Firstly, I found the 'frame is overloaded, only specify when its
> unambiguous' really hard for reading. I'd like to propose that we
> fully qualify the frame type everywhere. 'h2 SETTINGS frame', 'ws
> RST_STREAM frame' etc: doing otherwise requires a lot more mental
> state, at least for me :).
>
>
> 3.2 seems like an antifeature: ALPN is for defining the base protocol
> spoken; if you define a new ALPN name for things that speak websockets
> over HTTP/2, but it still has all the same framing and flow control
> etc, I think it just adds complexity.
>
> There are two alternatives here:
>  - ws[s] over TCP, RFC 6455, which using ALPN to select directly could
> make sense [though nothing out there does it AFAIK?]
>  - ws[s] over HTTP/2, which should be using ALPN to select *HTTP/2*,
> since that's the protocol we need.
>
> I say this because if adding ws to HTTP/2 makes a new ALPN, then when
> we add a third thing (nt) we'll get:
> h2
> h2ws
> h2nt
> h2wsnt
>
> And so on - it's problematic :). And SETTING is sufficient to let us know.
>
> Consider this: an h2 capable browser with ws over h2 support is asked
> to connect to ws.example.com with ws. If it:
>  - tries for h2
>  - examines its peer setting for WEBSOCKET_CAPABLE
>  - if present tries a HEADERS frame
>  - if absent falls back to RFC6455
>
> Then that's pretty sane IMO. If we get this deployed, it will work most
> of the time, and when it doesn't, the fallback is there.
>
> 3.3
> This section seems confused about the semantics of SETTING - see my
> new post about clearing that up (whether I need the clearing up, or
> the spec does :)).
>
> Anyhow, settings is about advertising, not about negotiating - you
> can't negotiate a path through an intermediary graph using it, because
> there is no way for the intermediary to signal upstream that just one
> client supports it and another doesn't. So the intermediary needs to
> signal its support unconditionally. Likewise, the server can't use the
> peer's WEBSOCKET_CAPABLE setting to replace the Sec-headers, because
> a) a websocket capable browser might still have bad js executing and
> trying to attack things, and b) an intermediary may be connecting
> multiple different clients to the same server.
>
> Here is an attempt at new prose:
>
> 3.3.  WebSocket over HTTP/2 capability
>
>    Servers and intermediaries that can process WebSocket requests over
> HTTP/2 MUST advertise the SETTINGS_WEBSOCKET_CAPABLE setting in their
> SETTINGS frame.
>
>    Websocket over HTTP/2 MUST NOT be attempted to peers that have not
> set SETTINGS_WEBSOCKET_CAPABLE
>
>    Clients do not need to advertise this capability as their making a
> valid websockets request signals they are capable.
>
> 3.4 secure connection
>
>    If the h2 connection is not secure, wss connections MUST NOT be
> initiated over it.
>    If the h2 connection is secure, both ws and wss connections MAY be
> initiated over it.
>
> 3.5 Intermediaries
>
>    Intermediaries that have advertised SETTINGS_WEBSOCKET_CAPABLE may
> receive websocket requests which are for origins that do not advertise
> SETTINGS_WEBSOCKET_CAPABLE (or may not even support HTTP/2). For
> example, it is nearly certain that a forward proxy that speaks HTTP/2
> will receive requests for origins that have not yet upgraded to
> HTTP/2.
>
>   In such a situation Intermediaries MUST either: initiate a RFC6455
> websocket connection with the origin, and translate frames between the
> two sides in conformance with both RFC6455 and this RFC. Or they may
> return 501 (Not Implemented) to indicate that they cannot forward the
> request.
>
>   To illustrate: consider a Client(C), a websocket aware Proxy(P) and
> a Server(S).
>   P will include SETTINGS_WEBSOCKET_CAPABLE in its SETTINGS frame from P
> to C.
>   C sees that P is capable, and initiates a websocket connection over
> HTTP/2 to Origin S.
>   P initiates an HTTP/2 connection with S
>   S is a baseline HTTP/2 server and does not include
> SETTINGS_WEBSOCKET_CAPABLE in its SETTINGS frame from S to P.
>   At this point P can either error with 501, signalling that this
> particular request cannot be carried, or it can fall back to RFC6455
> on behalf of the client.
>
> 4.1
> You say we can skip "Upgrade, Connection, Sec-WebSocket-Key, and
> Sec-WebSocket-Version", because we don't need to do verification. I
> think this prose is missing an explanation of why we don't need to do
> verification.
>
> There are two failure modes RFC6455 talks about:
> A - connections to existing SMTP etc servers
> B - submitting data from FORM posts to ws servers
>
> The former is guarded against by looking for a ws specific handshake
> from the server.
> The latter is guarded against by looking for a ws specific header from
> the client *which Javascript APIs do not permit javascript code to
> set*.
>
> Your draft defines ws over existing HTTP/2 connections and also new
> connections to HTTP/2 endpoints. If we limit ourselves to just ws[s]
> over existing HTTP/2 connections, then maybe we can say:
> A) is protected against by RFC6455, and any new connection made for ws
> should follow that spec with one exception: If the server negotiates
> as a valid HTTP/2 endpoint, then the SETTINGS_WEBSOCKET_CAPABLE
> setting from the server is inspected to determine if ws over HTTP/2
> can be used - and that supplants the server side calculation that was
> used to prove websocket readiness in RFC6455.
>
> B)  the presence of Sec-Websocket-Key and Sec-WebSocket-Version is
> used to ensure that a WS endpoint doesn't get form data posted to it.
> I see no replacement for that in your draft: we need to keep it,
> because it's protecting against javascript programming models. (Unless
> I've missed something?)
>
> 4.2
> '"101" or "101 Switching Protocols"' - AIUI in HTTP/2 the reason text
> is gone. The status pseudo header is numeric only.
>
> 4.2.1 - the ALTSVC draft suggests doing this gracefully - e.g.
> opening up the new connection then dropping the old one. We should
> include a reason for not following that advice.
>
> 5.
> There's nothing specified here - neither what frame types we need to
> add, nor discussion on the [in]applicability of HTTP/2 DATA. I'd like
> to try to use HTTP/2 data I think - the discussion about frame type
> compatibility makes me think that we'll be more compatible with
> RFC6455 if we just tunnel over the h2 DATA frame: remember that
> RFC6455 targeted TCP as a transport, and a series of h2 DATA frames
> is most analogous to that. In particular, if we use dedicated control
> frames, we could hit out of order behaviour with control frames
> forward before DATA frames, because DATA frames are flow controlled:
> it will be more complex to specify, and I don't see a benefit.
>
> 7
> I'm not aware of any equivalent to the masking in HTTP/2, and there is
> no discussion of BEAST in the HTTP/2 spec: if we're delegating to
> HTTP/2 to solve those issues, I think we need to talk about that now
> :)
>
>
> -Rob
>
> --
> Robert Collins <rbtcollins@hp.com>
> Distinguished Technologist
> HP Converged Cloud
>
Received on Wednesday, 1 October 2014 07:54:16 UTC