Re: WebSocket2 from Takeshi Yoshino on 2016-10-04 (ietf-http-wg@w3.org from October to December 2016)

From: Takeshi Yoshino <tyoshino@google.com>
Date: Tue, 4 Oct 2016 20:12:25 +0900
To: Van Catha <vans554@gmail.com>
Cc: Kari Hurtta <hurtta-ietf@elmme-mailer.org>, Ilari Liusvaara <ilariliusvaara@welho.com>, HTTP working group mailing list <ietf-http-wg@w3.org>
Message-ID: <CAH9hSJaMsKaoTK+kr2X_GP_T7=jcDQtFLSusYrV+nDWCadcyxg@mail.gmail.com>

Hi Van, Kari,

On Tue, Oct 4, 2016 at 1:39 AM, Van Catha <vans554@gmail.com> wrote:

<snip>


> Kari Hurtta
>
> > Well, I think the following would work and avoid SETTINGS:
> >
>

Yutaka's proposal required the SETTINGS_WEBSOCKET_CAPABLE SETTINGS
parameter because we planned to use special framing (HTTP2 framing level
features not used for HTTP1.1/h2 layering). We wanted to allow endpoints
to make sure that the h2 intermediaries between them are able to forward
WS/HTTP2's special framing correctly.

So, for the current proposal by Van where only the DATA frame is used as
well as HTTP1.1/h2 layering, it's unnecessary.


> > -> :method ws2
> > -> :scheme wss
> > -> :authority foo.example
> > -> :path /bar
> > -> <optional extra parameters, e.g. compression support>
> > <- :status 200
> > <- sec-ws2-ack 1
> > <- <optional negotiated extras>
>
>
:scheme is needed, yes.

Not sure about the need for "sec-ws2-ack". For WebSocket/TCP, we employed
the Sec-WebSocket-Key/Accept challenge/response in order to prevent the
WebSocket protocol from being abused for cross-protocol attacks, e.g. SMTP.
If all the intermediaries and servers correctly inspect the :scheme
header and don't get confused, "sec-ws2-ack" is unnecessary. Regarding
ws/h2-RFC6455 bridging, a correctly implemented intermediary facing a ws/h2
capable node and a ws/h2 non-capable node would just perform the RFC 6455
handshake as Kari suggested. So, no problem.
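
For reference, the Sec-WebSocket-Key/Accept challenge/response from RFC 6455
mentioned above can be sketched as follows. This is only an illustration of
the existing RFC 6455 computation, not part of the WS2 proposal; the function
name is made up for this example.

```python
import base64
import hashlib

# GUID fixed by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def sec_websocket_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a Sec-WebSocket-Key.

    The server appends the fixed GUID to the client's key, hashes the
    result with SHA-1 and base64-encodes the digest.
    """
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455.
print(sec_websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

A peer that does not understand the WebSocket handshake is very unlikely to
produce the right value, which is what makes it usable as a capability check.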

We can also consider taking care of bad proxies that don't check :scheme and
just convert data between h2 and HTTP/1.1/TCP. Then, including a special
header for capability validation like this "sec-ws2-ack" might be a good
choice. I understand that people feel that the Sec-WebSocket-Key/Accept
mechanism became too complex.

But "sec-ws2-ack" doesn't work for the signaling that Yutaka wanted to
realize with SETTINGS_WEBSOCKET_CAPABLE.

<snip>

About Proxies:
> ~
> I assumed the concern was with forward / reverse proxies like NGINX
> forwarding http/2 to http.
>
> Afaik HTTP/2 browser only allow using TLS, so a HTTP transparent proxy
> will not be able to "proxy" anything unless the reverse proxy serves a MITM
> certificate.  I do not think this is a common enough use case.
> ~
>

Are you suggesting that WS2/h2 should be indistinguishable from
HTTP1.1/h2 to intermediaries and servers without knowledge of the
websocket2- headers or application level knowledge? Or just suggesting that
we can reduce complexity by ignoring all the v2-v1 bridging at proxies?

Do you have any strong opinion about how WS2 should be exposed on the web
platform? The WebSocket API, some brand new API? Or are you only interested
in defining a standardized messaging protocol to be layered over HTTP2 and
QUIC to utilize their power?


> About Settings frame:
> ~
> If the idea behind this is to make WebSocket2 compatible over HTTP/1.1
> then that is part of the reasons why I advocate to avoid SETTINGS frame.
>
> If WebSocket2 can be negotiated and used in such a way to avoid locking to
> the transport layer, it can easily be used in HTTP/1.1 as well.
>

This comment also suggests to me that you want a standardized messaging
protocol over arbitrary bidirectional byte streams.

<snip>

> Also HTTP/1.1 would have no chance at getting WebSocket2. If such a need
> were to ever exist, currently I am not considering it at any priority level.
> ~
>

Got it.


> About Payload Length optimizations:
> ~
> There is a particular use case where if you sent lots of small messages
> that end up in a single HTTP/2 DATA frame or a single system level packet,
> there is an overhead of 3 extra bytes and 6 for LZ4 compressed payloads.
>

How does this 3 break down?

Reading the following revised header proposal, it looks like the minimum is
2 octets.
- 2 LZ4 decompressed payload length field's length bits
- 2 compression bits
- 2 payload length field's length bits
- 2 type bits
- 1 octet at minimum for text and binary message

and 2 + 4 (decompressed byte size) = 6 for LZ4?
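
To make the arithmetic above concrete, here is a rough sketch of how the
header size follows from the flags octet. It assumes the four 2-bit fields
are packed most significant bits first in the order listed above, and that
the decompressed-length field uses the same length codes as the payload
length field (0 for none, 1 for 1 byte, 2 for 2 bytes, 3 for 4 bytes). Both
are my reading of the quoted proposal; the names are made up for this
example.

```python
# Length codes from the quoted proposal: code -> size of the length field.
LENGTH_CODE_TO_BYTES = {0: 0, 1: 1, 2: 2, 3: 4}


def decode_flags(flags: int) -> dict:
    """Split the flags octet into its four 2-bit fields (MSB first)."""
    return {
        "decompressed_len_code": (flags >> 6) & 0b11,
        "compression": (flags >> 4) & 0b11,
        "payload_len_code": (flags >> 2) & 0b11,
        "type": flags & 0b11,
    }


def header_size(flags: int) -> int:
    """Flags octet plus the two optional length fields, in bytes."""
    f = decode_flags(flags)
    return (
        1
        + LENGTH_CODE_TO_BYTES[f["payload_len_code"]]
        + LENGTH_CODE_TO_BYTES[f["decompressed_len_code"]]
    )


# Uncompressed message with a 1-byte payload length.
print(header_size(0b00000100))
# LZ4 message with a 1-byte payload length and a 4-byte decompressed size.
print(header_size(0b11010100))
```

Under these assumptions the two cases come out to 2 and 6 octets, which is
why I'm asking where the 3 comes from.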


> Using flags to specify how many bits the payload size is we can remove
> some flexibility but also decrease final system level packet size:
>
> RSV now becomes 6 bits.
> TYPE now becomes 2 bits. We have room for 1 more frame type.
>
> Last 2 RSV bits are Payload Length. 0 for no length (error frame), 1 for 1
> byte, 2 for 2 byte, 3 for 4 byte.
>

To me, a 4 octet fixed-length payload length header looks competitive for
its simplicity, but this variable-length scheme doesn't look competitive
when compared to the RFC 6455 length encoding. I understand the RFC 6455
encoding ended up looking weird, but it has good representation power and
efficiency, and there's existing code developed for implementing RFC 6455.
What do you think about just using RFC 6455's length format?
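
For comparison, the RFC 6455 length encoding can be sketched like this. The
MASK bit, which shares the first octet with the 7-bit length in RFC 6455, is
left as 0 here, and the function name is made up for this example.

```python
import struct


def encode_rfc6455_length(n: int) -> bytes:
    """Encode a payload length the way RFC 6455 does (MASK bit left 0).

    0..125         -> 1 octet holding the length itself
    126..65535     -> octet 126 followed by a 16-bit big-endian length
    65536..2^63-1  -> octet 127 followed by a 64-bit big-endian length
    """
    if n < 0 or n >= 1 << 63:
        raise ValueError("length out of range for RFC 6455")
    if n <= 125:
        return bytes([n])
    if n <= 0xFFFF:
        return bytes([126]) + struct.pack("!H", n)
    return bytes([127]) + struct.pack("!Q", n)


for size in (5, 125, 126, 65535, 65536):
    print(size, len(encode_rfc6455_length(size)))
```

So small messages cost a single length octet, and anything up to 64 KiB
costs three.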


> Second last 2 RSV bits are for compression.   0 is no compression, 1 is
> lz4, 2 is deflate.  3 is reserved.
>

Unless we expect a need to switch between two or more compression
algorithms within a single connection, we don't need to give a dedicated
bit to each algorithm.

The first 2 RSV bits could also be defined as a generic compression
parameter field.


> First 2 RSV bits are decompressed payload size in the case of LZ4 or
> reserved for future use by compression.
>
> This also has the benefit of not requiring to keep a serverside state for
> LZ4 compressed payloads; for anyone that would have a sane usecase for that
> :)
>
>
> Before a small LZ4 Text frame looked like this:
> b00000000 0x00 0x00 0x00 0x09 0x00 0x00 0x00 0x05 0xAE 0xB4
>
> Now it can look like:
> b01010100 0x09 0x05 0xAE 0xB4
>
> From 11 bytes down to 5.
>
>
> The reason I think a 32bit UINT max is a good top value is that gives you
> about 4G~ of a maximum payload.  If you have a payload greater than 4G
> there is no sane way a client API can process that. Even 4G is way too much
> ~
>

I understand your rationale, though when dealing with very large data the
overhead of the length header is relatively small and doesn't matter much.
Received on Tuesday, 4 October 2016 11:13:15 UTC