HTTP/2 GREASE experiment results

Hi,

Chrome users are experiencing GREASE intolerance on https://app.slack.com,
see https://crbug.com/1127060 for details.  If any developers from Slack
read this, please respond on the ticket (or to me privately).

(I already sent the rest of this e-mail three days ago as a response to a
different thread, but mistakenly from my other address, so it didn't go
through.  Also it possibly deserves its own thread.)

Here is a summary of recent HTTP/2 GREASE experiments in Chrome.  The
findings can inform deployment of new HTTP/2 extensions.  Note that I have
not performed any server-side experiments, so I have no information about
clients' intolerances in the wild.  I am aware of one bug where very old
OkHttp client versions do not tolerate unknown setting identifiers, but I
am not sure how widely those versions are still used.

I'm only naming implementations for reference, so that people can
check for old versions and gauge the state of the ecosystem.  By no
means do I mean to shame anybody.  In fact I introduced a number of
bugs in Chrome's GREASE implementation, both correctness issues and
crashers, that substantially slowed down my experiments.

Last fall it was discovered that ATS (Apache Traffic Server) did not
tolerate unknown setting identifiers.  This bug had already been fixed
by the time Chrome users ran into it.  The fix got backported to 8.0.6
and by now it is widely deployed.  50% of Chrome Dev and Canary has
been sending reserved setting identifiers again since May.
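
For reference, here is a minimal Python sketch of what tolerating an
unknown setting identifier means on the receiving side.  RFC 7540
section 6.5.2 requires an endpoint to ignore any setting with an unknown
or unsupported identifier.  The identifier 0x0a0a and its value are
stand-ins I picked for illustration, not necessarily what Chrome sends,
and the function names are hypothetical.

```python
import struct

SETTINGS = 0x4  # SETTINGS frame type, RFC 7540 section 6.5

# Setting identifiers defined by RFC 7540 section 6.5.2.
KNOWN_SETTINGS = {
    0x1: "HEADER_TABLE_SIZE",
    0x2: "ENABLE_PUSH",
    0x3: "MAX_CONCURRENT_STREAMS",
    0x4: "INITIAL_WINDOW_SIZE",
    0x5: "MAX_FRAME_SIZE",
    0x6: "MAX_HEADER_LIST_SIZE",
}


def encode_settings(settings):
    """Serialize (identifier, value) pairs as one SETTINGS frame on stream 0."""
    payload = b"".join(struct.pack(">HI", ident, value)
                       for ident, value in settings)
    # 9-byte frame header: 24-bit length, type, flags, 31-bit stream id.
    header = len(payload).to_bytes(3, "big") + struct.pack(">BBI", SETTINGS, 0, 0)
    return header + payload


def decode_settings(frame):
    """Return the known settings by name, skipping unknown identifiers."""
    length = int.from_bytes(frame[0:3], "big")
    payload = frame[9:9 + length]
    result = {}
    for offset in range(0, len(payload), 6):
        ident, value = struct.unpack_from(">HI", payload, offset)
        name = KNOWN_SETTINGS.get(ident)
        if name is not None:
            result[name] = value
        # Unknown identifier: ignore it, do not treat it as an error.
    return result


GREASE_ID = 0x0A0A  # assumed stand-in for a reserved identifier
frame = encode_settings([(0x3, 100), (GREASE_ID, 0x12345678), (0x4, 65535)])
print(decode_settings(frame))
```

An intolerant peer would instead reject the frame or the connection when
it reaches the second entry.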

This spring it was discovered that WinHTTP and certain
Microsoft-operated cloud-based web services cannot handle an HTTP/2
request without a body if it is serialized as a HEADERS frame without
the END_STREAM flag followed by an empty DATA frame with the
END_STREAM flag.  (This was done by Chrome in order to insert a
reserved frame in between.)  This is not GREASE-intolerance, but a
failure of HTTP/2 compliance nonetheless.  The fix has been deployed
for cloud services, but WinHTTP has not been fixed, and "it’s possible
that this will impact other IIS sites using the 'ARR' load
balancer/router".
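
To make the two serializations concrete, here is a rough Python sketch
of a bodyless request sent either as a single HEADERS frame or split
into HEADERS plus an empty DATA frame.  Frame types and flags are from
RFC 7540.  The header block is an opaque placeholder (no real HPACK
encoding is done), the reserved frame type 0x0b and the stream it is
sent on are assumptions of mine, and the helper names are hypothetical.

```python
import struct

DATA, HEADERS = 0x0, 0x1             # frame types, RFC 7540 section 6
END_STREAM, END_HEADERS = 0x1, 0x4   # frame flags


def frame(frame_type, flags, stream_id, payload=b""):
    """Build one HTTP/2 frame: 9-byte header followed by the payload."""
    length = len(payload).to_bytes(3, "big")
    header = struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return length + header + payload


def bodyless_request(header_block, stream_id, split=False, reserved_frame=b""):
    """Return the frames for a request that has no body."""
    if not split:
        # Usual form: HEADERS itself ends the stream.
        return [frame(HEADERS, END_HEADERS | END_STREAM, stream_id, header_block)]
    # Split form: HEADERS without END_STREAM, then an empty DATA frame
    # carrying END_STREAM, with room for a reserved frame in between.
    frames = [frame(HEADERS, END_HEADERS, stream_id, header_block)]
    if reserved_frame:
        frames.append(reserved_frame)
    frames.append(frame(DATA, END_STREAM, stream_id))
    return frames


header_block = b"\x00" * 4                    # placeholder, not valid HPACK
reserved = frame(0x0B, 0, 1, b"grease")       # assumed reserved frame type
for f in bodyless_request(header_block, 1, split=True, reserved_frame=reserved):
    print(f.hex())
```

Both forms describe the same request, so a compliant server has to
accept either one.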

Two weeks ago I restarted the reserved frame type experiment, but
without splitting bodyless requests this time.  Reserved frames are
sent after the SETTINGS and SETTINGS ACK frames on stream 0, and
before every DATA frame (in practice this means POST requests).  It
has been reported that LiteSpeed responds to the ones on stream 0 with
a PING ACK, which causes Chrome to close the connection since there
are no unacked PINGs outstanding.  The LiteSpeed team is already fixing it.
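
The interaction can be sketched in Python as follows.  RFC 7540 section
4.1 requires implementations to ignore and discard frames of unknown
type, so the expected reaction to a reserved frame is no reaction at
all.  The class below is a simplified model with hypothetical names, not
Chrome's actual code: it tracks PINGs awaiting acknowledgement and
treats a PING ACK that matches none of them as an error.

```python
PING = 0x6   # PING frame type, RFC 7540 section 6.7
ACK = 0x1    # ACK flag on PING

# Frame types 0x0 through 0x9 are the ones defined by RFC 7540 section 6.
KNOWN_FRAME_TYPES = set(range(0x0, 0xA))


class ProtocolError(Exception):
    """Raised when the peer sends something the connection cannot accept."""


class PingTracker:
    """Client-side bookkeeping of PINGs that have not been acked yet."""

    def __init__(self):
        self.outstanding = set()

    def sent(self, opaque):
        """Record the 8-byte opaque payload of a PING we sent."""
        self.outstanding.add(opaque)

    def on_frame(self, frame_type, flags, payload):
        if frame_type not in KNOWN_FRAME_TYPES:
            return "ignored"          # unknown type: discard silently
        if frame_type == PING and flags & ACK:
            if payload not in self.outstanding:
                # Modeled here as an error that closes the connection.
                raise ProtocolError("PING ACK without an outstanding PING")
            self.outstanding.remove(payload)
            return "ping acked"
        return "handled"


tracker = PingTracker()
print(tracker.on_frame(0x0B, 0, b"grease"))   # reserved frame is ignored
try:
    tracker.on_frame(PING, ACK, b"\x00" * 8)  # unsolicited PING ACK
except ProtocolError as error:
    print("closing connection:", error)
```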

And there is the issue with https://app.slack.com, see above.

This experiment is too recent to conclude that there are no
medium-sized intolerant deployments still out there, but I think it is
safe to assume that there are no large intolerant deployments.  (And
as for small ones, we might never be able to find them, at least
not with Dev and Canary channels only.)

Overall I would say the ecosystem is in pretty good shape when viewed
from the client side.

Bence

Received on Friday, 11 September 2020 16:01:15 UTC