- From: Willy Tarreau <w@1wt.eu>
- Date: Fri, 11 Dec 2020 20:57:43 +0100
- To: Bence Béky <bnc@google.com>
- Cc: Mike Bishop <mbishop@evequefou.be>, Ian Swett <ianswett@google.com>, David Benjamin <davidben@chromium.org>, Cory Benfield <cory@lukasa.co.uk>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>, Tommy Pauly <tpauly@apple.com>
Hi Bence,

On Fri, Dec 11, 2020 at 02:36:58PM -0500, Bence Béky wrote:
> As Willy mentioned, doing GREASE with Chrome has somewhat promising
> results. I'm not aware of any sites that have problems with reserved
> setting identifiers. There are a number of sites that have issues with
> reserved frame types, most notably https://slack.com. However, all three
> server bugs I'm aware of causing GREASE intolerance have been fixed, so
> moving the ecosystem forward is a matter of upgrading (which I acknowledge
> can be quite complex). Also, it is a lucky coincidence that the particular
> issue affecting slack.com only affects frame types above 0x20,

This description exactly matches the bug you found that affected prior haproxy versions, so possibly they're running on such a version and haven't applied maintenance fixes (which sounds scary, given that it would mean other, much more serious issues are not fixed either). If that's the case, we shouldn't even care about such issues: they are things of the past, since the software causing them has long been fixed.

> There is also a GREASE intolerant middlebox product out there, and while
> that bug has been fixed in recent releases, some older branches are still
> supported. In fact no available upgrade includes the fix for certain
> hardware platforms. See https://crbug.com/1127569#c9 for specifics. I do
> not know if this bug is triggered by frame type 0x10, or only larger values.

Similarly, by the time a new H2 spec is out, bogus versions will have totally disappeared.

> I guess these server bugs will be around for a little while, but not
> forever.

I have zero compassion for site operators who refuse to apply fixes. I'm fine with slow processes requiring months of validation, but not with lazy or irresponsible admins not doing their most basic job. So I really think we should not care a single second about deployments running on older versions of software that has already been fixed. In the worst case, these versions will disappear once a critical vulnerability appears in them, or when the admin (ir)responsible for keeping them exposed leaves the company. What should really matter to us is fundamental incompatibilities caused by spec misinterpretation. At this point it looks like H2 doesn't suffer from such a problem at all.

> I'm planning to implement PRIORITY_UPDATE in Chrome by the next release,
> and based on what we've seen from GREASE, we might indeed run into issues
> with sending the new frame and the SETTINGS_DEPRECATE_HTTP2_PRIORITIES
> identifier, as Ian mentioned above. The experiments so far happened only
> on Chrome Dev and Canary channels, and an issue with a site as popular as
> https://netflix.com has only been reported after two months, so it is still
> possible that there is a bug with a website that is not popular enough that
> I would have learned about it yet, but popular enough that it will block
> PRIORITY_UPDATE launch.

What I'm wondering is whether we should plan a normalized method to report implementation bugs to servers. We could, for example, have a .well-known URL to which a few pieces of information are sent (e.g. a link to the agent's bug tracker). This way, site owners could spot these reports in their logs when they see many errors in their stats.
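To make the idea concrete, here is a purely hypothetical sketch of what such a report could look like from the client side. The path /.well-known/h2-interop-report, the query parameters, and the helper name are all invented for illustration; nothing like this is registered today:

```python
import urllib.parse
import urllib.request

def report_interop_bug(origin, tracker_url, detail):
    # Encode a pointer to the public bug report in the query string so
    # that it shows up verbatim in the target site's access logs.
    query = urllib.parse.urlencode({"bug": tracker_url, "detail": detail})
    url = f"{origin}/.well-known/h2-interop-report?{query}"
    try:
        urllib.request.urlopen(url, timeout=5)
    except OSError:
        pass  # even a 404 leaves the trace we want in the logs

report_interop_bug("https://example.com",
                   "https://crbug.com/1127569",
                   "connection closed on unknown SETTINGS identifier")
```

The request itself may well fail; the point is only that the tracker URL lands where operators are already looking, i.e. their logs and error stats.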
> Then there is the situation from the server's perspective. An old version
> of an HTTP/2 client library is intolerant to unknown setting identifiers:
> it simply closes the connection if it receives any. This causes
> difficulties for WebSockets over HTTP/2 and also PRIORITY_UPDATE. Also, I
> am not aware of any GREASE experiments from the server side, so I do not
> know what the situation is with reserved frame types.

I doubt GREASE makes as much sense in the direction from infrastructure to clients. I mean, there is an incentive for fixing infrastructure components when millions of clients face problems with a given site. But the other way around is much different. There's a wide variety of clients around, some that users cannot even update, and these ones will continue to occasionally fail without the user having any effective power over them. This will only result in a bad image for the sites triggering the issues, with no perceived value for those sites.
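For concreteness, here is a minimal sketch of what greasing from the server side could look like on the wire. The frame serializer is hand-rolled for illustration, and the 0x0a0a identifier is just an arbitrary value from the unassigned space, not something any implementation or spec prescribes:

```python
import struct

def settings_frame(settings):
    """Serialize an HTTP/2 SETTINGS frame (RFC 7540, section 6.5)."""
    # Payload: one 16-bit identifier plus 32-bit value per setting.
    payload = b"".join(struct.pack(">HI", ident, value)
                       for ident, value in settings)
    # 9-byte frame header: 24-bit length, type 0x4 (SETTINGS),
    # flags 0, stream identifier 0 (SETTINGS is connection-level).
    header = struct.pack(">BHBBI",
                         (len(payload) >> 16) & 0xff,
                         len(payload) & 0xffff,
                         0x4, 0x0, 0x0)
    return header + payload

# The server's usual settings, plus one unknown identifier. RFC 7540
# (section 6.5.2) requires receivers to ignore unknown identifiers,
# so only an intolerant client would choke on the extra entry.
frame = settings_frame([(0x3, 100),     # SETTINGS_MAX_CONCURRENT_STREAMS
                        (0x0a0a, 1)])   # "grease" entry
```

A conforming client has to ignore the unknown entry, so an experiment like this would only surface intolerant implementations.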
Just my two cents,
Willy

Received on Friday, 11 December 2020 19:58:12 UTC