Re: Proposal: Adopt State Synchronization into HTTPbis

Thanks, Watson, for these excellent questions!

Let's address them:

On 10/9/24 9:50 AM, Watson Ladd wrote:
> On Tue, Oct 8, 2024 at 4:16 PM Michael Toomim <toomim@gmail.com> wrote:
>> Servers—which *authoritatively know* when resources change—will promise to tell clients, automatically, and optimally. Terabytes of bandwidth will be saved. Millions of lines of cache invalidation logic will be eliminated. Quadrillions of dirty-cache bugs will disappear. In time, web browser "reload" buttons will become obsolete, across the face of earth.
> That assumes deployment and that this works pretty universally. I'm
> less sanguine about the odds of success.
>
> This happy state holds after every single intermediary and HTTP
> library is modified to change a basic request-response invariant.

No, the "happy state" does not require universal adoption. The benefits 
(performance, bug elimination, code elimination, and interoperability) 
accrue to whichever subnetworks of HTTP adopt them. Let's say I have a 
simple client/server app, currently using SSE or a WebSocket. If I 
switch to this RESS standard, my app's state becomes more interoperable 
and more performant, the app requires less code, has fewer bugs, and 
gets libraries that provide extra features (e.g. offline mode) for free.

This doesn't require other apps to adopt it. And (AFAIK) it doesn't 
require intermediaries to support it.

We are running tests right now to confirm that this passes transparently 
through legacy intermediaries, and we already have production apps 
working fine today, so the track record so far is good.

> Servers have a deeply ingrained idea that they don't need to
> hold long lived resources for a request. It's going to be hard to
> change that

Actually, we already do this today: SSE holds responses open for long 
periods of time, and it works great.

When a connection dies, the client just reconnects. It's fine.
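
And with the Braid versioning extension, a reconnecting client can even 
pick up where it left off. Roughly (a sketch using the Parents request 
header from the versioning extension, with illustrative version ids):

    Request, on reconnect:

        GET /chat
        Subscribe: timeout=10s
        Parents: "3"

    Response:

        HTTP/1.1 104 Multiresponse
        Subscribe: timeout=10s
        Current-Version: "4"

        HTTP/1.1 200 OK
        Version: "4"
        Parents: "3"
        Content-Type: application/json
        ...

The server only has to send the versions the client hasn't seen yet.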

> and some assets will change meaningfully for clients
> outside of the duration of a TCP connection (think e.g. NAT, etc).

This is a different problem, and is solved by the other Braid 
extensions— specifically versioning and merge-types. These extensions 
enable offline edits, with consistency guarantees upon reconnection.

Not all apps will need this. Apps that just need subscriptions still get 
value from subscriptions alone. Apps that need offline edits can use the 
other Braid extensions and add OT/CRDT support, as sketched below.
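
Here's a hedged sketch of what a reconnecting client's offline edit 
could look like on the wire, reusing the Version, Parents, and 
Merge-Type headers from the example quoted later in this message (the 
version id "4-alice" and the message body are purely illustrative):

    PUT /chat
    Version: "4-alice"
    Parents: "3"
    Merge-Type: sync9
    Content-Type: application/json
    Content-Length: 183

    [{"text": "Hi, everyone!",
      "author": {"link": "/user/tommy"}},
     {"text": "Yo!",
      "author": {"link": "/user/yobot"}},
     {"text": "Back online!",
      "author": {"link": "/user/alice"}}]

The Parents header records which version the edit was based on, and the 
merge-type tells peers how to reconcile it with any edits that happened 
concurrently while the client was offline.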

> Subscriptions are push based, HTTP requests are pull based. Pulls
> scale better: clients can do a distributed backoff, understand that
> they are missing information, recover from losing it. Push might be
> faster in the happy case, but it is complex to do right. The cache
> invalidation logic remains: determining a new version must be pushed
> to clients is the same as saying "oh, we must clear caches because
> front.jpg changed". We already have a lot of cache control and HEAD to
> try to prevent large transfers of unchanged information. A
> subscription might reduce some of this, but when the subscription
> stops, the client has to check back in, which is just as expensive as
> a HEAD.

It almost sounds like you're arguing that programmers should only write 
pull-based apps, and should never write a push-based app?

Pull-based apps usually poll on some interval, which wastes bandwidth on 
redundant requests and incurs a delay before updates can be seen. Is 
that what you're talking about? You can't do realtime that way.
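
To make the contrast concrete, here's a rough sketch (the validator and 
poll interval are illustrative):

    Polling, repeated every few seconds:

        GET /chat
        If-None-Match: "3"

        HTTP/1.1 304 Not Modified
        ...a full round trip, over and over, even when nothing changed...

    Subscription, sent once:

        GET /chat
        Subscribe: timeout=10s

        HTTP/1.1 104 Multiresponse
        ...updates arrive only when the state actually changes...

The polling client pays a round trip per interval whether or not 
anything changed; the subscriber pays one request for the whole session.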

Realtime apps like Figma push updates in realtime. So do Facebook, 
Google Search (with instant search suggestions), and basically every app 
that uses a WebSocket. Yes, this architecture is more sophisticated 
(Figma implements CRDTs!), but it's awesome, and the web is going in 
this direction. Programmers are writing apps that push updates in 
realtime, and they need a standard.

> I don't really understand the class of applications for which this is
> useful. Some like chat programs/multiuser editors I get: this would be
> a neat way to get the state of the room.

I'll make a strong statement here: this is useful for any website with 
dynamic state.

Yes, chats and collaborative editors have dynamic state, where realtime 
updates are particularly important. But dynamic state exists everywhere. 
Facebook and Twitter push live updates to clients. Gmail shows you new 
mail without you having to click "reload." News sites update their pages 
automatically with new headlines. The whole web has dynamic state now. 
Instead of writing custom protocols over WebSockets, these sites can get 
back to HTTP and REST, except now it will be RESS: powerful enough to 
handle synchronization within the standard infrastructure, in an 
interoperable, performant, and featureful way.

> It also isn't clear to me
> that intermediaries can do anything on seeing a PATCH propagating up
> or a PUT: still has to go to the application to determine what the
> impact of the change to the state is.

Yes, they can't today, but we will solve this when we need to. This is 
the issue of validating and interpreting a mutation outside of the 
origin server. Today, you have to rely on the server to validate and 
interpret a PUT or PATCH. But when we're ready, we can write specs for 
how any peer can validate and interpret a PUT or PATCH independently.

This will be a beautiful contribution, but again not all apps need it 
yet, and there's a lot of value to be gained with just a basic 
subscription mechanism. We can solve the big problems one piece at a 
time, and different subnets of HTTP can adopt these solutions at their 
own pace, and for their own incentives.

>>        Request:
>>
>>           GET /chat
>>           Subscribe: timeout=10s
>>
>>        Response:
>>
>>           HTTP/1.1 104 Multiresponse
>>           Subscribe: timeout=10s
>>           Current-Version: "3"
>>
>>           HTTP/1.1 200 OK
>>           Version: "2"
>>           Parents: "1a", "1b"
>>           Content-Type: application/json
>>           Content-Length: 64
>>
>>           [{"text": "Hi, everyone!",
>>             "author": {"link": "/user/tommy"}}]
>>
>>           HTTP/1.1 200 OK
>>           Version: "3"
>>           Parents: "2"
>>           Content-Type: application/json
>>           Merge-Type: sync9
>>           Content-Length: 119
>>
>>           [{"text": "Hi, everyone!",
>>             "author": {"link": "/user/tommy"}},
>>            {"text": "Yo!",
>>             "author": {"link": "/user/yobot"}}]
> *every security analyst snaps around like hungry dogs to a steak*
> Another request smuggling vector?

Request Smuggling is a strong claim! Can you back it up with an example 
of how you'd smuggle a request through a Multiresponse?

I don't think it's possible. Request smuggling usually involves some 
framing ambiguity (e.g. "response splitting") that parses differently on 
upgraded vs. legacy implementations. But there's no ambiguity here. 
Legacy implementations just see an opaque response body. Upgraded 
implementations see a set of inner responses within the Multiresponse, 
each delimited unambiguously by its Content-Length.

I'd love to see an example attack.

> How does a busy proxy with lots of internal connection reuse distinguish updates
> as it passes them around on a multiplexed connection? What does this
> look like for QUIC and H/3?

That's simple. Each Multiresponse, just like a normal response, exists 
on its own stream within the multiplexed TCP or QUIC connection. The 
proxy just forwards all of the stream's frames from upstream to 
downstream, on the same stream.

Each Multiresponse corresponds to a single Request, just like regular 
HTTP Responses.
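
For example, on an HTTP/2 connection the subscription above might map 
onto a single stream roughly like this (a sketch of one plausible 
framing; the stream id is arbitrary, and the exact mapping of the inner 
responses onto frames is for the spec to pin down):

    stream 5:  [client] HEADERS  GET /chat, Subscribe: timeout=10s
    stream 5:  [server] HEADERS  104 Multiresponse, Current-Version: "3"
    stream 5:  [server] DATA     ...update for version "2"...
    stream 5:  [server] DATA     ...update for version "3"...
    stream 5:  [server] DATA     ...further updates as they occur...

A proxy doesn't need to look inside the DATA frames; it just relays 
stream 5's frames to whichever downstream stream is carrying the 
original request. H/3 works the same way, with QUIC streams instead of 
HTTP/2 streams.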

>> This will (a) eliminate bugs and code complexity; while simultaneously (b) improving performance across the internet, and (c) giving end-users the functionality of a realtime web by default.
> We have (c): it's called WebSockets. What isn't it doing that it
> should be?

Ah, the limitation of WebSockets is addressed in the third paragraph of 
the Braid-HTTP draft:

    https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http#section-1.1

    1.  Introduction

    1.1.  HTTP applications need state Synchronization, not just Transfer

        HTTP [RFC9110] transfers a static version of state within a single
        request and response.  If the state changes, HTTP does not
        automatically update clients with the new versions.  This design
        satisficed when webpages were mostly static and written by hand;
        however today's websites are dynamic, generated from layers of state
        in databases, and provide realtime updates across multiple clients
        and servers.  Programmers today need to *synchronize*, not just
        *transfer* state, and to do this, they must work around HTTP.

        The web has a long history of such workarounds.  The original web
        required users to click reload when a page changed.  Javascript and
        XMLHTTPRequest [XHR] made it possible to update just part of a page,
        running a GET request behind the scenes.  However, a GET request
        still could not push server-initiated updates.  To work around this,
        web programmers would poll the resource with repeated GETs, which was
        inefficient.  Long-polling was invented to reduce redundant requests,
        but still requires the client to initiate a round-trip for each
        update.  Server-Sent Events [SSE] finally created a standard for the
        server to push events, but SSE provides semantics of an event-stream,
        not an update-stream, and SSE programmers must encode the semantics
        of updating a resource within the event stream.  Today there is still
        no standard to push updates to a resource's state.

        In practice, web programmers today often give up on using standards
        for "data that changes", and instead send custom messages over a
        WebSocket -- a hand-rolled synchronization protocol.  Unfortunately,
        this forfeits the benefits of HTTP and ReST, such as caching and a
        uniform interface [REST].  As the web becomes increasingly dynamic,
        web applications are forced to implement additional layers of
        non-standard Javascript frameworks to synchronize changes to state.

Does that answer your question? WebSockets give up on using HTTP. Every 
programmer builds a different subprotocol over their WebSocket. Then the 
(increasingly dynamic) state of websites ends up inaccessible and 
obscured behind proprietary protocols. As a result, websites turn into 
walled gardens. They can openly link to each other's *pages*, but they 
cannot reliably interoperate with each other's internal *state*.

This will change when the easiest way to build a website is the 
interoperable way again. We get there by adding Subscription & 
Synchronization features to HTTP. These are the missing features that 
drive people to WebSockets today. Programmers use HTTP for static 
assets, but to get realtime updates they give up, open a WebSocket, and 
invent a new custom protocol to subscribe to and publish state over it. 
We end up with yet another piece of web state that's proprietary, hidden 
behind some programmer's weird API. We can't build common infrastructure 
for that. CDNs can't optimize WebSocket traffic.

We solve this by extending HTTP with support for *dynamic* state, not 
just *static*. Then programmers don't need WebSockets: they use HTTP for 
all state, static *and* dynamic. They don't have to design their own 
sync protocol; they just use HTTP. The easiest way to build a website 
becomes the interoperable way again. CDNs get to cache stuff again.

Thank you very much for your questions. I hope I have addressed them here.

Michael

Received on Thursday, 10 October 2024 01:10:15 UTC