Re: Proposal: Adopt State Synchronization into HTTPbis

Thanks, Josh, for sharing this context!

You can see some of what Josh is talking about in this demo video:

    • Early Demo of Braid-Chrome
    <https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-75.mp4#1479>
    (from Braid Meeting 75 <https://braid.org/meeting-75>)

Josh also brought up how this simplifies Javascript programming. It's 
true. I now program with local Javascript variables that are backed 
directly by HTTP:

    var render = () =>
       `<h2>The Current Bitcoin Price is:</h2>
        ${state['https://bitcoin.org/current_price']}`

This function automatically subscribes to the 
state['https://bitcoin.org/current_price'] variable, which is an ES6 
Proxy that issues a GET request with a Subscribe: header, subscribing 
to the bitcoin price at the server. Each update from the server then 
flows back through the proxy and re-runs the render function in the 
UI.
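A minimal sketch of how such a proxy could be wired up, with the HTTP subscription layer replaced by an in-memory stub. The names here (createStateProxy, transport.subscribe) are illustrative, not Braid's actual API; the real client would open a GET with a Subscribe: header instead of calling the stub:

```javascript
// Sketch of a reactive state proxy. The network is replaced by a
// stub transport so the mechanics are visible in isolation.
function createStateProxy(transport) {
  const cache = {}          // last value seen for each URL
  const watchers = {}       // URL -> render functions to re-run
  let tracking = null       // render function currently executing

  const state = new Proxy({}, {
    get(_, url) {
      if (!watchers[url]) {
        watchers[url] = new Set()
        // First access opens the subscription; each pushed update
        // refreshes the cache and re-runs dependent renders.
        transport.subscribe(url, value => {
          cache[url] = value
          watchers[url].forEach(fn => fn())
        })
      }
      if (tracking) watchers[url].add(tracking)
      return cache[url]
    }
  })

  // Run a render function once, recording which URLs it reads.
  function track(render) {
    tracking = render
    render()
    tracking = null
  }
  return { state, track }
}

// Usage, with a fake transport standing in for the network:
const pushed = {}
const { state, track } = createStateProxy({
  subscribe(url, cb) { pushed[url] = cb }
})

let html
track(() => {
  html = `<h2>The Current Bitcoin Price is:</h2>
          ${state['https://bitcoin.org/current_price']}`
})
pushed['https://bitcoin.org/current_price']('$64,000')  // simulated server push
```

After the simulated push, the render has re-run and `html` contains the new price, with no networking code in the render function itself.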

As soon as the widget stops rendering, it unsubscribes from the ES6 
Proxy, which unsubscribes in turn from the HTTP endpoint. The programmer 
doesn't have to write networking code anymore.

The programmer can mutate state, too:

    state['https://braid.org/chat'].push({message: "Hi guys!", author:
    "/user/mike"})
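A sketch of how such a mutation could be mirrored to the server. The patch shape (a range insert against the array) loosely follows Braid's range-patch style, but syncedArray and sendPatch are illustrative names, not a real API:

```javascript
// Sketch: a proxied array whose mutations are mirrored outward.
// sendPatch stands in for the client's outbound HTTP PATCH.
function syncedArray(url, initial, sendPatch) {
  return new Proxy(initial, {
    get(target, prop) {
      if (prop === 'push') {
        return (...items) => {
          const index = target.length
          const result = target.push(...items)  // apply locally first
          // Then describe the change to the server as a range insert.
          sendPatch(url, { range: `[${index}:${index}]`, content: items })
          return result
        }
      }
      return target[prop]
    }
  })
}

// Usage: outbound patches are just collected here instead of sent.
const sent = []
const chat = syncedArray('https://braid.org/chat', [],
  (url, patch) => sent.push({ url, patch }))
chat.push({ message: "Hi guys!", author: "/user/mike" })
```

The local array updates immediately, and the runtime is free to batch, retry, or merge the outbound patches behind the scenes.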

And the runtime abstracts this remote state cleanly behind a local 
variable, which the programmer can assume is already downloaded and 
always up to date. This works because we've made the underlying sync 
model robust enough for the programmer to rely on. He can then work 
at a higher level of abstraction, reading and writing remote state as 
an ordinary local variable.

And while this *supports* perfect sync algorithms (e.g. CRDT), it does 
not *depend* on them. 99% of applications don't need CRDT consistency 
guarantees. 80% work fine with long-polling. But by standardizing all of 
these mechanisms as "change" over standard HTTP semantics, they become 
interoperable with one another, and the application programmer can read 
and write all of them as local variables, without having to care how the 
state's particular server has decided to serve it up. (Some of the 
variables will just be more robust than others.)
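At the wire level, the mechanics the thread below debates come down to reading multiple responses off one request. As a concrete illustration, here is a sketch of parsing such a stream on the client, assuming each update is a header block followed by a Content-Length-delimited body, as in the 104 Multiresponse example quoted below:

```javascript
// Sketch of a client-side parser for a multiresponse stream.
// For simplicity this treats Content-Length as a character count
// (fine for the ASCII example here, not for real byte streams).
function parseMultiresponse(text) {
  const updates = []
  let pos = 0
  while (pos < text.length) {
    const headerEnd = text.indexOf('\r\n\r\n', pos)
    if (headerEnd === -1) break
    const headers = {}
    // Skip the status line, then split "Name: value" header lines.
    for (const line of text.slice(pos, headerEnd).split('\r\n').slice(1)) {
      const i = line.indexOf(':')
      if (i > 0) headers[line.slice(0, i).toLowerCase()] = line.slice(i + 1).trim()
    }
    const len = parseInt(headers['content-length'], 10)
    const bodyStart = headerEnd + 4
    updates.push({ headers, body: text.slice(bodyStart, bodyStart + len) })
    pos = bodyStart + len
    while (text.startsWith('\r\n', pos)) pos += 2  // blank line between updates
  }
  return updates
}

// Two consecutive updates arriving on one stream:
const stream =
  'HTTP/1.1 200 OK\r\nVersion: "2"\r\nContent-Length: 2\r\n\r\n[]' +
  '\r\n' +
  'HTTP/1.1 200 OK\r\nVersion: "3"\r\nContent-Length: 4\r\n\r\n[{}]'
const updates = parseMultiresponse(stream)
```

Naïve intermediaries see one long response body; upgraded clients peel off each versioned update as it arrives.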

Michael

On 10/12/24 10:40 AM, Josh Cohen wrote:
> Watson said: "I don't really understand the class of applications for 
> which this is useful. Some like chat programs/multiuser editors I get: 
> this would be a neat way to get the state of the room."
>
> I met Michael in Vancouver, on coffee break before the httpwg 
> meeting.  He told me he was going to decentralize the web and make it 
> peer to peer.  I thought to myself "omg, everyone says that"  However, 
> Michael and the Braid team bring it.  I attended a few Braid meetings 
> to see what they were up to, and it's pretty cool.  They've built 
> tooling libraries already.  One benefit that stood out to me is the 
> ability to register javascript event handlers on local JavaScript 
> variables that are synchronized to server state.  The developer can 
> set an event handler on a variable, and when it changes on the server, 
> their callback runs.  This raises the level of abstraction that the 
> web developer works at.
>
> In one of my apps, I have a settings object that can be changed on the 
> client or the server and is synchronized. That's similar to getting 
> the state of the chat room.  My implementation is a websocket/redis 
> contraption, which was fun to do, but also kind of a hassle.  By 
> integrating it into the platform there is a common way to do this, 
> especially from the protocol perspective.  JS frameworks can run wild 
> with it.
>
> I'm separating concerns here, multiresponse is just one possible 
> option for the Event Channel part.   Middleboxes could optimize for 
> fan out.  If a middlebox sees a subscription request for a 
> subscription it already has, has cached the notifications and is up to 
> date, then it can just tell the origin server that it has an 
> additional subscription, and can fan out responses to connected clients.
>
> In the Symmetric HTTP case, this seems straightforward to reason 
> about.  Since SUBSCRIBE or SUB is its own HTTP request, the middlebox 
> can recognize this, pay attention and save the subscription 
> parameters.  The NOTIFY or PUB method request and entity body are just 
> cached like other entity bodies. So a late subscriber could be caught 
> up by the middlebox, and existing subscribers get fanned out PUB http 
> requests.
>
>
>
> On Wed, Oct 9, 2024 at 12:55 PM Watson Ladd <watsonbladd@gmail.com> wrote:
>
>     On Tue, Oct 8, 2024 at 4:16 PM Michael Toomim <toomim@gmail.com>
>     wrote:
>     >
>     > Josh Cohen and I are considering publishing a new draft on
>     Subscriptions in HTTP, and as we think through the big design
>     decisions, I ran across this excellent question from Watson Ladd
>     bringing up the most fundamental question of them all:
>     >
>     > On 11/6/23 4:47 PM, Watson Ladd wrote:
>     >
>     > On Tue, Oct 31, 2023 at 7:12 PM Michael Toomim
>     <toomim@gmail.com> wrote:
>     >
>     > At IETF 118 I will present a proposal to adopt State
>     Synchronization work into HTTPbis:
>     >
>     > Braid-HTTP:
>     https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http
>     [1]
>     >
>     > <...snip...>
>     >
>     > The big sticking point for me is subscriptions. This is a deviation
>     > from the request/response paradigm that goes pretty deep into how
>     > clients and servers are coded and the libraries they use. It can of
>     > course be stuck on top of WebTransport, which might be the right way
>     > to do it, but then doesn't integrate with the other three parts.
>     >
>     > You might be better trying to layer this on top of HTTP and
>     > WebTransport, as ugly as that can be with regard to what
>     > intermediaries can do in order to get it into the hands of people
>     > faster, but if there's some strong reason not to do that I'm all
>     ears.
>     >
>     > Watson raises a basic choice in designing Subscriptions:
>     >
>     > Do we dare extend the basic request/response model to allow
>     long-lived subscriptions?
>     > Or are subscriptions better layered on top— inside a WebSocket,
>     or WebTransport connection?
>     >
>     > I argue that (1) is actually a *much* better choice. Yes, it is
>     fundamental. It extends the basic architecture of HTTP (hat tip to
>     Roy Fielding) by extending REST into RESS (REpresentational State
>     Synchronization). It adds a new basic feature to HTTP— the ability
>     to subscribe to any resource, and get notified of its changes over
>     time; throughout the entire web and HTTP ecosystem. Clients will
>     stop guessing whether to reload cache, and will stop making
>     redundant requests. Servers—which *authoritatively know* when
>     resources change—will promise to tell clients, automatically, and
>     optimally. Terabytes of bandwidth will be saved. Millions of lines
>     of cache invalidation logic will be eliminated. Quadrillions of
>     dirty-cache bugs will disappear. In time, web browser "reload"
>     buttons will become obsolete, across the face of earth.
>
>     That assumes deployment and that this works pretty universally. I'm
>     less sanguine about the odds of success.
>
>     This happy state holds after every single intermediary and HTTP
>     library is modified to change a basic request-response invariant. Most
>     won't be. In any protocol this age there are some invariants that have
>     crept in and become ossified and for HTTP 1/1 that's the request
>     response model. Pipelining doesn't necessarily work all the time, let
>     alone the interleaving you need to be efficient with TCP sockets and
>     this. Servers have a deeply ingrained idea that they don't need to
>     hold long lived resources for a request. It's going to be hard to
>     change that, and some assets will change meaningfully for clients
>     outside of the duration of a TCP connection (think e.g. NAT, etc).
>
>     Caches, particularly client caches, outlive processes.
>
>     Subscriptions are push based, HTTP requests are pull based. Pulls
>     scale better: clients can do a distributed backoff, understand that
>     they are missing information, recover from losing it. Push might be
>     faster in the happy case, but it is complex to do right. The cache
>     invalidation logic remains: determining a new version must be pushed
>     to clients is the same as saying "oh, we must clear caches because
>     front.jpg changed". We already have a lot of cache control and HEAD to
>     try to prevent large transfers of unchanged information. A
>     subscription might reduce some of this, but when the subscription
>     stops, the client has to check back in, which is just as expensive as
>     a HEAD.
>
>     >
>     > The alternative (2) is to add subscriptions on top of a
>     WebSocket or WebTransport API, separate from HTTP resource
>     semantics. But then HTTP resources themselves will not be
>     subscribable. Programmers will be annoyed, and limit their use of
>     HTTP to bootstrapping the initial HTML, CSS and Javascript;
>     migrating all the interesting state onto this separate WebSocket
>     or WebTransport mechanism, which will then require more and more
>     of HTTP's features being added back into it: like (a) being able
>     to GET the state, and also PUT changes to it, over (b)
>     multiple content-types (e.g. text/json, text/html, and image/png),
>     while (c) supporting various PATCH types, across (d) a
>     full-featured network of Proxies, Caches, and CDNs to scale the
>     network. In conclusion, choice (2) leads to reinventing HTTP,
>     within a WebSocket/Transport... on top of HTTP.
>
>     I don't really understand the class of applications for which this is
>     useful. Some like chat programs/multiuser editors I get: this would be
>     a neat way to get the state of the room. It also isn't clear to me
>     that intermediaries can do anything on seeing a PATCH propagating up
>     or a PUT: still has to go to the application to determine what the
>     impact of the change to the state is.
>
>     >
>     > The clear way forward is subscribing directly to HTTP and REST
>     state. An elegant way to see this is extending Request/Response
>     into Request/Multiresponse. The Subscription can be a Request that
>     receives multiple Responses, one for each update to the resource.
>     There are many ways to format a Multiresponse; here's a
>     straightforward and backwards-compatible option:
>     >
>     >       Request:
>     >
>     >          GET /chat
>     >          Subscribe: timeout=10s
>     >
>     >       Response:
>     >
>     >          HTTP/1.1 104 Multiresponse
>     >          Subscribe: timeout=10s
>     >          Current-Version: "3"
>     >
>     >          HTTP/1.1 200 OK
>     >          Version: "2"
>     >          Parents: "1a", "1b"
>     >          Content-Type: application/json
>     >          Content-Length: 64
>     >
>     >          [{"text": "Hi, everyone!",
>     >            "author": {"link": "/user/tommy"}}]
>     >
>     >          HTTP/1.1 200 OK
>     >          Version: "3"
>     >          Parents: "2"
>     >          Content-Type: application/json
>     >          Merge-Type: sync9
>     >          Content-Length: 117
>     >
>     >          [{"text": "Hi, everyone!",
>     >            "author": {"link": "/user/tommy"}},
>     >           {"text": "Yo!",
>     >            "author": {"link": "/user/yobot"}}]
>     >
>     > This is backwards-compatible because it encodes multiple
>     responses into a regular response body that naïve intermediaries
>     will just pass along blindly, like SSE. But upgraded H2 and H3
>     implementations can have native headers & body frames that repeat.
>     It's all quite elegant. It fits right into HTTP. It feels as if
>     HTTP was designed to make it possible.
>
>     *every security analyst snaps around like hungry dogs to a steak*
>     Another request smuggling vector?
>
>     It's HTTP 1.1 where this looks easy. Even there it isn't. How does a
>     busy proxy with lots of internal connection reuse distinguish updates
>     as it passes them around on a multiplexed connection? What does this
>     look like for QUIC and H/3?
>
>     >
>     > We can add subscriptions to the basic fabric of HTTP, and free
>     application programmers from having to write cache-invalidation
>     logic. This will (a) eliminate bugs and code complexity; while
>     simultaneously (b) improving performance across the internet, and
>     (c) giving end-users the functionality of a realtime web by
>     default. This is a fundamental change, but it is overwhelmingly
>     beneficial. Then we can update Roy's dissertation. It's a good
>     one, and deserves our care.
>
>     We have (c): it's called WebSockets. What isn't it doing that it
>     should be? I'm sympathetic to fixing the foundations but there's lots
>     of complexity here that hasn't been addressed, and IMHO makes the
>     juice not worth the squeeze.
>     >
>     > Michael
>
>     Sincerely,
>     Watson
>     --
>     Astra mortemque praestare gradatim
>
>
>
> -- 
>
> ---
> Josh Cohen
>

Received on Sunday, 13 October 2024 01:33:02 UTC