- From: Rahul Gupta <cxres@protonmail.com>
- Date: Tue, 15 Oct 2024 10:09:21 +0000
- To: Michael Toomim <toomim@gmail.com>
- Cc: Josh Cohen <joshco@gmail.com>, Watson Ladd <watsonbladd@gmail.com>, ietf-http-wg@w3.org
- Message-ID: <fIKWDFWZsjNqjBsrXNd79RIAwHFqyOzhKLXyXg_20mxmfXpq4boM12jrNM86kzwEWdyKnrAXVwSxh20>
Hello Notifications Enthusiasts,

As the author of the only active I-D on notifications, Per Resource Events (https://datatracker.ietf.org/doc/draft-gupta-httpbis-per-resource-events/), which was also the catalyst for the renewed interest in notification protocols built on HTTP over the last year, I thought I should respond to the sheer amount of activity on this thread over the previous week.

The fact that various people have taken stabs at this problem for at least 25 years serves as social proof of the interest in using HTTP to serve notifications. There are many compelling advantages to this approach, especially from the perspective of a client. To buttress Josh and Mike's arguments: the fact that you do not need to change transports means the client is free from the overhead of coordinating responses. PREP, by serving the representation and notifications in the same response, also gets around more subtle time-coordination issues.

All this lends itself to a very friendly API. See, for example, prep-fetch (https://github.com/CxRes/prep-fetch), where the Fetch Response can be broken up into little Responses (identical to the Fetch Response) for each notification, which can be easily consumed through async iteration:

  // prepFetch is provided by the prep-fetch package
  const response = await fetch('http://example.com', {
    headers: { 'Accept-Events': '"prep"' }
  });
  const prepResponse = prepFetch(response);

  const representation = await prepResponse.getRepresentation();
  // do something with your representation

  const notifications = await prepResponse.getNotifications();
  for await (const notification of notifications) {
    // do something with a notification
  }

The ability to consume (the representation and) notifications with just half a dozen lines of code is, for clients, rather compelling. (I might be wrong, but I am actually not sure if browsers even need to be modified here at all.)
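Purely to make the client code above concrete, here is a loose sketch of the shape of response that prep-fetch consumes: a single multipart response carrying the representation as its first part and each notification as a subsequent part. The boundary, part media types, and framing shown here are illustrative placeholders, not the normative ones; those are defined in the PREP draft:

  HTTP/1.1 200 OK
  Content-Type: multipart/mixed; boundary="sep"

  --sep
  Content-Type: text/plain

  Hello, World!

  --sep
  Content-Type: message/rfc822

  PATCH /resource HTTP/1.1
  ...

  --sep--

Each part after the first arrives whenever the resource changes, which is what lets prep-fetch surface notifications as an async iterable of Response objects.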
Let me take the opportunity to thank Josh Cohen for his comparison of proposals over the years. I have unfortunately not been able to study Josh's proposal in any depth to comment on it. But on the point of comparison, I had put up a list of necessary (but not sufficient) questions that need to be addressed in the design of a notifications protocol, on the Braid issue tracker: https://github.com/braid-org/braid-spec/issues/124#issuecomment-1690538558. You might find these more comprehensive than the categories Josh has identified, though I very much appreciate the nomenclature he has introduced. Working through those questions helps to meaningfully constrain the design space.

Much like Josh, I greatly admire the work on synchronization undertaken by Mike Toomim and his collaborators at Braid. Mike very warmly welcomed me to the Braid community, and I participated in meetings regularly over the winter to understand their work, until the meetings moved to the middle of the night in my time zone, which prevented me from further participation. I definitely support his mission, and I have done my best to promote the work of the Braid community over at Solid.

But when it comes to the question of how to actually put bits on the wire, we have unfortunately come to an impasse. We have fundamentally different notions of what a notification is for. Over a period of time, in discussions both private and public, Mike has asked me to drop every significant feature of PREP, to the point where it seems I might as well gut my draft entirely. A more polite summary of his objections and my responses can be found at https://github.com/CxRes/prep/issues/5.

I would even consider adopting his proposals on notifications, but I find that they are not rigorous in addressing the questions I linked above, sometimes inconsistent with HTTP, and a moving target. I will illustrate this with one example: the 104 Multiresponse proposal, such as the one linked in the 4th mail in this thread. A plain reading of Section 15 of RFC 9110 shows that this proposal is untenable: "A single request can have multiple associated responses: zero or more interim (non-final) responses with status codes in the "informational" (1xx) range, followed by exactly one final response with a status code in one of the other ranges." In that mail, Mike states, "This is backwards-compatible because it encodes multiple responses into a regular response body". But 1xx responses cannot have a body. Even if we disregard that and say it is a 251 response, the collection of 2xx responses that follows needs to have a media type specified. I have already pointed this out, for example here: https://github.com/CxRes/prep/issues/5#issuecomment-2253788072. Unfortunately, discussions in these circumstances become extremely tenuous, and consensus is unreasonable to expect.

While I have some practical reasons not to like the multipart media type as well (but not for the reasons Mike opposes it), it is a well-established and widely used general-purpose composite media type (one of only two, afaik). For this reason, I had preferred it over other solutions in PREP, as Marius suggests in this thread (a sketch appears above, after the prep-fetch example). As I have stated elsewhere, I am happy to evaluate other alternatives, provided they are consistent with HTTP semantics. On this matter, the authors of the various HTTP specifications might be of some help, by describing how the application/http media type works, especially how, in different HTTP versions, messages are delineated inside this media type. I have not found a satisfactory answer, including when I asked about it on this mailing list: https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0195.html

===================

Re: long-polling headers: the PREP specification introduces a request-response header pair to negotiate notifications, which can be easily parametrized; a sketch follows below.
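To sketch that negotiation on the wire: the request field below is the one already used in the prep-fetch example above, while the response field and its parameters are shown purely illustratively, so consult the draft for the normative names and syntax:

  Request:

    GET /resource HTTP/1.1
    Accept-Events: "prep"

  Response:

    HTTP/1.1 200 OK
    Events: protocol="prep"
    Content-Type: multipart/mixed; boundary="sep"

Because the pair is parametrizable, a long-polling variant could, in principle, be negotiated through the same fields rather than through a separate mechanism.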
Re: power requirements (speculating with my electrical-engineering hat on): Greater deployment of real-time updates/synchronization is inevitable. I personally do not see entire webpages/apps becoming dynamic, but I do see a steady progression to where most of the information rendered is updated in real time (SSR folks might disagree, but I sit in the decentralization camp!). An HTTP-based protocol, especially one where intermediaries can fan out notifications, will on average consume less power than direct connections between user agents and origin servers (and possibly be more reliable in the aggregate as well). That, in my book, is itself a win. Certainly, at the level of a device or a router, it will consume more power than solutions like WebPush, but an HTTP-based notification protocol is not necessarily competing with that very specific use case. Case in point: we are already seeing a diversity of transports being used with the Solid Notifications Protocol, serving very different requirements.

===================

Going forward, I see these bright-line concerns for designing a baseline (see point 2) notification protocol on HTTP:

1. The protocol MUST be consistent with HTTP: It should go without saying that a baseline notification protocol has to be consistent with the specification. I would also prefer that it minimally extend HTTP, with minimal upgrades to deployed infrastructure. PREP, I feel, attempts to be standards-compliant and minimally disruptive.

2. The ability to negotiate different notification protocols: I, for one, do not believe there is a one true protocol, and we must have the humility to accommodate different solutions (a baseline protocol means subsequent solutions will also have more opportunity to experiment). Negotiation of protocols (whether explicit or automatic) also opens up the possibility of crafting solutions that are (to Mark's point) specific to H2/H3, that can take advantage of native framing, while providing a graceful downgrade path for HTTP/1.1 (which, imho, is better than not having a notifications solution for HTTP/1.1 at all).

3. The ability to negotiate the media type of notifications: HTTP has the unique feature (preaching to the choir here) that the state of information objects, or resources, can only be observed through their representations. It is only natural that a notifications protocol on HTTP should extend this capability temporally. When space and time are taken together, we observe events (and not states). Notifications are the representation of some combination of event and resulting state (this modifies my previous definition) that the client should be able to negotiate according to its informational needs. A notification need not carry all the information about the event if the purpose is only to inform, not synchronize. For example, a PATCH request may contain identifiers for a person being added as a friend, but the resulting notification may only carry their name (see the sketch after this list). This is not just a theoretical concern: it lends itself to notifications that serve the broadest class of use cases, by preserving the decoupling between server and client that is built into HTTP.

If servers transfer representations of events/computations as they occur, so as to allow for synchronization between server and client, I would call that Representational Event Synchronization (RES or RESync, not yet trademarked!). Representational State Synchronization, to me, would mean that servers transfer representations of states, the server effectively doing the state synchronization for the client. I think what Mike describes as RESS is mostly RESync plus local state synchronization, using the transferred events and an initial state, performed on the client with CRDT/OT algorithms. RESS may occasionally occur for performance/consistency reasons.
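To make the friend example in point 3 concrete, here is a hypothetical sketch (the paths, fields, and media types are all invented for illustration): the request that causes the event and the notification a client receives need not carry the same information.

The PATCH that causes the event:

  PATCH /people/alice/friends HTTP/1.1
  Content-Type: application/json-patch+json

  [{"op": "add", "path": "/-",
    "value": {"id": "https://example.org/people/bob"}}]

A notification negotiated by a client that only wants to be informed, not to synchronize:

  Content-Type: text/plain

  Bob was added as a friend.

The same event could just as well be delivered as the full patch to a client that does want to synchronize; preserving that choice is exactly what media-type negotiation of notifications buys us.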
If the HTTPWG is so inclined to consider notifications as a work item, I would kindly request the chairs to bring up these issues in the upcoming httpbis session for discussion. If these are not acceptable to the community, then I shall withdraw my proposal, as I believe we would then lose the essential character of HTTP. Within these bounds, I am amenable to discussing various alternatives.

===================

Let me take this opportunity to remind folks that Per Resource Events is an active I-D and now comes with a full set of JavaScript/NodeJS libraries for both servers and clients (see the implementation section of the draft for details). However, it pains me to say that the very nature of IETF meetings puts PREP at a structural disadvantage. The fact that it is far more difficult for me than for the other participants to be there in person means that I have fewer opportunities for serendipitous exchanges, and fewer possibilities to publicize the specification and thus get it into the hands of a larger audience to play with. This is not even about my work, for I may not have the best ideas, but about ensuring that the HTTPWG and IETF are in a position to capture the best ideas wherever they come from.

Having said that, I am grateful for the opportunity that IETF and art/httpbis have given me in the past year.

Best Regards,
Rahul

On Sunday, October 13th, 2024 at 7:02 AM, Michael Toomim <toomim@gmail.com> wrote:

> Thanks, Josh, for sharing this context!
>
> You can see some of what Josh is talking about in this demo video:
>
> • Early Demo of Braid-Chrome (from Braid Meeting 75)
>
> Josh also brought up how this simplifies Javascript programming. It's true. I now program with local Javascript variables that are backed directly by HTTP:
>
> > var render = () =>
> >   `<h2>The Current Bitcoin Price is:</h2>
> >    ${state['https://bitcoin.org/current_price']}`
>
> This function automatically subscribes to the state['https://bitcoin.org/current_price'] variable, which is an ES6 proxy that automatically runs a GET Subscribe: HTTP request, which subscribes to the bitcoin price at the server. And then each update from the server automatically chains back to the UI and re-runs the render function.
>
> As soon as the widget stops rendering, it unsubscribes from the ES6 Proxy, which unsubscribes in turn from the HTTP endpoint. The programmer doesn't have to write networking code anymore.
>
> The programmer can mutate state, too:
>
> > state['https://braid.org/chat'].push({message: "Hi guys!", author: "/user/mike"})
>
> And the programming runtime abstracts this remote state cleanly behind a local variable, which the programmer can assume is already downloaded, and always up-to-date. This works because we've perfected the underlying sync model. The programmer can rely on it. Then he can program at a higher level of abstraction. He doesn't have to write network code anymore. He just reads and writes it as state, as a local variable.
>
> And while this *supports* perfect sync algorithms (e.g. CRDT), it does not *depend* on them. 99% of applications don't need CRDT consistency guarantees. 80% work fine with long-polling. But by standardizing all of these mechanisms as "change" over standard HTTP semantics, they become interoperable with one another, and the application programmer can read and write all of them as local variables, without having to care how the state's particular server has decided to serve it up. (Some of the variables will just be more robust than others.)
>
> Michael
>
> On 10/12/24 10:40 AM, Josh Cohen wrote:
>
> > Watson said: "I don't really understand the class of applications for which this is useful. Some like chat programs/multiuser editors I get: this would be a neat way to get the state of the room."
> >
> > I met Michael in Vancouver, on a coffee break before the httpwg meeting. He told me he was going to decentralize the web and make it peer to peer. I thought to myself, "omg, everyone says that." However, Michael and the Braid team bring it. I attended a few Braid meetings to see what they were up to, and it's pretty cool. They've built tooling libraries already. One benefit that stood out to me is the ability to register JavaScript event handlers on local JavaScript variables that are synchronized to server state. The developer can set an event handler on a variable, and when it changes on the server, their callback runs. This raises the level of abstraction that the web developer works at.
> >
> > In one of my apps, I have a settings object that can be changed on the client or the server and is synchronized. That's similar to getting the state of the chat room. My implementation is a websocket/redis contraption, which was fun to do, but also kind of a hassle. By integrating it into the platform there is a common way to do this, especially from the protocol perspective. JS frameworks can run wild with it.
> >
> > I'm separating concerns here; multiresponse is just one possible option for the Event Channel part. Middleboxes could optimize for fan-out. If a middlebox sees a subscription request for a subscription it already has, has cached the notifications, and is up to date, then it can just tell the origin server that it has an additional subscription, and can fan out responses to connected clients.
> >
> > In the Symmetric HTTP case, this seems straightforward to reason about. Since SUBSCRIBE or SUB is its own HTTP request, the middlebox can recognize this, pay attention, and save the subscription parameters. The NOTIFY or PUB method request and entity body are just cached like other entity bodies. So a late subscriber could be caught up by the middlebox, and existing subscribers get fanned-out PUB HTTP requests.
> >
> > On Wed, Oct 9, 2024 at 12:55 PM Watson Ladd <watsonbladd@gmail.com> wrote:
> >
> > > On Tue, Oct 8, 2024 at 4:16 PM Michael Toomim <toomim@gmail.com> wrote:
> > >
> > > > Josh Cohen and I are considering publishing a new draft on Subscriptions in HTTP, and as we think through the big design decisions, I ran across this excellent question from Watson Ladd bringing up the most fundamental question of them all:
> > > >
> > > > On 11/6/23 4:47 PM, Watson Ladd wrote:
> > > >
> > > > On Tue, Oct 31, 2023 at 7:12 PM Michael Toomim <toomim@gmail.com> wrote:
> > > >
> > > > At IETF 118 I will present a proposal to adopt State Synchronization work into HTTPbis:
> > > >
> > > > Braid-HTTP: https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http [1]
> > > >
> > > > <...snip...>
> > > >
> > > > The big sticking point for me is subscriptions. This is a deviation from the request/response paradigm that goes pretty deep into how clients and servers are coded and the libraries they use. It can of course be stuck on top of WebTransport, which might be the right way to do it, but then doesn't integrate with the other three parts.
> > > >
> > > > You might be better trying to layer this on top of HTTP and WebTransport, as ugly as that can be with regard to what intermediaries can do, in order to get it into the hands of people faster, but if there's some strong reason not to do that I'm all ears.
> > > >
> > > > Watson raises a basic choice in designing Subscriptions:
> > > >
> > > > 1. Do we dare extend the basic request/response model to allow long-lived subscriptions?
> > > > 2. Or are subscriptions better layered on top— inside a WebSocket, or WebTransport connection?
> > > >
> > > > I argue that (1) is actually a *much* better choice. Yes, it is fundamental. It extends the basic architecture of HTTP (hat tip to Roy Fielding) by extending REST into RESS (REpresentational State Synchronization). It adds a new basic feature to HTTP— the ability to subscribe to any resource, and get notified of its changes over time; throughout the entire web and HTTP ecosystem. Clients will stop guessing whether to reload cache, and will stop making redundant requests.
> > > > Servers—which *authoritatively know* when resources change—will promise to tell clients, automatically, and optimally. Terabytes of bandwidth will be saved. Millions of lines of cache invalidation logic will be eliminated. Quadrillions of dirty-cache bugs will disappear. In time, web browser "reload" buttons will become obsolete, across the face of the earth.
> > >
> > > That assumes deployment and that this works pretty universally. I'm less sanguine about the odds of success.
> > >
> > > This happy state holds after every single intermediary and HTTP library is modified to change a basic request-response invariant. Most won't be. In any protocol this age there are some invariants that have crept in and become ossified, and for HTTP/1.1 that's the request-response model. Pipelining doesn't necessarily work all the time, let alone the interleaving you need to be efficient with TCP sockets and this. Servers have a deeply ingrained idea that they don't need to hold long-lived resources for a request. It's going to be hard to change that, and some assets will change meaningfully for clients outside the duration of a TCP connection (think e.g. NAT, etc.).
> > >
> > > Caches, particularly client caches, outlive processes.
> > >
> > > Subscriptions are push based, HTTP requests are pull based. Pulls scale better: clients can do a distributed backoff, understand that they are missing information, and recover from losing it. Push might be faster in the happy case, but it is complex to do right. The cache invalidation logic remains: determining that a new version must be pushed to clients is the same as saying "oh, we must clear caches because front.jpg changed". We already have a lot of cache control, and HEAD, to try to prevent large transfers of unchanged information. A subscription might reduce some of this, but when the subscription stops, the client has to check back in, which is just as expensive as a HEAD.
> > >
> > > > The alternative (2) is to add subscriptions on top of a WebSocket or WebTransport API, separate from HTTP resource semantics. But then HTTP resources themselves will not be subscribable. Programmers will be annoyed, and will limit their use of HTTP to bootstrapping the initial HTML, CSS and Javascript; migrating all the interesting state onto this separate WebSocket or WebTransport mechanism, which will then require more and more of HTTP's features being added back into it: like (a) being able to GET the state, and also PUT changes to it, over (b) multiple content-types (e.g. text/json, text/html, and image/png), while (c) supporting various PATCH types, across (d) a full-featured network of Proxies, Caches, and CDNs to scale the network. In conclusion, choice (2) leads to reinventing HTTP, within a WebSocket/Transport... on top of HTTP.
> > >
> > > I don't really understand the class of applications for which this is useful. Some, like chat programs/multiuser editors, I get: this would be a neat way to get the state of the room. It also isn't clear to me that intermediaries can do anything on seeing a PATCH propagating up, or a PUT: it still has to go to the application to determine what the impact of the change to the state is.
> > >
> > > > The clear way forward is subscribing directly to HTTP and REST state. An elegant way to see this is extending Request/Response into Request/Multiresponse.
> > > > The Subscription can be a Request that receives multiple Responses, one for each update to the resource. There are many ways to format a Multiresponse; here's a straightforward and backwards-compatible option:
> > > >
> > > > Request:
> > > >
> > > >   GET /chat
> > > >   Subscribe: timeout=10s
> > > >
> > > > Response:
> > > >
> > > >   HTTP/1.1 104 Multiresponse
> > > >   Subscribe: timeout=10s
> > > >   Current-Version: "3"
> > > >
> > > >   HTTP/1.1 200 OK
> > > >   Version: "2"
> > > >   Parents: "1a", "1b"
> > > >   Content-Type: application/json
> > > >   Content-Length: 64
> > > >
> > > >   [{"text": "Hi, everyone!",
> > > >     "author": {"link": "/user/tommy"}}]
> > > >
> > > >   HTTP/1.1 200 OK
> > > >   Version: "3"
> > > >   Parents: "2"
> > > >   Content-Type: application/json
> > > >   Merge-Type: sync9
> > > >   Content-Length: 117
> > > >
> > > >   [{"text": "Hi, everyone!",
> > > >     "author": {"link": "/user/tommy"}},
> > > >    {"text": "Yo!",
> > > >     "author": {"link": "/user/yobot"}}]
> > > >
> > > > This is backwards-compatible because it encodes multiple responses into a regular response body that naïve intermediaries will just pass along blindly, like SSE. But upgraded H2 and H3 implementations can have native headers & body frames that repeat. It's all quite elegant. It fits right into HTTP. It feels as if HTTP was designed to make it possible.
> > >
> > > *every security analyst snaps around like hungry dogs to a steak*
> > > Another request smuggling vector?
> > >
> > > It's HTTP/1.1 where this looks easy. Even there it isn't. How does a busy proxy with lots of internal connection reuse distinguish updates as it passes them around on a multiplexed connection? What does this look like for QUIC and H/3?
> > >
> > > > We can add subscriptions to the basic fabric of HTTP, and free application programmers from having to write cache-invalidation logic. This will (a) eliminate bugs and code complexity, while simultaneously (b) improving performance across the internet, and (c) giving end-users the functionality of a realtime web by default. This is a fundamental change, but it is overwhelmingly beneficial. Then we can update Roy's dissertation. It's a good one, and deserves our care.
> > >
> > > We have (c): it's called WebSockets. What isn't it doing that it should be? I'm sympathetic to fixing the foundations, but there's lots of complexity here that hasn't been addressed, and IMHO makes the juice not worth the squeeze.
> > >
> > > > Michael
> > >
> > > Sincerely,
> > > Watson
> > > --
> > > Astra mortemque praestare gradatim
> >
> > --
> > ---
> > Josh Cohen
Attachments
- application/pgp-keys attachment: publickey_-_cxres_protonmail.com_-_0x0CEC7748.asc
Received on Tuesday, 15 October 2024 10:09:33 UTC