Re: Proposal: Adopt State Synchronization into HTTPbis from Michael Toomim on 2024-10-11 (ietf-http-wg@w3.org from October to December 2024)

From: Michael Toomim <toomim@gmail.com>
Date: Thu, 10 Oct 2024 17:52:04 -0700
To: Josh Cohen <joshco@gmail.com>
Cc: Marius Kleidl <marius@transloadit.com>, Watson Ladd <watsonbladd@gmail.com>, ietf-http-wg@w3.org
Message-ID: <b36fdf2b-99ff-48bf-9ebe-b9f1027a38ff@gmail.com>
Josh, I think this breakdown into separate concerns is a great idea! 
Thank you!

I offer just one tweak here:

    At a high level, there are
    * The Versioning and Updates - the DAVish parts, which also have
    separable concerns.
    * PubSub

    Within PubSub, there are also separable concerns.  In my email
    comparing the different pubsub proposals[1], I described them as
    * Subscription Setup - The methods used to set up a subscription
    * Event Channel - How events are delivered to the subscriber
    * Event Payload - What the content of the events is.
    * Discovery - Discovery of PubSub features

The "Event Payload" itself actually *is* the "Versioning and Updates" 
part at top!

Thanks for helping to break this down. We need to find a bite-sized 
piece of this problem to break off, and adopt in the group. We could 
break off Subscriptions. Or we could break off Versioning. Both of these 
provide stand-alone value, and get us one step closer to full-featured 
synchronization, where everything works together.

On 10/9/24 3:29 PM, Josh Cohen wrote:
> I support adoption of this scope into httpbis. However, I think it's 
> useful to engage in separation of concerns.
>
> At a high level, there are
> * The Versioning and Updates - the DAVish parts, which also have 
> separable concerns.
> * PubSub
>
> Within PubSub, there are also separable concerns.  In my email 
> comparing the different pubsub proposals[1], I described them as
> * Subscription Setup - The methods used to set up a subscription
> * Event Channel - How events are delivered to the subscriber
> * Event Payload - What the content of the events is.
> * Discovery - Discovery of PubSub features
>
> If the WG adopts the work, then for each of these concerns, we can 
> converge on a solution (given the existing proposals), and possibly 
> allow for multiple choices, depending on circumstances.
>
> Multiresponse is just one possibility for the Event Channel concern.  
>  I share Watson's concerns, and worry that it may be too much of a 
> fundamental shift given deployed infrastructure. However, it seems 
> like we're getting wrapped around the axle on that.  There are other 
> options for the Event Channel.
>
> I'd like to put another on the table, which I will refer to as 
> "Symmetric HTTP"
>
> The hybi working group, according to datatracker, was active between 
> 2010 and 2015-ish.  Its charter says:
>
>     The BiDirectional or Server-Initiated HTTP (HyBi) working group
>     defines the WebSocket Protocol, a technology for bidirectional
>     communication between an HTTP client and an HTTP server that
>     provides greater efficiency than previous approaches (e.g., use of
>     hanging requests or long polling).
>
>
> The WebSocket Protocol, RFC 6455, published in December 2011, says:
>
>     The WebSocket Protocol is designed to supersede existing
>     bidirectional communication technologies that use HTTP as a
>     transport layer to benefit from existing infrastructure (proxies,
>     filtering, authentication). Such technologies were implemented as
>     trade-offs between efficiency and reliability because HTTP was not
>     initially meant to be used for bidirectional communication (see
>     [RFC6202] for further discussion).
>
>
> The world has evolved, and now with h2/h3, we have the ability for the 
> server to initiate and open a stream to the client that has connected 
> to the server. If we assume that the client/browser can have a small 
> web server engine running, then the Event Channel can just be HTTP 
> requests (say using NOTIFY method) sent downstream from the server to 
> the client.
>
> Assume:
> We are interested in being notified about changes to resource 
> https://braid.org/@josh
> Assume the client's HTTP engine can be named urn:braid:peer/receiver
>
> The client can set up the subscription
>
>     SUBSCRIBE /@josh
>     Callback: urn:braid:peer/receiver
>
> When changes to the resource occur, the server can use h2/h3 to 
> initiate a stream downstream to the client and send NOTIFY messages to 
> the client
>
>     NOTIFY urn:braid:peer/receiver?+https://braid.org/@josh
>     Content-Type: application/json
>     Content-Length: 64
>
>     [{"text": "Hi, everyone!", "author": {"link": "/user/tommy"}}]
>
> This approach avoids the concerns Watson raised. Instead it just 
> "turns around" HTTP.   Instead of each application trying to define 
> its own marshaling scheme either in an endless streaming HTTP 
> response, or on top of WebSockets, it leverages the features of HTTP 
> we have.
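
A minimal sketch of the "Symmetric HTTP" exchange described above, assuming the SUBSCRIBE and NOTIFY methods, the Callback header, and the `?+` separator in the NOTIFY target behave as in Josh's example. None of these are standardized, the helper names `build_subscribe`, `build_notify`, and `parse_notify` are hypothetical, and the messages are simplified (no HTTP version on the request line, no h2/h3 framing).

```python
import json


def build_subscribe(path: str, callback: str) -> str:
    """Client -> server: ask to be notified about changes to `path`."""
    return f"SUBSCRIBE {path}\r\nCallback: {callback}\r\n\r\n"


def build_notify(callback: str, resource: str, state) -> bytes:
    """Server -> client: deliver the new state of `resource` to `callback`."""
    body = json.dumps(state).encode("utf-8")
    head = (
        f"NOTIFY {callback}?+{resource}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def parse_notify(message: bytes):
    """Client side: split a NOTIFY into (method, callback, resource, state)."""
    head, _, body = message.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target = request_line.split(" ", 1)
    headers = dict(line.split(": ", 1) for line in header_lines)
    # Assumption: the changed resource follows "?+" in the target, as in the example.
    callback, _, resource = target.partition("?+")
    length = int(headers["Content-Length"])
    return method, callback, resource, json.loads(body[:length])


subscribe = build_subscribe("/@josh", "urn:braid:peer/receiver")
notify = build_notify(
    "urn:braid:peer/receiver",
    "https://braid.org/@josh",
    [{"text": "Hi, everyone!", "author": {"link": "/user/tommy"}}],
)
method, callback, resource, state = parse_notify(notify)
```

The point of the sketch is that the event is an ordinary HTTP message with its own headers and length, so the client needs no application-specific framing to find where one update ends.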
>
> There is a naming concern with respect to resources on the client HTTP 
> engine and how to indicate what resource has changed.  That could be 
> in an HTTP header in the NOTIFY, something like the URN scheme I've 
> laid out, or something we come up with.
>
>
> [1] https://lists.w3.org/Archives/Public/ietf-http-wg/2024JulSep/0159.html
>
> On Wed, Oct 9, 2024 at 5:39 PM Michael Toomim <toomim@gmail.com> wrote:
>
>     Thank you, Marius! These are good questions about how to format a
>     Multiresponse:
>
>     On 10/9/24 12:20 AM, Marius Kleidl wrote:
>>     Regarding your example, Michael: Does the response body, which
>>     contains the updates, adapt its syntax to the used HTTP protocol?
>>     Do you suggest that subscriptions over HTTP/2 generate the
>>     updates as additional HTTP/2 responses? If so, this would require
>>     subscriptions to be implemented inside the HTTP client itself
>>     instead of being a feature that a user can implement based upon
>>     existing and available HTTP clients. In addition, this raises
>>     questions about how to handle situations where the used protocol
>>     changes as requests and responses are forwarded through proxies
>>     and gateways.
>
>     Yes, we would ideally format an H2 Multiresponse with native H2
>     frames. Here's an equivalent H2 version of my last H1 example:
>
>            ┌─────────┐
>            │ HEADERS │ :method = GET
>            │ Frame   │ :path = /chat
>            │         │ subscribe = timeout=10s
>            └─────────┘
>                 │
>                 ▼
>            ┌─────────┐
>            │ HEADERS │ :status = 104
>            │ Frame   │ subscribe = timeout=10s
>            │         │ current-version = "3"
>            └─────────┘
>                 │
>                 ▼
>            ┌─────────┐
>            │ HEADERS │ :status = 200
>            │ Frame   │ version = "2"
>            │         │ parents = "1a", "1b"
>            │         │ content-type = application/json
>            └─────────┘
>                 │
>                 ▼
>            ┌─────────┐
>            │  DATA   │ [{"text": "Hi, everyone!",
>            │ Frame   │   "author": {"link": "/user/tommy"}}]
>            └─────────┘
>                 │
>                 ▼
>            ┌─────────┐
>            │ HEADERS │ :status = 200
>            │ Frame   │ version = "3"
>            │         │ parents = "2"
>            │         │ content-type = application/json
>            │         │ merge-type = sync9
>            └─────────┘
>                 │
>                 ▼
>            ┌─────────┐
>            │  DATA   │ [{"text": "Hi, everyone!",
>            │ Frame   │   "author": {"link": "/user/tommy"}},
>            │         │  {"text": "Yo!",
>            │         │   "author": {"link": "/user/yobot"}}]
>            └─────────┘
>
>     You're absolutely right that this native version requires the
>     client to be upgraded, and for proxies along the path to at least
>     not interfere. If they interfere, we can always fall back to the
>     H1-style "shove it into the body" method. The trick is to know
>     when it's safe to upgrade. We're currently extending
>     cache-tests.fyi with some experiments to determine the best way to
>     do this.
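
A rough sketch of how an upgraded client might regroup the response-side frames from the diagram above into separate updates. It assumes that every HEADERS frame carrying a `:status` starts a new response within the Multiresponse and that the DATA frames that follow belong to it. Frames are modeled as plain tuples rather than through a real HTTP/2 library, and `Update` and `group_frames` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Update:
    status: int
    headers: dict
    body: bytes = b""


def group_frames(frames):
    """Group ("HEADERS", dict) / ("DATA", bytes) tuples into Update objects."""
    updates = []
    for kind, payload in frames:
        if kind == "HEADERS":
            headers = dict(payload)
            status = int(headers.pop(":status"))
            updates.append(Update(status, headers))
        elif kind == "DATA":
            if not updates:
                raise ValueError("DATA frame before any HEADERS frame")
            updates[-1].body += payload
        else:
            raise ValueError(f"unexpected frame type: {kind}")
    return updates


# Response-side frames from the diagram, bodies abbreviated.
frames = [
    ("HEADERS", {":status": "104", "subscribe": "timeout=10s",
                 "current-version": '"3"'}),
    ("HEADERS", {":status": "200", "version": '"2"',
                 "parents": '"1a", "1b"', "content-type": "application/json"}),
    ("DATA", b'[{"text": "Hi, everyone!"}]'),
    ("HEADERS", {":status": "200", "version": '"3"', "parents": '"2"',
                 "content-type": "application/json", "merge-type": "sync9"}),
    ("DATA", b'[{"text": "Hi, everyone!"}, {"text": "Yo!"}]'),
]
updates = group_frames(frames)
```

Because the update boundaries come from the frame layer rather than from bytes inside the body, this grouping does not depend on what the body contains.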
>
>>     I also wonder if we could reuse multipart responses for
>>     delivering updates. The response to a subscription request would
>>     be a streamed multipart response where each "part" is one update.
>>     The update can include header fields as well as content, similar
>>     to your example. Status codes in the update would not be directly
>>     possible, but I'm not sure if that's a big loss.
>     Ah, yes this idea comes up frequently. The problem is that
>     multipart relies on boundary delimiters, which can be spoofed.
>     Imagine an attacker learns that client C is getting updates
>     streamed with boundary separator "====foo-bar-baz====". He can
>     then try to find a way to mutate the resource in such a way to
>     include that boundary separator in an update being sent to the
>     client, and thus sneak fake data in.
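
The boundary concern above can be illustrated with a small sketch. It assumes a server that streams each update as one multipart part using a fixed, known boundary and does not check update content for that boundary, and a client that splits the stream on the delimiter. `encode_stream` and `naive_split` are hypothetical helpers, not part of any proposal.

```python
BOUNDARY = b"====foo-bar-baz===="


def encode_stream(updates):
    """Server side: emit each update body as one multipart part."""
    out = b""
    for body in updates:
        out += b"--" + BOUNDARY + b"\r\n"
        out += b"Content-Type: application/json\r\n\r\n"
        out += body + b"\r\n"
    return out + b"--" + BOUNDARY + b"--\r\n"


def naive_split(stream):
    """Client side: split on the delimiter and return the part bodies."""
    parts = stream.split(b"--" + BOUNDARY)
    bodies = []
    # parts[0] is the (empty) preamble, parts[-1] is the closing "--".
    for part in parts[1:-1]:
        _, _, body = part.partition(b"\r\n\r\n")
        bodies.append(body.rstrip(b"\r\n"))
    return bodies


# The attacker writes resource content that embeds the boundary.
malicious = (
    b'{"text": "x"}\r\n'
    b"--" + BOUNDARY + b"\r\n"
    b"Content-Type: application/json\r\n\r\n"
    b'{"text": "forged"}'
)
received = naive_split(encode_stream([malicious]))
```

Here the server sent one update, but the client sees two, the second of which is entirely attacker-controlled. A server could mitigate this by choosing an unpredictable boundary per response and checking content against it, at the cost of inspecting every update.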
>>     All in all, I enjoy the idea, but think that we can achieve this
>>     already with the existing features we have.
>
>     I'm glad. I enjoy the idea, too. We're re-using features where
>     possible. However, although SSE provides updates, it doesn't
>     provide the semantics of "the resource is changing state", and
>     doesn't even support binary, so it won't work for updating images.
>     Multipart is tempting to re-use in Multiresponses, but has the
>     boundary issue.
>
>     Michael
>
>
>
> -- 
>
> ---
> *Josh Cohen*
>
Received on Friday, 11 October 2024 00:52:11 UTC