Re: Review of draft-toomim-httpbis-versions-00

Yes, it works great for collaborative editing. I use it every day in 
production. It's very fast. We send a PUT per keystroke. I should show 
you a demo. It's real. :)

It's not true that an HTTP PUT induces more load on the server than a
WebSocket message. They are equivalent. Consider that both H2 and
WebSocket run over TCP connections that stay persistently open. The only
difference between the two is how the data is framed, and framing
doesn't affect how or when the server loads the resource from disk into
RAM: a server can keep the loaded resource in memory between PUTs just
as it would between WebSocket messages. It's true that HTTP requests
often carry a session ID in a cookie on each request, whereas a
WebSocket might only send that when the user logs in/out, but that
header gets compressed down by H2 header compression (HPACK) and isn't
a significant performance problem.
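
To make the point above concrete, here is a minimal sketch, not taken
from the draft or from any Braid implementation. The names
ResourceStore, handle_put, and handle_ws_message are hypothetical, and a
plain dict stands in for disk storage. It only illustrates that an
in-memory cache behaves the same whichever framing delivers the edit.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceStore:
    """Hypothetical store: loads a resource once, then serves it from RAM."""
    disk: dict                                  # stand-in for slow storage: path -> text
    cache: dict = field(default_factory=dict)   # resources already loaded into memory
    disk_reads: int = 0                         # counts how often "disk" was touched

    def load(self, path):
        # Only the first access to a path reads from "disk".
        if path not in self.cache:
            self.disk_reads += 1
            self.cache[path] = self.disk.get(path, "")
        return self.cache[path]

    def insert(self, path, pos, text):
        # Apply a single insert edit to the in-memory copy.
        doc = self.load(path)
        self.cache[path] = doc[:pos] + text + doc[pos:]
        return self.cache[path]


def handle_put(store, path, pos, text):
    """One HTTP PUT carrying a single-keystroke insert (assumed shape)."""
    return store.insert(path, pos, text)


def handle_ws_message(store, path, pos, text):
    """The same edit arriving as a WebSocket message (assumed shape)."""
    return store.insert(path, pos, text)
```

Both handlers reach the same cached state, so the number of disk reads
depends on the cache, not on whether the edit was framed as a PUT or as
a WebSocket message.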

Perhaps you're thinking about old-style threaded web servers? Those have
a lot of overhead per request, because an OS thread (with a stack of
around 4 MB) has to be allocated to each request. But those don't scale
to persistent connections (like WebSockets), since every open connection
ties up a thread. That's why everyone's moved to evented servers, like
Node.js, which make persistent connections cheap, whether formatted as a
WebSocket message stream or an H2 message stream.
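
As a rough illustration of the evented model, the sketch below uses
Python's asyncio rather than Node.js, with in-memory queues standing in
for network sockets. The function names and the figure of 1000
connections are assumptions chosen for the example. It shows many
long-lived "connections" waiting concurrently without an OS thread being
allocated to each one.

```python
import asyncio
import threading


async def connection(conn_id, inbox, log):
    # A simulated persistent connection: stays open, waiting for
    # messages, until it receives None as a close signal.
    while True:
        msg = await inbox.get()
        if msg is None:
            return
        log.append((conn_id, msg))


async def serve(n_connections):
    log = []
    inboxes = [asyncio.Queue() for _ in range(n_connections)]
    tasks = [
        asyncio.create_task(connection(i, q, log))
        for i, q in enumerate(inboxes)
    ]
    # All connections are now open; record how many OS threads exist.
    threads_while_open = threading.active_count()
    for q in inboxes:
        q.put_nowait("edit")   # one message per connection
        q.put_nowait(None)     # then close it
    await asyncio.gather(*tasks)
    return len(log), threads_while_open


handled, threads_while_open = asyncio.run(serve(1000))
```

Each connection here costs one coroutine and one queue, not one thread,
which is the property that makes persistent streams cheap on an evented
server.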

On 7/23/24 1:45 AM, Marius Kleidl wrote:
> Hi Michael,
>
> talking about performance, I am curious how it would perform in a 
> real-time, collaborative editing process (similar to Google Docs or 
> the note taker tool during the IETF meeting). To facilitate the 
> real-time aspects of the editing experience, would the client have to 
> send a PUT request after every few keystrokes, so that the changes 
> appear quickly on the peers' screens? Sending these requests is 
> comparatively cheap for the client, especially with HTTP/2 and HTTP/3, 
> but potentially more costly for the server, which has to perform 
> authentication checks for each request and then load the resource's 
> state from some storage. If many requests are sent in short 
> succession, this can induce a higher load on the server. A stateful 
> connection, like with WebSockets, in contrast to stateless HTTP 
> requests could reuse the loaded and checked state - although such a 
> method likely has other caveats attached.
>
> Overall my question is whether you think this draft is suitable to 
> deliver such real-time experiences in an efficient manner?
>
> Best regards
> Marius Kleidl
>
> On Tue, Jul 23, 2024 at 1:51 AM Michael Toomim <toomim@gmail.com> wrote:
>
>     Peter, I just wrote up an explicit example of how to compress four
>     PUTs into 7 bytes. Check out the new section 5.1 here:
>
>         https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L945
>
>     These four puts compress down to 0.0146% of their original size,
>     at least in theory. Note that said compression scheme isn't fully
>     specified in this draft — the focus of this draft is just to
>     gather interest in working on a versioning system that makes such
>     compression possible. The actual compression schemes would be
>     future work.
>
>     On 7/22/24 12:41 PM, Michael Toomim wrote:
>>
>>     Peter, thank you for your interest! I'm excited that you are
>>     bringing up performance for discussion! There's a lot to say on
>>     that, and I give an overview below:
>>
>>     *== Compression & Performance ==*
>>
>>     First, let me correct a big misinterpretation: this work
>>     absolutely prioritizes *high-performance*, *realtime* data
>>     synchronization. It should support thousands of mutations per
>>     second. Our implementations are higher-performance
>>     <https://josephg.com/blog/crdts-go-brrr/> than Automerge, for
>>     instance. I regularly work today with a doc composed of 110,000
>>     edits. It loads instantly, thanks to some great Version-Types
>>     we've designed.
>>
>>     The Version-Type (in the proposed Version-Type header) is the way
>>     you get performance increases. The key to performance is managing
>>     history growth. You manage that by finding a pattern in history,
>>     and then compressing or ignoring history. You can express those
>>     patterns as a Version-Type spec. (There's a robust theory behind
>>     this called Time Machines.)
>>
>>     I apologize that this wasn't clear in the draft -00. I thought
>>     this would be an advanced feature that people wouldn't comment on
>>     for a bit — but am pleasantly surprised to hear your interest in
>>     it! I will be adding more clarity to the spec on Version-Types,
>>     and already have begun doing so in github:
>>
>>         https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L885
>>
>>     I'd also encourage you to check out this sketch of how to bake
>>     RLE into HTTP Header Compression:
>>
>>         https://braid.org/meeting-69/header-compression
>>         https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-69.mp4#4166
>>
>>     In any case, keep in mind that at this stage, we need to know
>>     only whether there is /interest/ in this area of work — not
>>     whether this particular spec meets your needs. If we adopt this
>>     work into the HTTP WG, we will get a chance to change or rewrite
>>     any part of the spec. This spec is just a starting point to get
>>     discussion going. So think of this as a problem statement rather
>>     than a solution statement.
>>
>>     *== PUTs ==*
>>
>>     As for PUTs, I suspect you might be thinking about HTTP/1.0 where
>>     each PUT might require a new TCP connection with its own TLS
>>     handshake. But keep in mind that with HTTP/2 and 3, all HTTP
>>     semantics are expressed in binary, and a PUT is usually just a
>>     single packet! This is just as efficient as any hand-rolled
>>     protocol you have, and it has the advantage of being
>>     interoperable with all of HTTP.
>>
>>     *== History Retention ==*
>>
>>     This versioning model supports Time Machines
>>     <https://braid.org/time-machines>— the beauty of which is that
>>     peers become free to independently choose how much history to
>>     store. An archival peer can store the full history. A light
>>     client can store just the latest version (see the amazing
>>     Simpleton <https://braid.org/simpleton> client, which needs zero
>>     history).
>>
>>     So each peer can choose how much history to store. If a peer
>>     doesn't have enough history to merge an edit, it can simply
>>     request that history from another peer. In this draft, you do so
>>     by requesting a GET with both Version and Parents headers specified.
>>
>>     *== Signatures & Validation ==*
>>
>>     This is out of scope for this proposal on versions. However, (a)
>>     there are some Version-Types that double as signatures. When this
>>     happens, it can be specified by authoring a Version-Type spec to
>>     articulate the new constraint. And (b) this is a generally
>>     important area of work that I encourage.
>>
>>     Cheers!
>>
>>     Michael
>>
>>     On 7/22/24 11:44 AM, Michael Toomim wrote:
>>>
>>>     We've got divergent discussion threads that I'm merging together.
>>>
>>>     First, Peter Van Hardenberg (of Ink & Switch, Local-First, and
>>>     Automerge) wrote this initial review of the draft. He's cc'd,
>>>     and we can respond in this thread.
>>>
>>>     ------------------------
>>>     -- Peter Van Hardenberg: --
>>>     ------------------------
>>>
>>>     Hi Michael,
>>>
>>>     I had a quick look at the spec and gave some thought to whether
>>>     we'd want to adopt it. I think right now it has quite a lot of
>>>     per-version overhead, and viewing this through a local-first
>>>     lens, one can imagine having to publish a large number of
>>>     versions each as separate PUT calls. You might want to consider
>>>     supporting ranges for PUT in a single message.
>>>
>>>     Overall, our goals appear to differ from what you're proposing
>>>     here so this feedback may not be particularly important. My
>>>     sense is that the expected granularity of changes for Braid is
>>>     relatively large and that the frequency is relatively low -- on
>>>     par with a changed HTML form submission, perhaps. We spend quite
>>>     a lot of our time thinking about optimizing updates for
>>>     potentially thousands of edits and trying to minimize the number
>>>     of round trips required to synchronize state in both directions.
>>>     You mention that the design intends to be optimizable but I
>>>     didn't see much in the text that clarified how.
>>>
>>>     One other observation is that this spec does not appear to
>>>     prioritize retention of history:
>>>     >      - If the Parents header is absent, the server SHOULD return a
>>>     >      single response, containing the requested version of the
>>>     >      resource in its body, with the Version response header set
>>>     >      to the same version.
>>>     This design may centralize the system, as clients default to
>>>     receiving "flattened" versions of resources and thus may not be
>>>     able to merge changes from other sources.
>>>
>>>     Last, have you considered specifying some kind of signature /
>>>     validation feature? If clients are applying patches iteratively,
>>>     it might help for them to be able to validate that they're in
>>>     the expected state either before or after applying a patch.
>>>
>>>     All the best,
>>>     -p
>>>
>>>     On 7/15/24 6:26 PM, Michael Toomim wrote:
>>>>
>>>>     Hi everyone in HTTP!
>>>>
>>>>     Last fall we solicited feedback on the Braid State
>>>>     Synchronization proposal [draft
>>>>     <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-04>,
>>>>     slides
>>>>     <https://datatracker.ietf.org/meeting/118/materials/slides-118-httpbis-braid-http-add-synchronization-to-http-00>],
>>>>     which I'd summarize as:
>>>>
>>>>         "We're enthusiastic about the general work, but the
>>>>         proposal is too high-level. Break the spec up into multiple
>>>>         independent specs, and work bottom-up. Focus on concrete
>>>>         'bits-on-the-wire'."
>>>>
>>>>     So I'm breaking the spec up, and have drafted up the first
>>>>     chunk for you. I would very much like your review on:
>>>>
>>>>         *Versioning of HTTP Resources*
>>>>         draft-toomim-httpbis-versions
>>>>         https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-00
>>>>
>>>>     Versioning is necessary for state synchronization—and occurs in
>>>>     a range of HTTP systems:
>>>>
>>>>       * Caching
>>>>       * Archiving
>>>>       * Version Control
>>>>       * Collaborative Editing
>>>>
>>>>     Today, HTTP has resource versions in the Last-Modified and ETag
>>>>     headers, and sometimes embeds versions in URLs, like with
>>>>     WebDAV. Each of these options serves some needs, but also has
>>>>     specific limitations. An improved general approach is proposed,
>>>>     which provides new features, that could enable cool new
>>>>     applications, such as incrementally-updated RSS feeds, and
>>>>     could simplify existing specifications, such as resumable
>>>>     uploads, and history compression in OT/CRDT algorithms.
>>>>
>>>>     I would love to know if people find this work interesting. I
>>>>     think we could improve performance, interoperability, and be
>>>>     one step closer to having Google Docs power within HTTP URLs.
>>>>
>>>>     Michael
>>>>
>>>>     -------- Forwarded Message --------
>>>>     Subject:  New Version Notification for
>>>>     draft-toomim-httpbis-versions-00.txt
>>>>     Date:  Mon, 08 Jul 2024 11:02:11 -0700
>>>>     From:  internet-drafts@ietf.org
>>>>     To:  Michael Toomim <toomim@gmail.com>
>>>>
>>>>
>>>>
>>>>     A new version of Internet-Draft
>>>>     draft-toomim-httpbis-versions-00.txt has been
>>>>     successfully submitted by Michael Toomim and posted to the
>>>>     IETF repository.
>>>>
>>>>     Name: draft-toomim-httpbis-versions
>>>>     Revision: 00
>>>>     Title: HTTP Resource Versioning
>>>>     Date: 2024-07-08
>>>>     Group: Individual Submission
>>>>     Pages: 19
>>>>     URL:
>>>>     https://www.ietf.org/archive/id/draft-toomim-httpbis-versions-00.txt
>>>>     Status:
>>>>     https://datatracker.ietf.org/doc/draft-toomim-httpbis-versions/
>>>>     HTMLized:
>>>>     https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions
>>>>
>>>>
>>>>     Abstract:
>>>>
>>>>     HTTP resources change over time. Each change to a resource
>>>>     creates a
>>>>     new "version" of its state. HTTP systems often need a way to
>>>>     identify, read, write, navigate, and/or merge these versions, in
>>>>     order to implement cache consistency, create history archives,
>>>>     settle
>>>>     race conditions, request incremental updates to resources,
>>>>     interpret
>>>>     incremental updates to versions, or implement distributed
>>>>     collaborative editing algorithms.
>>>>
>>>>     This document analyzes existing methods of versioning in HTTP,
>>>>     highlights limitations, and sketches a more general versioning
>>>>     approach that can enable new use-cases for HTTP.
>>>>
>>>>
>>>>
>>>>     The IETF Secretariat
>>>>
>>>>

Received on Tuesday, 23 July 2024 19:25:25 UTC