Re: Review of draft-toomim-httpbis-versions-00

Hi Michael,

Talking about performance, I am curious how the draft would perform in a
real-time, collaborative editing process (similar to Google Docs or the
note-taking tool used during IETF meetings). To facilitate the real-time
aspects of the editing experience, would the client have to send a PUT
request after every few keystrokes, so that the changes appear quickly on
the peers' screens? Sending these requests is comparatively cheap for the
client, especially with HTTP/2 and HTTP/3, but potentially more costly for
the server, which has to perform authentication checks for each request and
then load the resource's state from some storage. If many requests are sent
in quick succession, this can put a higher load on the server. A stateful
connection, as with WebSockets, could reuse the loaded and checked state,
in contrast to stateless HTTP requests, although such a method likely has
other caveats attached.
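
To make the concern concrete, here is a minimal sketch of client-side
batching that would reduce the number of requests the server has to
authenticate. It is only an illustration under my own assumptions: the
`EditBatcher` class, the `send` callback (which would issue one PUT per
batch), and the thresholds are all hypothetical and do not come from the
draft.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EditBatcher:
    """Coalesce keystroke-level edits into fewer outgoing requests."""

    # Hypothetical callback: a real client would issue one PUT per batch here.
    send: Callable[[List[str]], None]
    max_edits: int = 5      # flush once this many edits are pending
    max_delay: float = 0.2  # or once the oldest pending edit is this old (s)
    _pending: List[str] = field(default_factory=list, init=False)
    _first_at: float = field(default=0.0, init=False)

    def add(self, edit: str, now: float) -> None:
        """Queue one edit; `now` is the caller's clock reading in seconds."""
        if not self._pending:
            self._first_at = now
        self._pending.append(edit)
        # The age check only runs when another edit arrives; a real client
        # would also need a timer to flush a batch that has gone idle.
        too_many = len(self._pending) >= self.max_edits
        too_old = now - self._first_at >= self.max_delay
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Hand all pending edits to `send` as one batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self.send(batch)
```

With `max_edits=3`, typing five characters would result in two calls to
`send` instead of five, at the cost of peers seeing the changes slightly
later. Whether that trade-off is acceptable is part of what I am asking.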

Overall, my question is: do you think this draft is suitable for
delivering such real-time experiences efficiently?

Best regards
Marius Kleidl

On Tue, Jul 23, 2024 at 1:51 AM Michael Toomim <toomim@gmail.com> wrote:

> Peter, I just wrote up an explicit example of how to compress four PUTs
> into 7 bytes. Check out the new section 5.1 here:
>
>
> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L945
>
> These four puts compress down to 0.0146% of their original size, at least
> in theory. Note that said compression scheme isn't fully specified in this
> draft — the focus of this draft is just to gather interest in working on a
> versioning system that makes such compression possible. The actual
> compression schemes would be future work.
>
> On 7/22/24 12:41 PM, Michael Toomim wrote:
>
> Peter, thank you for your interest! I'm excited that you are bringing up
> performance for discussion! There's a lot to say on that, and I give an
> overview below:
>
> *== Compression & Performance ==*
>
> First, let me correct a big misinterpretation: this work absolutely
> prioritizes *high-performance*, *realtime* data synchronization. It
> should support thousands of mutations per second. Our implementations are
> higher-performance <https://josephg.com/blog/crdts-go-brrr/> than
> Automerge, for instance. I regularly work today with a doc composed of
> 110,000 edits. It loads instantly, thanks to some great Version-Types we've
> designed.
>
> The Version-Type (in the proposed Version-Type header) is the way you get
> performance increases. The key to performance is managing history growth.
> You manage that by finding a pattern in history, and then compressing or
> ignoring history. You can express those patterns as a Version-Type spec.
> (There's a robust theory behind this called Time Machines.)
>
> I apologize that this wasn't clear in draft -00. I thought this would
> be an advanced feature that people wouldn't comment on for a bit, but I am
> pleasantly surprised to hear your interest in it! I will be adding more
> clarity to the spec on Version-Types, and have already begun doing so on
> GitHub:
>
>
> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L885
>
> I'd also encourage you to check out this sketch of how to bake RLE into
> HTTP Header Compression:
>
> https://braid.org/meeting-69/header-compression
>
> https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-69.mp4#4166
>
> In any case, keep in mind that at this stage, we need to know only whether
> there is *interest* in this area of work — not whether this particular
> spec meets your needs. If we adopt this work into the HTTP WG, we will get
> a chance to change or rewrite any part of the spec. This spec is just a
> starting point to get discussion going. So think of this as a problem
> statement rather than a solution statement.
>
> *== PUTs ==*
>
> As for PUTs, I suspect you might be thinking about HTTP/1.0 where each PUT
> might require a new TCP connection with its own TLS handshake. But keep in
> mind that with HTTP/2 and HTTP/3, all HTTP semantics are expressed in binary,
> and a PUT is usually just a single packet! This is just as efficient as any
> hand-rolled protocol you have, and it has the advantage of being
> interoperable with all of HTTP.
>
>
> *== History Retention ==*
>
> This versioning model supports Time Machines
> <https://braid.org/time-machines>, the beauty of which is that peers
> become free to independently choose how much history to store. An archival
> peer can store the full history. A light client can store just the latest
> version (see the amazing Simpleton <https://braid.org/simpleton> client,
> which needs zero history).
>
> So each peer can choose how much history to store. If a peer doesn't have
> enough history to merge an edit, it can simply request that history from
> another peer. In this draft, you do so by requesting a GET with both
> Version and Parents headers specified.
>
>
> *== Signatures & Validation ==*
>
> This is out of scope for this proposal on versions. However, (a) there are
> some Version-Types that double as signatures. When this happens, it can be
> specified by authoring a Version-Type spec to articulate the new
> constraint. And (b) this is a generally important area of work that I
> encourage.
>
> Cheers!
>
> Michael
>
> On 7/22/24 11:44 AM, Michael Toomim wrote:
>
> We've got divergent discussion threads that I'm merging together.
>
> First, Peter Van Hardenberg (of Ink & Switch, Local-First, and Automerge)
> wrote this initial review of the draft. He's cc'd, and we can respond in
> this thread.
>
> ------------------------
> -- Peter Van Hardenberg: --
> ------------------------
>
> Hi Michael,
> I had a quick look at the spec and gave some thought to whether we'd want
> to adopt it. I think right now it has quite a lot of per-version overhead,
> and viewing this through a local-first lens, one can imagine having to
> publish a large number of versions each as separate PUT calls. You might
> want to consider supporting ranges for PUT in a single message.
>
> Overall, our goals appear to differ from what you're proposing here so
> this feedback may not be particularly important. My sense is that the
> expected granularity of changes for Braid is relatively large and that the
> frequency is relatively low -- on par with a changed HTML form submission,
> perhaps. We spend quite a lot of our time thinking about optimizing updates
> for potentially thousands of edits and trying to minimize the number of
> round trips required to synchronize state in both directions. You mention
> that the design intends to be optimizable but I didn't see much in the text
> that clarified how.
>
> One other observation is that this spec does not appear to prioritize
> retention of history:
> >      - If the Parents header is absent, the server SHOULD return a
> >      single response, containing the requested version of the resource
> >      in its body, with the Version response header set to the same
> >      version.
> This design may centralize the system, as clients default to receiving
> "flattened" versions of resources and thus may not be able to merge changes
> from other sources.
>
> Last, have you considered specifying some kind of signature / validation
> feature? If clients are applying patches iteratively, it might help for
> them to be able to validate that they're in the expected state either
> before or after applying a patch.
>
> All the best,
> -p
>
> On 7/15/24 6:26 PM, Michael Toomim wrote:
>
> Hi everyone in HTTP!
>
> Last fall we solicited feedback on the Braid State Synchronization
> proposal [draft
> <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-04>,
> slides
> <https://datatracker.ietf.org/meeting/118/materials/slides-118-httpbis-braid-http-add-synchronization-to-http-00>],
> which I'd summarize as:
>
> "We're enthusiastic about the general work, but the proposal is too
> high-level. Break the spec up into multiple independent specs, and work
> bottom-up. Focus on concrete 'bits-on-the-wire'."
>
> So I'm breaking the spec up, and have drafted up the first chunk for you.
> I would very much like your review on:
>
> *Versioning of HTTP Resources*
> draft-toomim-httpbis-versions
> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-00
>
> Versioning is necessary for state synchronization—and occurs in a range of
> HTTP systems:
>
>    - Caching
>    - Archiving
>    - Version Control
>    - Collaborative Editing
>
> Today, HTTP has resource versions in the Last-Modified and ETag headers,
> and sometimes embeds versions in URLs, like with WebDAV. Each of these
> options serves some needs, but also has specific limitations. An improved
> general approach is proposed that provides new features, could enable cool
> new applications such as incrementally-updated RSS feeds, and could
> simplify existing specifications such as resumable uploads and history
> compression in OT/CRDT algorithms.
>
> I would love to know if people find this work interesting. I think we
> could improve performance and interoperability, and be one step closer to
> having Google Docs power within HTTP URLs.
>
> Michael
>
> -------- Forwarded Message --------
> Subject: New Version Notification for draft-toomim-httpbis-versions-00.txt
> Date: Mon, 08 Jul 2024 11:02:11 -0700
> From: internet-drafts@ietf.org
> To: Michael Toomim <toomim@gmail.com>
>
> A new version of Internet-Draft draft-toomim-httpbis-versions-00.txt has
> been successfully submitted by Michael Toomim and posted to the
> IETF repository.
>
> Name: draft-toomim-httpbis-versions
> Revision: 00
> Title: HTTP Resource Versioning
> Date: 2024-07-08
> Group: Individual Submission
> Pages: 19
> URL: https://www.ietf.org/archive/id/draft-toomim-httpbis-versions-00.txt
> Status: https://datatracker.ietf.org/doc/draft-toomim-httpbis-versions/
> HTMLized:
> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions
>
>
> Abstract:
>
> HTTP resources change over time. Each change to a resource creates a
> new "version" of its state. HTTP systems often need a way to
> identify, read, write, navigate, and/or merge these versions, in
> order to implement cache consistency, create history archives, settle
> race conditions, request incremental updates to resources, interpret
> incremental updates to versions, or implement distributed
> collaborative editing algorithms.
>
> This document analyzes existing methods of versioning in HTTP,
> highlights limitations, and sketches a more general versioning
> approach that can enable new use-cases for HTTP.
>
>
>
> The IETF Secretariat
>
>
>

Received on Tuesday, 23 July 2024 08:45:47 UTC