- From: Michael Toomim <toomim@gmail.com>
- Date: Mon, 22 Jul 2024 16:49:37 -0700
- To: HTTP Working Group <ietf-http-wg@w3.org>, Braid <braid-http@googlegroups.com>, Peter van Hardenberg <pvh@pvh.ca>
- Message-ID: <ba9bd07d-b648-4afc-8c78-4ec05d2e1797@gmail.com>
Peter, I just wrote up an explicit example of how to compress four PUTs into 7 bytes. Check out the new section 5.1 here:

https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L945

These four PUTs compress down to 0.0146% of their original size, at least in theory.

Note that said compression scheme isn't fully specified in this draft; the focus of this draft is just to gather interest in working on a versioning system that makes such compression possible. The actual compression schemes would be future work.

On 7/22/24 12:41 PM, Michael Toomim wrote:
>
> Peter, thank you for your interest! I'm excited that you are bringing
> up performance for discussion! There's a lot to say on that, and I
> give an overview below:
>
> *== Compression & Performance ==*
>
> First, let me correct a big misinterpretation: this work absolutely
> prioritizes *high-performance*, *realtime* data synchronization. It
> should support thousands of mutations per second. Our implementations
> are higher-performance <https://josephg.com/blog/crdts-go-brrr/> than
> Automerge, for instance. I regularly work today with a doc composed of
> 110,000 edits. It loads instantly, thanks to some great Version-Types
> we've designed.
>
> The Version-Type (in the proposed Version-Type header) is the way you
> get performance increases. The key to performance is managing history
> growth. You manage that by finding a pattern in history, and then
> compressing or ignoring history. You can express those patterns as a
> Version-Type spec. (There's a robust theory behind this called Time
> Machines.)
>
> I apologize that this wasn't clear in the draft -00. I thought this
> would be an advanced feature that people wouldn't comment on for a
> bit, but am pleasantly surprised to hear your interest in it!
> I will be adding more clarity to the spec on Version-Types, and already
> have begun doing so on GitHub:
>
> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L885
>
> I'd also encourage you to check out this sketch of how to bake RLE
> into HTTP Header Compression:
>
> https://braid.org/meeting-69/header-compression
> https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-69.mp4#4166
>
> In any case, keep in mind that at this stage, we need to know only
> whether there is /interest/ in this area of work, not whether this
> particular spec meets your needs. If we adopt this work into the HTTP
> WG, we will get a chance to change or rewrite any part of the spec.
> This spec is just a starting point to get discussion going. So think
> of this as a problem statement rather than a solution statement.
>
> *== PUTs ==*
>
> As for PUTs, I suspect you might be thinking about HTTP/1.0 where each
> PUT might require a new TCP connection with its own TLS handshake. But
> keep in mind that with HTTP/2 and 3, all HTTP semantics are expressed
> in binary, and a PUT is usually just a single packet! This is just as
> efficient as any hand-rolled protocol you have, and it has the
> advantage of being interoperable with all of HTTP.
>
> *== History Retention ==*
>
> This versioning model supports Time Machines
> <https://braid.org/time-machines>, the beauty of which is that peers
> become free to independently choose how much history to store. An
> archival peer can store the full history. A light client can store
> just the latest version (see the amazing Simpleton
> <https://braid.org/simpleton> client, which needs zero history).
>
> So each peer can choose how much history to store. If a peer doesn't
> have enough history to merge an edit, it can simply request that
> history from another peer. In this draft, you do so by requesting a
> GET with both Version and Parents headers specified.
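To make the history request described above concrete, here is a minimal sketch in Python of what such a GET might look like on the wire. Everything specific in it is an assumption for illustration: the helper names, the /chat path, the example.org host, the version IDs, and the serialization of version IDs as comma-separated quoted strings. It only builds the request text and does not contact a server.

```python
def format_version_ids(ids):
    """Serialize version IDs as comma-separated quoted strings.

    This value format is an assumption made for this sketch.
    """
    quoted = []
    for version_id in ids:
        # Escape backslashes and double quotes inside each ID.
        escaped = version_id.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return ", ".join(quoted)


def build_history_request(path, host, version, parents):
    """Build the text of an HTTP/1.1 GET that names both Version and Parents.

    `version` and `parents` are lists of hypothetical version IDs.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Version: {format_version_ids(version)}",
        f"Parents: {format_version_ids(parents)}",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines)


# Hypothetical case: a peer that already holds v2 and v3 needs the
# history leading up to v5 before it can merge an edit.
request_text = build_history_request("/chat", "example.org", ["v5"], ["v2", "v3"])
print(request_text)
```

A peer that keeps little or no history could send a request of this shape to an archival peer whenever an incoming edit refers to versions it no longer stores.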
>
> *== Signatures & Validation ==*
>
> This is out of scope for this proposal on versions. However, (a) there
> are some Version-Types that double as signatures. When this happens,
> it can be specified by authoring a Version-Type spec to articulate the
> new constraint. And (b) this is a generally important area of work
> that I encourage.
>
> Cheers!
>
> Michael
>
> On 7/22/24 11:44 AM, Michael Toomim wrote:
>>
>> We've got divergent discussion threads that I'm merging together.
>>
>> First, Peter Van Hardenberg (of Ink & Switch, Local-First, and
>> Automerge) wrote this initial review of the draft. He's cc'd, and we
>> can respond in this thread.
>>
>> ------------------------
>> -- Peter Van Hardenberg: --
>> ------------------------
>>
>> Hi Michael,
>>
>> I had a quick look at the spec and gave some thought to whether we'd
>> want to adopt it. I think right now it has quite a lot of per-version
>> overhead, and viewing this through a local-first lens, one can
>> imagine having to publish a large number of versions each as separate
>> PUT calls. You might want to consider supporting ranges for PUT in a
>> single message.
>>
>> Overall, our goals appear to differ from what you're proposing here
>> so this feedback may not be particularly important. My sense is that
>> the expected granularity of changes for Braid is relatively large and
>> that the frequency is relatively long -- on par with a changed HTML
>> form submission, perhaps. We spend quite a lot of our time thinking
>> about optimizing updates for potentially thousands of edits and
>> trying to minimize the number of round trips required to synchronize
>> state in both directions. You mention that the design intends to be
>> optimizable but I didn't see much in the text that clarified how.
>>
>> One other observation is that this spec does not appear to prioritize
>> retention of history:
>> > - If the Parents header is absent, the server SHOULD return a
>> >   single response, containing the requested version of the resource
>> >   in its body, with the Version response header set to the same
>> >   version.
>> This design may centralize the system, as clients default to
>> receiving "flattened" versions of resources and thus may not be able
>> to merge changes from other sources.
>>
>> Last, have you considered specifying some kind of signature /
>> validation feature? If clients are applying patches iteratively, it
>> might help for them to be able to validate that they're in the
>> expected state either before or after applying a patch.
>>
>> All the best,
>> -p
>>
>> On 7/15/24 6:26 PM, Michael Toomim wrote:
>>>
>>> Hi everyone in HTTP!
>>>
>>> Last fall we solicited feedback on the Braid State Synchronization
>>> proposal [draft
>>> <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-04>,
>>> slides
>>> <https://datatracker.ietf.org/meeting/118/materials/slides-118-httpbis-braid-http-add-synchronization-to-http-00>],
>>> which I'd summarize as:
>>>
>>>     "We're enthusiastic about the general work, but the proposal is
>>>     too high-level. Break the spec up into multiple independent
>>>     specs, and work bottom-up. Focus on concrete 'bits-on-the-wire'."
>>>
>>> So I'm breaking the spec up, and have drafted up the first chunk for
>>> you.
>>> I would very much like your review on:
>>>
>>>     *Versioning of HTTP Resources*
>>>     draft-toomim-httpbis-versions
>>>     https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-00
>>>
>>> Versioning is necessary for state synchronization, and occurs in a
>>> range of HTTP systems:
>>>
>>>   * Caching
>>>   * Archiving
>>>   * Version Control
>>>   * Collaborative Editing
>>>
>>> Today, HTTP has resource versions in the Last-Modified and ETag
>>> headers, and sometimes embeds versions in URLs, like with WebDAV.
>>> Each of these options serves some needs, but also has specific
>>> limitations. An improved general approach is proposed, which
>>> provides new features that could enable cool new applications, such
>>> as incrementally-updated RSS feeds, and could simplify existing
>>> specifications, such as resumable uploads and history compression
>>> in OT/CRDT algorithms.
>>>
>>> I would love to know if people find this work interesting. I think
>>> we could improve performance, interoperability, and be one step
>>> closer to having Google Docs power within HTTP URLs.
>>>
>>> Michael
>>>
>>> -------- Forwarded Message --------
>>> Subject: New Version Notification for draft-toomim-httpbis-versions-00.txt
>>> Date: Mon, 08 Jul 2024 11:02:11 -0700
>>> From: internet-drafts@ietf.org
>>> To: Michael Toomim <toomim@gmail.com>
>>>
>>> A new version of Internet-Draft draft-toomim-httpbis-versions-00.txt
>>> has been successfully submitted by Michael Toomim and posted to the
>>> IETF repository.
>>>
>>> Name:     draft-toomim-httpbis-versions
>>> Revision: 00
>>> Title:    HTTP Resource Versioning
>>> Date:     2024-07-08
>>> Group:    Individual Submission
>>> Pages:    19
>>> URL:      https://www.ietf.org/archive/id/draft-toomim-httpbis-versions-00.txt
>>> Status:   https://datatracker.ietf.org/doc/draft-toomim-httpbis-versions/
>>> HTMLized: https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions
>>>
>>> Abstract:
>>>
>>> HTTP resources change over time. Each change to a resource creates a
>>> new "version" of its state. HTTP systems often need a way to
>>> identify, read, write, navigate, and/or merge these versions, in
>>> order to implement cache consistency, create history archives, settle
>>> race conditions, request incremental updates to resources, interpret
>>> incremental updates to versions, or implement distributed
>>> collaborative editing algorithms.
>>>
>>> This document analyzes existing methods of versioning in HTTP,
>>> highlights limitations, and sketches a more general versioning
>>> approach that can enable new use-cases for HTTP.
>>>
>>> The IETF Secretariat
Received on Monday, 22 July 2024 23:49:45 UTC