- From: Michael Toomim <toomim@gmail.com>
- Date: Thu, 25 Jul 2024 03:23:56 -0700
- To: HTTP Working Group <ietf-http-wg@w3.org>, Braid <braid-http@googlegroups.com>, Peter van Hardenberg <pvh@pvh.ca>, Martin Kleppmann <martin@kleppmann.com>
- Message-ID: <f04e6822-e49a-430f-a605-6547f20b96d6@gmail.com>
Peter and Martin,

I've hit "publish" on the explanation for how to compress history with
Version-Types:

https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions#section-5.1

You can simply review these sections (5.1 and 5.2) instead of the long
list of links below. Does this address your concerns?

Thanks,

Michael

On 7/22/24 4:49 PM, Michael Toomim wrote:
>
> Peter, I just wrote up an explicit example of how to compress four
> PUTs into 7 bytes. Check out the new section 5.1 here:
>
> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L945
>
> These four puts compress down to 0.0146% of their original size, at
> least in theory. Note that said compression scheme isn't fully
> specified in this draft; the focus of this draft is just to gather
> interest in working on a versioning system that makes such compression
> possible. The actual compression schemes would be future work.
>
> On 7/22/24 12:41 PM, Michael Toomim wrote:
>>
>> Peter, thank you for your interest! I'm excited that you are bringing
>> up performance for discussion! There's a lot to say on that, and I
>> give an overview below:
>>
>> *== Compression & Performance ==*
>>
>> First, let me correct a big misinterpretation: this work absolutely
>> prioritizes *high-performance*, *realtime* data synchronization. It
>> should support thousands of mutations per second. Our implementations
>> are higher-performance <https://josephg.com/blog/crdts-go-brrr/> than
>> Automerge, for instance. I regularly work today with a doc composed
>> of 110,000 edits. It loads instantly, thanks to some great
>> Version-Types we've designed.
>>
>> The Version-Type (in the proposed Version-Type header) is the way you
>> get performance increases. The key to performance is managing history
>> growth. You manage that by finding a pattern in history, and then
>> compressing or ignoring history. You can express those patterns as a
>> Version-Type spec.
>> (There's a robust theory behind this called Time Machines.)
>>
>> I apologize that this wasn't clear in the draft -00. I thought this
>> would be an advanced feature that people wouldn't comment on for a
>> bit, but am pleasantly surprised to hear your interest in it! I will
>> be adding more clarity to the spec on Version-Types, and already have
>> begun doing so in github:
>>
>> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L885
>>
>> I'd also encourage you to check out this sketch of how to bake RLE
>> into HTTP Header Compression:
>>
>> https://braid.org/meeting-69/header-compression
>> https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-69.mp4#4166
>>
>> In any case, keep in mind that at this stage, we need to know only
>> whether there is /interest/ in this area of work, not whether this
>> particular spec meets your needs. If we adopt this work into the HTTP
>> WG, we will get a chance to change or rewrite any part of the spec.
>> This spec is just a starting point to get discussion going. So think
>> of this as a problem statement rather than a solution statement.
>>
>> *== PUTs ==*
>>
>> As for PUTs, I suspect you might be thinking about HTTP/1.0 where
>> each PUT might require a new TCP connection with its own TLS
>> handshake. But keep in mind that with HTTP/2 and 3, all HTTP
>> semantics are expressed in binary, and a PUT is usually just a single
>> packet! This is just as efficient as any hand-rolled protocol you
>> have, and it has the advantage of being interoperable with all of HTTP.
>>
>> *== History Retention ==*
>>
>> This versioning model supports Time Machines
>> <https://braid.org/time-machines>, the beauty of which is that peers
>> become free to independently choose how much history to store. An
>> archival peer can store the full history.
>> A light client can store just the latest version (see the amazing
>> Simpleton <https://braid.org/simpleton> client, which needs zero
>> history).
>>
>> So each peer can choose how much history to store. If a peer doesn't
>> have enough history to merge an edit, it can simply request that
>> history from another peer. In this draft, you do so by requesting a
>> GET with both Version and Parents headers specified.
>>
>> *== Signatures & Validation ==*
>>
>> This is out of scope for this proposal on versions. However, (a)
>> there are some Version-Types that double as signatures. When this
>> happens, it can be specified by authoring a Version-Type spec to
>> articulate the new constraint. And (b) this is a generally important
>> area of work that I encourage.
>>
>> Cheers!
>>
>> Michael
>>
>> On 7/22/24 11:44 AM, Michael Toomim wrote:
>>>
>>> We've got divergent discussion threads that I'm merging together.
>>>
>>> First, Peter Van Hardenberg (of Ink & Switch, Local-First, and
>>> Automerge) wrote this initial review of the draft. He's cc'd, and we
>>> can respond in this thread.
>>>
>>> ------------------------
>>> -- Peter Van Hardenberg: --
>>> ------------------------
>>>
>>> Hi Michael,
>>>
>>> I had a quick look at the spec and gave some thought to whether we'd
>>> want to adopt it. I think right now it has quite a lot of
>>> per-version overhead, and viewing this through a local-first lens,
>>> one can imagine having to publish a large number of versions each as
>>> separate PUT calls. You might want to consider supporting ranges for
>>> PUT in a single message.
>>>
>>> Overall, our goals appear to differ from what you're proposing here
>>> so this feedback may not be particularly important. My sense is that
>>> the expected granularity of changes for Braid is relatively
>>> large and that the frequency is relatively long -- on par with a
>>> changed HTML form submission, perhaps.
>>> We spend quite a lot of our time thinking about optimizing updates
>>> for potentially thousands of edits and trying to minimize the number
>>> of round trips required to synchronize state in both directions. You
>>> mention that the design intends to be optimizable but I didn't see
>>> much in the text that clarified how.
>>>
>>> One other observation is that this spec does not appear to
>>> prioritize retention of history:
>>> > - If the Parents header is absent, the server SHOULD return a
>>> > single response, containing the requested version of the resource
>>> > in its body, with the Version response header set to the same
>>> > version.
>>> This design may centralize the system, as clients default to
>>> receiving "flattened" versions of resources and thus may not be able
>>> to merge changes from other sources.
>>>
>>> Last, have you considered specifying some kind of signature /
>>> validation feature? If clients are applying patches iteratively, it
>>> might help for them to be able to validate that they're in the
>>> expected state either before or after applying a patch.
>>>
>>> All the best,
>>> -p
>>>
>>> On 7/15/24 6:26 PM, Michael Toomim wrote:
>>>>
>>>> Hi everyone in HTTP!
>>>>
>>>> Last fall we solicited feedback on the Braid State Synchronization
>>>> proposal [draft
>>>> <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-04>,
>>>> slides
>>>> <https://datatracker.ietf.org/meeting/118/materials/slides-118-httpbis-braid-http-add-synchronization-to-http-00>],
>>>> which I'd summarize as:
>>>>
>>>> "We're enthusiastic about the general work, but the proposal is
>>>> too high-level. Break the spec up into multiple independent
>>>> specs, and work bottom-up. Focus on concrete 'bits-on-the-wire'."
>>>>
>>>> So I'm breaking the spec up, and have drafted up the first chunk
>>>> for you.
>>>> I would very much like your review on:
>>>>
>>>> *Versioning of HTTP Resources*
>>>> draft-toomim-httpbis-versions
>>>> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-00
>>>>
>>>> Versioning is necessary for state synchronization, and occurs in a
>>>> range of HTTP systems:
>>>>
>>>> * Caching
>>>> * Archiving
>>>> * Version Control
>>>> * Collaborative Editing
>>>>
>>>> Today, HTTP has resource versions in the Last-Modified and ETag
>>>> headers, and sometimes embeds versions in URLs, like with WebDAV.
>>>> Each of these options serves some needs, but also has specific
>>>> limitations. An improved general approach is proposed, which
>>>> provides new features, that could enable cool new applications,
>>>> such as incrementally-updated RSS feeds, and could simplify
>>>> existing specifications, such as resumable uploads, and history
>>>> compression in OT/CRDT algorithms.
>>>>
>>>> I would love to know if people find this work interesting. I think
>>>> we could improve performance, interoperability, and be one step
>>>> closer to having Google Docs power within HTTP URLs.
>>>>
>>>> Michael
>>>>
>>>> -------- Forwarded Message --------
>>>> Subject: New Version Notification for
>>>> draft-toomim-httpbis-versions-00.txt
>>>> Date: Mon, 08 Jul 2024 11:02:11 -0700
>>>> From: internet-drafts@ietf.org
>>>> To: Michael Toomim <toomim@gmail.com>
>>>>
>>>>
>>>>
>>>> A new version of Internet-Draft
>>>> draft-toomim-httpbis-versions-00.txt has been
>>>> successfully submitted by Michael Toomim and posted to the
>>>> IETF repository.
>>>>
>>>> Name: draft-toomim-httpbis-versions
>>>> Revision: 00
>>>> Title: HTTP Resource Versioning
>>>> Date: 2024-07-08
>>>> Group: Individual Submission
>>>> Pages: 19
>>>> URL:
>>>> https://www.ietf.org/archive/id/draft-toomim-httpbis-versions-00.txt
>>>> Status: https://datatracker.ietf.org/doc/draft-toomim-httpbis-versions/
>>>> HTMLized:
>>>> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions
>>>>
>>>>
>>>> Abstract:
>>>>
>>>> HTTP resources change over time. Each change to a resource creates a
>>>> new "version" of its state. HTTP systems often need a way to
>>>> identify, read, write, navigate, and/or merge these versions, in
>>>> order to implement cache consistency, create history archives, settle
>>>> race conditions, request incremental updates to resources, interpret
>>>> incremental updates to versions, or implement distributed
>>>> collaborative editing algorithms.
>>>>
>>>> This document analyzes existing methods of versioning in HTTP,
>>>> highlights limitations, and sketches a more general versioning
>>>> approach that can enable new use-cases for HTTP.
>>>>
>>>>
>>>>
>>>> The IETF Secretariat
>>>>
>>>>
Received on Thursday, 25 July 2024 10:24:04 UTC