- From: Michael Toomim <toomim@gmail.com>
- Date: Thu, 25 Jul 2024 03:23:56 -0700
- To: HTTP Working Group <ietf-http-wg@w3.org>, Braid <braid-http@googlegroups.com>, Peter van Hardenberg <pvh@pvh.ca>, Martin Kleppmann <martin@kleppmann.com>
- Message-ID: <f04e6822-e49a-430f-a605-6547f20b96d6@gmail.com>
Peter and Martin,
I've hit "publish" on the explanation for how to compress history with
Version-Types:
https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions#section-5.1
You can simply review these sections (5.1 and 5.2) instead of the long
list of links below. Does this address your concerns?
Thanks,
Michael
On 7/22/24 4:49 PM, Michael Toomim wrote:
>
> Peter, I just wrote up an explicit example of how to compress four
> PUTs into 7 bytes. Check out the new section 5.1 here:
>
> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L945
>
> These four puts compress down to 0.0146% of their original size, at
> least in theory. Note that said compression scheme isn't fully
> specified in this draft — the focus of this draft is just to gather
> interest in working on a versioning system that makes such compression
> possible. The actual compression schemes would be future work.
>
> On 7/22/24 12:41 PM, Michael Toomim wrote:
>>
>> Peter, thank you for your interest! I'm excited that you are bringing
>> up performance for discussion! There's a lot to say on that, and I
>> give an overview below:
>>
>> *== Compression & Performance ==*
>>
>> First, let me correct a big misinterpretation: this work absolutely
>> prioritizes *high-performance*, *realtime* data synchronization. It
>> should support thousands of mutations per second. Our implementations
>> are higher-performance <https://josephg.com/blog/crdts-go-brrr/> than
>> Automerge, for instance. I regularly work today with a doc composed
>> of 110,000 edits. It loads instantly, thanks to some great
>> Version-Types we've designed.
>>
>> The Version-Type (in the proposed Version-Type header) is the way you
>> get performance increases. The key to performance is managing history
>> growth. You manage that by finding a pattern in history, and then
>> compressing or ignoring history. You can express those patterns as a
>> Version-Type spec. (There's a robust theory behind this called Time
>> Machines.)
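
To make "finding a pattern in history" concrete, here is a minimal Python sketch. It assumes a hypothetical Version-Type in which version IDs have the form agent-sequence (for example alice-1, alice-2) and consecutive IDs from one agent form a linear chain. The ID format and the names compress_runs and expand_runs are illustrative assumptions, not the scheme defined in the draft.

```python
def compress_runs(versions):
    """Collapse consecutive IDs such as 'alice-1', 'alice-2' into
    (agent, first_seq, count) runs.

    Assumption: every ID is '<agent>-<integer>'. This format is
    hypothetical and is not taken from the draft.
    """
    runs = []
    for version in versions:
        agent, _, seq_text = version.rpartition("-")
        seq = int(seq_text)
        if runs and runs[-1][0] == agent and runs[-1][1] + runs[-1][2] == seq:
            # This ID continues the current run, so only the count grows.
            last_agent, first, count = runs[-1]
            runs[-1] = (last_agent, first, count + 1)
        else:
            # A new agent or a gap in the sequence starts a new run.
            runs.append((agent, seq, 1))
    return runs


def expand_runs(runs):
    """Inverse of compress_runs: rebuild the full list of version IDs."""
    return [
        f"{agent}-{first + offset}"
        for agent, first, count in runs
        for offset in range(count)
    ]
```

With this sketch, a history of any length written by a single agent is stored as one run, which is the kind of saving a pattern in history can provide.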
>>
>> I apologize that this wasn't clear in the draft -00. I thought this
>> would be an advanced feature that people wouldn't comment on for a
>> bit — but am pleasantly surprised to hear your interest in it! I will
>> be adding more clarity to the spec on Version-Types, and already have
>> begun doing so on GitHub:
>>
>> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L885
>>
>> I'd also encourage you to check out this sketch of how to bake RLE
>> into HTTP Header Compression:
>>
>> https://braid.org/meeting-69/header-compression
>> https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-69.mp4#4166
>>
>> In any case, keep in mind that at this stage, we need to know only
>> whether there is /interest/ in this area of work — not whether this
>> particular spec meets your needs. If we adopt this work into the HTTP
>> WG, we will get a chance to change or rewrite any part of the spec.
>> This spec is just a starting point to get discussion going. So think
>> of this as a problem statement rather than a solution statement.
>>
>> *== PUTs ==*
>>
>> As for PUTs, I suspect you might be thinking about HTTP/1.0 where
>> each PUT might require a new TCP connection with its own TLS
>> handshake. But keep in mind that with HTTP/2 and HTTP/3, all HTTP
>> semantics are expressed in binary, and a PUT is usually just a single
>> packet! This is just as efficient as any hand-rolled protocol you
>> have, and it has the advantage of being interoperable with all of HTTP.
>>
>> *== History Retention ==*
>>
>> This versioning model supports Time Machines
>> <https://braid.org/time-machines>, the beauty of which is that peers
>> become free to independently choose how much history to store. An
>> archival peer can store the full history. A light client can store
>> just the latest version (see the amazing Simpleton
>> <https://braid.org/simpleton> client, which needs zero history).
>>
>> So each peer can choose how much history to store. If a peer doesn't
>> have enough history to merge an edit, it can simply request that
>> history from another peer. In this draft, you do so by requesting a
>> GET with both Version and Parents headers specified.
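
As a rough illustration of such a request, the Python sketch below builds (but does not send) a GET carrying both headers. It assumes that version IDs are sent as quoted strings and that multiple parents are comma-separated; that formatting, the example URL, and the names quote_version and history_request are assumptions for illustration, so check the draft for the exact syntax.

```python
import urllib.request


def quote_version(version_id):
    """Wrap a version ID in double quotes, escaping backslashes and quotes.

    Assumption: IDs travel as quoted strings; verify against the draft.
    """
    escaped = version_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def history_request(url, version, parents):
    """Build a GET for the history leading from `parents` up to `version`."""
    headers = {
        "Version": quote_version(version),
        "Parents": ", ".join(quote_version(p) for p in parents),
    }
    # Only constructs the request object; nothing goes over the network.
    return urllib.request.Request(url, headers=headers, method="GET")
```

For example, history_request("https://example.org/doc", "v3", ["v1", "v2"]) yields a request whose Version header is "v3" and whose Parents header is "v1", "v2".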
>>
>> *== Signatures & Validation ==*
>>
>> This is out of scope for this proposal on versions. However, (a)
>> there are some Version-Types that double as signatures. When this
>> happens, it can be specified by authoring a Version-Type spec to
>> articulate the new constraint. And (b) this is a generally important
>> area of work that I encourage.
>>
>> Cheers!
>>
>> Michael
>>
>> On 7/22/24 11:44 AM, Michael Toomim wrote:
>>>
>>> We've got divergent discussion threads that I'm merging together.
>>>
>>> First, Peter Van Hardenberg (of Ink & Switch, Local-First, and
>>> Automerge) wrote this initial review of the draft. He's cc'd, and we
>>> can respond in this thread.
>>>
>>> ------------------------
>>> -- Peter Van Hardenberg: --
>>> ------------------------
>>>
>>> Hi Michael,
>>>
>>> I had a quick look at the spec and gave some thought to whether we'd
>>> want to adopt it. I think right now it has quite a lot of
>>> per-version overhead, and viewing this through a local-first lens,
>>> one can imagine having to publish a large number of versions each as
>>> separate PUT calls. You might want to consider supporting ranges for
>>> PUT in a single message.
>>>
>>> Overall, our goals appear to differ from what you're proposing here
>>> so this feedback may not be particularly important. My sense is that
>>> the expected granularity of changes for Braid is relatively
>>> large and that the frequency is relatively low -- on par with a
>>> changed HTML form submission, perhaps. We spend quite a lot of our
>>> time thinking about optimizing updates for potentially thousands of
>>> edits and trying to minimize the number of round trips required to
>>> synchronize state in both directions. You mention that the design
>>> intends to be optimizable but I didn't see much in the text that
>>> clarified how.
>>>
>>> One other observation is that this spec does not appear to
>>> prioritize retention of history:
>>> > - If the Parents header is absent, the server SHOULD return a
>>> > single response, containing the requested version of the resource
>>> > in its body, with the Version response header set to the same
>>> > version.
>>> This design may centralize the system, as clients default to
>>> receiving "flattened" versions of resources and thus may not be able
>>> to merge changes from other sources.
>>>
>>> Last, have you considered specifying some kind of signature /
>>> validation feature? If clients are applying patches iteratively, it
>>> might help for them to be able to validate that they're in the
>>> expected state either before or after applying a patch.
>>>
>>> All the best,
>>> -p
>>>
>>> On 7/15/24 6:26 PM, Michael Toomim wrote:
>>>>
>>>> Hi everyone in HTTP!
>>>>
>>>> Last fall we solicited feedback on the Braid State Synchronization
>>>> proposal [draft
>>>> <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-04>,
>>>> slides
>>>> <https://datatracker.ietf.org/meeting/118/materials/slides-118-httpbis-braid-http-add-synchronization-to-http-00>],
>>>> which I'd summarize as:
>>>>
>>>> "We're enthusiastic about the general work, but the proposal is
>>>> too high-level. Break the spec up into multiple independent
>>>> specs, and work bottom-up. Focus on concrete 'bits-on-the-wire'."
>>>>
>>>> So I'm breaking the spec up, and have drafted up the first chunk
>>>> for you. I would very much like your review on:
>>>>
>>>> *Versioning of HTTP Resources*
>>>> draft-toomim-httpbis-versions
>>>> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-00
>>>>
>>>> Versioning is necessary for state synchronization—and occurs in a
>>>> range of HTTP systems:
>>>>
>>>> * Caching
>>>> * Archiving
>>>> * Version Control
>>>> * Collaborative Editing
>>>>
>>>> Today, HTTP has resource versions in the Last-Modified and ETag
>>>> headers, and sometimes embeds versions in URLs, like with WebDAV.
>>>> Each of these options serves some needs, but also has specific
>>>> limitations. This draft proposes a more general approach that
>>>> provides new features. It could enable new applications, such as
>>>> incrementally-updated RSS feeds, and could simplify existing
>>>> specifications, such as resumable uploads and history compression
>>>> in OT/CRDT algorithms.
>>>>
>>>> I would love to know if people find this work interesting. I think
>>>> we could improve performance and interoperability, and move one step
>>>> closer to having Google Docs power within HTTP URLs.
>>>>
>>>> Michael
>>>>
>>>> -------- Forwarded Message --------
>>>> Subject: New Version Notification for
>>>> draft-toomim-httpbis-versions-00.txt
>>>> Date: Mon, 08 Jul 2024 11:02:11 -0700
>>>> From: internet-drafts@ietf.org
>>>> To: Michael Toomim <toomim@gmail.com>
>>>>
>>>>
>>>>
>>>> A new version of Internet-Draft
>>>> draft-toomim-httpbis-versions-00.txt has been
>>>> successfully submitted by Michael Toomim and posted to the
>>>> IETF repository.
>>>>
>>>> Name: draft-toomim-httpbis-versions
>>>> Revision: 00
>>>> Title: HTTP Resource Versioning
>>>> Date: 2024-07-08
>>>> Group: Individual Submission
>>>> Pages: 19
>>>> URL:
>>>> https://www.ietf.org/archive/id/draft-toomim-httpbis-versions-00.txt
>>>> Status: https://datatracker.ietf.org/doc/draft-toomim-httpbis-versions/
>>>> HTMLized:
>>>> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions
>>>>
>>>>
>>>> Abstract:
>>>>
>>>> HTTP resources change over time. Each change to a resource creates a
>>>> new "version" of its state. HTTP systems often need a way to
>>>> identify, read, write, navigate, and/or merge these versions, in
>>>> order to implement cache consistency, create history archives, settle
>>>> race conditions, request incremental updates to resources, interpret
>>>> incremental updates to versions, or implement distributed
>>>> collaborative editing algorithms.
>>>>
>>>> This document analyzes existing methods of versioning in HTTP,
>>>> highlights limitations, and sketches a more general versioning
>>>> approach that can enable new use-cases for HTTP.
>>>>
>>>>
>>>>
>>>> The IETF Secretariat
>>>>
>>>>
Received on Thursday, 25 July 2024 10:24:04 UTC