- From: Michael Toomim <toomim@gmail.com>
- Date: Mon, 22 Jul 2024 16:49:37 -0700
- To: HTTP Working Group <ietf-http-wg@w3.org>, Braid <braid-http@googlegroups.com>, Peter van Hardenberg <pvh@pvh.ca>
- Message-ID: <ba9bd07d-b648-4afc-8c78-4ec05d2e1797@gmail.com>
Peter, I just wrote up an explicit example of how to compress four PUTs
into 7 bytes. Check out the new section 5.1 here:
https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L945
These four PUTs compress down to 0.0146% of their original size, at
least in theory. Note that the compression scheme isn't fully specified
in this draft; the draft's focus is just to gather interest in working
on a versioning system that makes such compression possible. The actual
compression schemes would be future work.
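
To give a rough flavor of why predictable version IDs compress so well,
here is a minimal sketch. It is not the scheme from section 5.1. It
assumes a hypothetical version format of `agent-counter` (for example
`alice-1`, `alice-2`), where consecutive edits by one agent increment
the counter by one. The function names and the run tuple layout are
illustrative only.

```python
from typing import List, Tuple

# A run is (agent, first_counter, how_many_consecutive_versions).
Run = Tuple[str, int, int]


def compress_versions(versions: List[str]) -> List[Run]:
    """Collapse consecutive IDs such as 'alice-1', 'alice-2' into runs.

    Assumes the hypothetical 'agent-counter' version format.
    """
    runs: List[Run] = []
    for version in versions:
        agent, _, counter_text = version.rpartition("-")
        counter = int(counter_text)
        if runs:
            last_agent, start, count = runs[-1]
            # Extend the current run only if this version is the next
            # counter value from the same agent.
            if last_agent == agent and start + count == counter:
                runs[-1] = (last_agent, start, count + 1)
                continue
        runs.append((agent, counter, 1))
    return runs


def expand_versions(runs: List[Run]) -> List[str]:
    """Inverse of compress_versions: rebuild the full list of IDs."""
    return [
        f"{agent}-{start + offset}"
        for agent, start, count in runs
        for offset in range(count)
    ]


four_puts = ["alice-1", "alice-2", "alice-3", "alice-4"]
runs = compress_versions(four_puts)
print(runs)  # [('alice', 1, 4)]
```

Four version IDs collapse into a single run here, and the run expands
back to the original list without loss. A real scheme would also need to
cover the Parents header and the patch bodies, which this sketch ignores.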
On 7/22/24 12:41 PM, Michael Toomim wrote:
>
> Peter, thank you for your interest! I'm excited that you are bringing
> up performance for discussion! There's a lot to say on that, and I
> give an overview below:
>
> *== Compression & Performance ==*
>
> First, let me correct a big misinterpretation: this work absolutely
> prioritizes *high-performance*, *realtime* data synchronization. It
> should support thousands of mutations per second. Our implementations
> are higher-performance <https://josephg.com/blog/crdts-go-brrr/> than
> Automerge, for instance. I regularly work today with a doc composed of
> 110,000 edits. It loads instantly, thanks to some great Version-Types
> we've designed.
>
> The Version-Type (in the proposed Version-Type header) is the way you
> get performance increases. The key to performance is managing history
> growth. You manage that by finding a pattern in history, and then
> compressing or ignoring history. You can express those patterns as a
> Version-Type spec. (There's a robust theory behind this called Time
> Machines.)
>
> I apologize that this wasn't clear in draft -00. I thought this
> would be an advanced feature that people wouldn't comment on for a
> bit, but I am pleasantly surprised to hear your interest in it! I will
> be adding more clarity to the spec on Version-Types, and have already
> begun doing so on GitHub:
>
> https://github.com/braid-org/braid-spec/blob/master/draft-toomim-httpbis-versions-01.txt#L885
>
> I'd also encourage you to check out this sketch of how to bake RLE
> into HTTP Header Compression:
>
> https://braid.org/meeting-69/header-compression
> https://braid.org/video/https://invisiblecollege.s3.us-west-1.amazonaws.com/braid-meeting-69.mp4#4166
>
> In any case, keep in mind that at this stage, we need to know only
> whether there is /interest/ in this area of work — not whether this
> particular spec meets your needs. If we adopt this work into the HTTP
> WG, we will get a chance to change or rewrite any part of the spec.
> This spec is just a starting point to get discussion going. So think
> of this as a problem statement rather than a solution statement.
>
> *== PUTs ==*
>
> As for PUTs, I suspect you might be thinking about HTTP/1.0 where each
> PUT might require a new TCP connection with its own TLS handshake. But
> keep in mind that with HTTP/2 and 3, all HTTP semantics are expressed
> in binary, and a PUT is usually just a single packet! This is just as
> efficient as any hand-rolled protocol you have, and it has the
> advantage of being interoperable with all of HTTP.
>
> *== History Retention ==*
>
> This versioning model supports Time Machines
> <https://braid.org/time-machines>— the beauty of which is that peers
> become free to independently choose how much history to store. An
> archival peer can store the full history. A light client can store
> just the latest version (see the amazing Simpleton
> <https://braid.org/simpleton> client, which needs zero history).
>
> So each peer can choose how much history to store. If a peer doesn't
> have enough history to merge an edit, it can simply request that
> history from another peer. In this draft, you do so by issuing a GET
> request with both the Version and Parents headers specified.
>
> *== Signatures & Validation ==*
>
> This is out of scope for this proposal on versions. However, (a) there
> are some Version-Types that double as signatures. When this happens,
> it can be specified by authoring a Version-Type spec to articulate the
> new constraint. And (b) this is a generally important area of work
> that I encourage.
>
> Cheers!
>
> Michael
>
> On 7/22/24 11:44 AM, Michael Toomim wrote:
>>
>> We've got divergent discussion threads that I'm merging together.
>>
>> First, Peter van Hardenberg (of Ink & Switch, Local-First, and
>> Automerge) wrote this initial review of the draft. He's cc'd, and we
>> can respond in this thread.
>>
>> ------------------------
>> -- Peter van Hardenberg: --
>> ------------------------
>>
>> Hi Michael,
>>
>> I had a quick look at the spec and gave some thought to whether we'd
>> want to adopt it. I think right now it has quite a lot of per-version
>> overhead, and viewing this through a local-first lens, one can
>> imagine having to publish a large number of versions each as separate
>> PUT calls. You might want to consider supporting ranges for PUT in a
>> single message.
>>
>> Overall, our goals appear to differ from what you're proposing here
>> so this feedback may not be particularly important. My sense is that
>> the expected granularity of changes for Braid is relatively large and
>> that the frequency is relatively low -- on par with a changed HTML
>> form submission, perhaps. We spend quite a lot of our time thinking
>> about optimizing updates for potentially thousands of edits and
>> trying to minimize the number of round trips required to synchronize
>> state in both directions. You mention that the design intends to be
>> optimizable but I didn't see much in the text that clarified how.
>>
>> One other observation is that this spec does not appear to prioritize
>> retention of history:
>> > - If the Parents header is absent, the server SHOULD return a
>> > single response, containing the requested version of the resource
>> > in its body, with the Version response header set to the same
>> > version.
>> This design may centralize the system, as clients default to
>> receiving "flattened" versions of resources and thus may not be able
>> to merge changes from other sources.
>>
>> Last, have you considered specifying some kind of signature /
>> validation feature? If clients are applying patches iteratively, it
>> might help for them to be able to validate that they're in the
>> expected state either before or after applying a patch.
>>
>> All the best,
>> -p
>>
>> On 7/15/24 6:26 PM, Michael Toomim wrote:
>>>
>>> Hi everyone in HTTP!
>>>
>>> Last fall we solicited feedback on the Braid State Synchronization
>>> proposal [draft
>>> <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http-04>,
>>> slides
>>> <https://datatracker.ietf.org/meeting/118/materials/slides-118-httpbis-braid-http-add-synchronization-to-http-00>],
>>> which I'd summarize as:
>>>
>>> "We're enthusiastic about the general work, but the proposal is
>>> too high-level. Break the spec up into multiple independent
>>> specs, and work bottom-up. Focus on concrete 'bits-on-the-wire'."
>>>
>>> So I'm breaking the spec up, and have drafted up the first chunk for
>>> you. I would very much like your review on:
>>>
>>> *Versioning of HTTP Resources*
>>> draft-toomim-httpbis-versions
>>> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-00
>>>
>>> Versioning is necessary for state synchronization—and occurs in a
>>> range of HTTP systems:
>>>
>>> * Caching
>>> * Archiving
>>> * Version Control
>>> * Collaborative Editing
>>>
>>> Today, HTTP has resource versions in the Last-Modified and ETag
>>> headers, and sometimes embeds versions in URLs, like with WebDAV.
>>> Each of these options serves some needs, but also has specific
>>> limitations. This draft proposes an improved general approach that
>>> provides new features, could enable cool new applications such as
>>> incrementally-updated RSS feeds, and could simplify existing
>>> specifications, such as resumable uploads and history compression
>>> in OT/CRDT algorithms.
>>>
>>> I would love to know if people find this work interesting. I think
>>> we could improve performance and interoperability, and get one step
>>> closer to having Google Docs power within HTTP URLs.
>>>
>>> Michael
>>>
>>> -------- Forwarded Message --------
>>> Subject: New Version Notification for
>>> draft-toomim-httpbis-versions-00.txt
>>> Date: Mon, 08 Jul 2024 11:02:11 -0700
>>> From: internet-drafts@ietf.org
>>> To: Michael Toomim <toomim@gmail.com>
>>>
>>>
>>>
>>> A new version of Internet-Draft draft-toomim-httpbis-versions-00.txt
>>> has been successfully submitted by Michael Toomim and posted to the
>>> IETF repository.
>>>
>>> Name: draft-toomim-httpbis-versions
>>> Revision: 00
>>> Title: HTTP Resource Versioning
>>> Date: 2024-07-08
>>> Group: Individual Submission
>>> Pages: 19
>>> URL:
>>> https://www.ietf.org/archive/id/draft-toomim-httpbis-versions-00.txt
>>> Status: https://datatracker.ietf.org/doc/draft-toomim-httpbis-versions/
>>> HTMLized:
>>> https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions
>>>
>>>
>>> Abstract:
>>>
>>> HTTP resources change over time. Each change to a resource creates a
>>> new "version" of its state. HTTP systems often need a way to
>>> identify, read, write, navigate, and/or merge these versions, in
>>> order to implement cache consistency, create history archives, settle
>>> race conditions, request incremental updates to resources, interpret
>>> incremental updates to versions, or implement distributed
>>> collaborative editing algorithms.
>>>
>>> This document analyzes existing methods of versioning in HTTP,
>>> highlights limitations, and sketches a more general versioning
>>> approach that can enable new use-cases for HTTP.
>>>
>>>
>>>
>>> The IETF Secretariat
>>>
>>>
Received on Monday, 22 July 2024 23:49:45 UTC