Re: What is a Version? from Michael Toomim on 2026-03-16 (ietf-http-wg@w3.org from January to March 2026)

From: Michael Toomim <toomim@gmail.com>
Date: Tue, 17 Mar 2026 07:25:32 +0800
To: ietf-http-wg@w3.org, Braid <braid-http@googlegroups.com>
Message-ID: <5c722a99-72bd-4cb3-afc6-82613f031587@gmail.com>

Thank you, everyone, for this great discussion!  Let me summarize what
I've heard, what I've learned, and show a way forward.

As Mike Bishop and Mnot emphasize, the perspectives of (1) and (2) are
not themselves in conflict, but, as Julian and Nico point out, they use
similar words for different meanings.  This can lead us to think we're
talking about one thing when there are actually two things.

I now see two separate concerns in "versioning":

  (1) This draft *timestamps existing representation data* in existing
      HTTP messages as they flow over the wire.  These timestamps help
      machines sync representations.  They are not seen by humans.  They
      are not exposed in the resource model, and do not impact URIs.

  (2) WebDAV, Memento, and Link Relations organize the *minting of new
      version resources* with new URIs.  They help humans and machines
      manage resource versions via these version resources.  They do
      nothing to timestamp existing representations as they flow over the
      wire in existing HTTP messages.

If we distinguish these two concerns, they actually fit together nicely.
But we need more precise language.

I realize that I created a bunch of confusion with the
terminology I chose at the root of draft-04.  Specifically, the
wording of the draft's title itself -- **HTTP Resource Versioning** --
creates confusion in three ways:

    1. The word *"Version"* can ambiguously mean "a point in time", or "a
       version of a resource at a point in time".  This draft is talking
       only about the "point in time", and the title could reduce
       ambiguity by just saying "Time" directly instead of "Version."

    2. The word *"Resource"* is incorrect.  It should say
       *"Representation"*, both in the title and throughout the draft.
       This draft says nothing normative about resources.  All it does is
       specify timestamps for representations as they are transferred.

       Thanks to Roy for making this clear, with: "No Web resources are
       ever transferred over HTTP.  Only a representation is
       transferred."  Thanks to Rahul for pointing out where RFC 9110
       defines that representations change in time, and that it has yet
       to define how to track the changes.

    3. Finally, the word *"Versioning"* implies a much larger scope than
       is necessary -- it brings to mind a plethora of things people do
       with versions, like bookmarking them, navigating them, and
       checking them in and out -- that are out of scope for this draft.
       We just need to synchronize representations between machines.

       Thank you Tim Bray, Carsten and Anders for making this clear.

I'm retitling this draft to something like "Time Semantics for HTTP
Message Representations."

There are basic semantics we need here.  Roy pointed out that there's no
way in HTTP today to say:

    "The representation data I just gave you, at time X, will always be
     the same bytes when anyone references version X of the
     representation."

ETags don't solve this, because they can change with a server reboot, or
across shards in a load balancer.  This also isn't about the
*conceptual* resource version -- your Facebook profile might not have
changed in years, but the HTML representation changes frequently as new
features, ads, and A/B tests are implemented on the platform.  To
implement data synchronization, we need to specify the
representation-level semantics for time.

A timestamped message for /doc could look like this:

     200 OK
     Content-Type: text/plain
     Content-Length: 15
     Content-Version: "foo-1"

     Hello Everyone!

The Content-Version header implies that the body of /doc will always be
these same bytes at "foo-1".
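
To illustrate what that guarantee buys a client, here is a minimal
sketch in Python.  It is not part of the draft: the `VersionedCache`
class and its method names are hypothetical, and it assumes the client
simply keys stored bodies by URL and Content-Version value.

```python
class VersionedCache:
    """Hypothetical client-side store keyed by (url, Content-Version)."""

    def __init__(self):
        self._bodies = {}

    def record(self, url, version, body):
        """Store a body; return True if this version was already known.

        Raises ValueError if the same version arrives with different
        bytes, since that would break the "always the same bytes" promise.
        """
        key = (url, version)
        seen = self._bodies.get(key)
        if seen is not None and seen != body:
            raise ValueError(f"{url} at {version} changed its bytes")
        self._bodies[key] = body
        return seen is not None

    def lookup(self, url, version):
        """Return the stored body for this version, or None if unknown."""
        return self._bodies.get((url, version))


cache = VersionedCache()
cache.record("/doc", '"foo-1"', b"Hello Everyone!")
```

Because the bytes at "foo-1" can never change, a body stored this way
never needs revalidation; it only needs to be superseded by a later
version.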

We can then build on that, by specifying:

   - A partial ordering of time.  This lets machines compare two
     timestamps and know if one came after the other, or if they occurred
     in parallel.

   - Semantics for GET, PUT, POST, PATCH, HEAD and DELETE requests and
     responses.  A client needs to be able to say "give me updates since
     version X", or "I'm writing against version Y."  A server can say
     "here is the representation at version Z, which descended from
     versions Y1 and Y2."

   - Extensibility in timestamps, so that implementations can pursue
     optimizations such as Lamport clocks, vector clocks, version
     vectors, bytestreams, and run-length encodings without breaking the
     basic semantics.

These are the basic definitions we need.
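
As a rough sketch of the partial ordering, suppose each version records
the versions it descended from, as in "version Z, which descended from
versions Y1 and Y2."  The Python below is an illustration only: the
`PARENTS` table, the version names, and the function names are all
hypothetical, and a real implementation could use any of the encodings
listed above instead of an explicit graph.

```python
# Hypothetical version graph: each version maps to its parent versions.
PARENTS = {
    "x": [],
    "y1": ["x"],
    "y2": ["x"],
    "z": ["y1", "y2"],
}


def ancestors(parents, version):
    """Return every version reachable by following parent links."""
    seen = set()
    stack = list(parents.get(version, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def compare(parents, a, b):
    """Classify two versions under the ancestry partial order.

    Versions missing from the table are treated as having no parents,
    which is an assumption of this sketch.
    """
    if a == b:
        return "equal"
    if a in ancestors(parents, b):
        return "before"      # a is an ancestor of b
    if b in ancestors(parents, a):
        return "after"       # a is a descendant of b
    return "concurrent"      # neither descends from the other
```

With this graph, `compare(PARENTS, "x", "z")` reports "before", while
`compare(PARENTS, "y1", "y2")` reports "concurrent": the two versions
occurred in parallel and were later merged into "z".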

Now plugging this into (2) is simple -- as Julian points out, a server
can mint a URI for a representation version at any time, however it
wants, and convey it in the Content-Location header:

     200 OK
     Content-Type: text/plain
     Content-Length: 15
     Content-Version: "foo-1"
     Content-Location: /doc?v=foo-1

     Hello Everyone!

I hope this makes clear why I don't think these basic timestamps need to
be URIs.  They identify points in time within the context of an HTTP
message stream for a known resource -- not across different contexts,
applications, or protocols.  The analogous Content-Range: `bytes 50-100`
value doesn't need a URI, and neither does the Content-Length: `15`
value, nor does a Content-Version point in time.  When an application
does want a URI for a version, it can put one into the Content-Location
header.
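
A server-side sketch of that choice, in Python, might look like the
following.  The `build_headers` function and its `mint_uri` flag are
hypothetical, and the `?v=` query form is just the URI shape from the
example above; a server can mint version URIs however it wants.

```python
from urllib.parse import quote


def build_headers(path, version, body, mint_uri=False):
    """Build response headers for a timestamped representation.

    The Content-Version value is always sent.  A version URI is minted
    and sent in Content-Location only when the application asks for one.
    """
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
        ("Content-Version", f'"{version}"'),
    ]
    if mint_uri:
        # Percent-encode the version so it is safe inside a query string.
        location = f"{path}?v={quote(version, safe='')}"
        headers.append(("Content-Location", location))
    return headers


plain = build_headers("/doc", "foo-1", b"Hello Everyone!")
with_uri = build_headers("/doc", "foo-1", b"Hello Everyone!", mint_uri=True)
```

The timestamp is the same in both cases; only the second response also
exposes a version resource with its own URI.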

Does the group recognize (1) and (2) as separate concerns?

I am excited by this clarity.  I would like to account for your thoughts
as I revise draft-05 now.  Please also find me in person in Shenzhen to
chat!

Thank you!
Michael
Received on Monday, 16 March 2026 23:25:45 UTC