Re: What is a Version?

On Mar 9, 2026, at 7:01 PM, Michael Toomim <toomim@gmail.com> wrote:
> 
> Hey all,
> 
> After surveying the different approaches to Versioning in HTTP for draft -04, I'm realizing there's an important design question for the group here:
> 
>          ===   Q. "What is a Version in HTTP?"  ===
> 
> Back in October 2024, Julian Reschke said "a version is a HTTP resource".  This definition implied that, for instance, if a version doesn't exist, we could return 404:
> 
> https://lists.w3.org/Archives/Public/ietf-http-wg/2024OctDec/0172.html
> 
> This is interesting.  Credit to Julian for recognizing the impact of this question, and suggesting we work to define how these basic concepts work in HTTP.
> 
> When I surveyed the existing versioning approaches in -04, I found two distinct models:
> 
> First, in WebDAV Versioning, Memento, and Link Relations, a version is itself a resource with its own URI:
> 
>   - In WebDAV Versioning (RFC 3253), a "version resource" is an
>     immutable HTTP resource at a server-assigned URL, created when a
>     "version-controlled resource" is checked in.  You GET the version
>     at its own URL.
> 
>   - In Memento (RFC 7089), a "Memento" (URI-M) is a resource that
>     encapsulates a prior state of an "Original Resource" (URI-R).  It
>     has its own URI, and you navigate to it via a TimeGate Resource.
> 
>   - In Link Relations for Version Navigation (RFC 5829), the link
>     relation types (predecessor-version, successor-version) assume
>     you're navigating between version resources, each with its own URI.
> 
> In all three, a version is a thing you dereference -- a resource at a URL.  If it's missing, 404 is natural.

They each refer to a different (defined) conception of what a "version" may be. That's why they all differ in their own ways.

The concept of "resource" comes from the Web and IETF discussions around identifiers. The word is generic in the sense that it is a placeholder noun for any conceived mapping from an identifier to anything we might want to identify. But, at the same time, "resource" has the traditional meaning of something that (was/is/will be) available for use. Hence, something doesn't become a "resource" until it has been identified, even though that same thing exists before/during/after it has been identified. The philosophical conundrum is not a problem for the Web specs because we stick to the original design: All important resources must be identified by at least one URI. If it doesn't have a URI, then it isn't important, and we enforce that in HTTP by not performing operations on anything without its target URI. [This of course doesn't prevent people from doing private manipulation of other things via POST, but those interactions are decidedly not standardized as HTTP.]

As Julian just mentioned, a server provides resources by minting URIs. Many, many URIs. A single interesting representation might have dozens of different URIs associated with it, or parts of it, and the server uses links (in HTML or Link or Link-Template or custom fields, depending on history) to describe the relationships so that a client can find what it wants.
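As a rough illustration of describing those relationships with links, here is a minimal sketch that formats a Link header field using the version relation types from RFC 5829. The /doc/v/N URIs and the link_header helper are hypothetical, made up for this example; a real server mints whatever URIs it likes.

```python
def link_header(links):
    """Format (target URI, relation type) pairs as a Link field value."""
    return ", ".join(f'<{uri}>; rel="{rel}"' for uri, rel in links)

# Hypothetical URIs a server might mint for versions of /doc.
links = [
    ("/doc/v/4", "predecessor-version"),
    ("/doc/v/6", "successor-version"),
    ("/doc", "latest-version"),
]

print("Link: " + link_header(links))
```

Each version here is a resource with its own URI, and the relation type tells the client how it relates to the one it just retrieved.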

> Second, we have the Last-Modified and ETag headers, and the Version and Parents headers proposed in this draft.  Instead of version being a resource with a URI, these simply identify a resource at a *point in time*.  The version is a "coordinate of a resource."

The problem here is that this is using the term "resource" to mean the clump of data that happens to be the result of a prior GET, as it is defined in some of the browser-specific WHATWG specs. That's leading to confusion because we are not talking here about the internal storage context of a browser. No Web resources are ever transferred over HTTP. Only a representation is transferred.

Next, it assumes that each observable change in the representation of a resource is a version of that resource, which has two very big problems: 1) some changes are variants of the same response (for whatever reason the origin chooses) and some changes reflect a change in the resource state (or implementation); and, 2) some new versions are minted without any change at all.

The Last-Modified and ETag metadata are not identifiers of a representation/version. Last-Modified provides a time value with a resolution of one second. A resource might have many different representations with the same Last-Modified. That's why we needed ETag for caching, but even ETag is not an identifier for a representation! It's a token that is presumed to be conflict-free among a set of representations for a resource over the scope of a given cache entry lifetime. There might be multiple ETags assigned to the same sequence of bytes, which may or may not be associated with the same representation (because of metadata).
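To see why Last-Modified alone cannot tell representations apart, here is a small sketch using only the Python standard library. The two change times are invented for the example; the point is only that HTTP-date carries no sub-second component.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def last_modified(changed_at: datetime) -> str:
    """Format a change time as an HTTP-date, as used in Last-Modified."""
    return format_datetime(changed_at, usegmt=True)

# Two hypothetical changes to the same resource, 800 ms apart.
first = datetime(2024, 10, 15, 12, 0, 0, 100_000, tzinfo=timezone.utc)
second = datetime(2024, 10, 15, 12, 0, 0, 900_000, tzinfo=timezone.utc)

print(last_modified(first))
print(last_modified(second))

# Both changes collapse onto the same field value.
print(last_modified(first) == last_modified(second))
```

A client holding only that date has no way to know which of the two states it saw.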

>  Last-Modified and ETag use coordinate as a condition for a request:
> 
>   - Last-Modified:
> 
>         GET /doc
>         If-Unmodified-Since: Tue, 15 Oct 2024 12:00:00 GMT
> 
>   - ETag:
> 
>         GET /doc
>         If-Match: "abc5"
> 
> This draft extends that model by letting the coordinate select a representation directly:
> 
>         GET /doc
>         Version: "abc5"
> 
>     or:
> 
>         GET /doc
>         Parents: "abc4"
> 
> This is analogous to how Range requests work.  When a client sends `Range: bytes=500-1000`, it isn't requesting a different resource.  The range is a coordinate within the resource.  We do not give ranges their own URIs, such as `https://example.com/bytes/500-1000` or `urn:bytes:500-1000`.

A Range request still has a target resource, identified by its URI. The range refers only to a potentially successful response representation to GET on that URI. The If-Match and If-Range header fields are designed to fail safe if the server selects a sequence of bytes that is not the same (by ETag) as what the client expects. This is sufficient for preventing mismatch (if implemented correctly), but it is not sufficient to identify the sequence of bytes beyond the scope of the past (partially stored) response, and it does not identify the representation because it doesn't prevent changes in metadata.
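The fail-safe behavior of If-Range can be sketched roughly as follows. This is a simplification under stated assumptions, not a complete implementation: If-Range is assumed to carry an entity tag (it may also carry an HTTP-date), range satisfiability (416) is ignored, and select_status is a hypothetical helper name.

```python
def select_status(range_value, if_range, current_etag):
    """Pick the status code for a GET, given optional Range and If-Range."""
    if range_value is None:
        return 200  # no range requested: send the full representation
    if if_range is not None:
        # If-Range uses strong comparison: weak tags never match,
        # and strong tags must be identical.
        weak = if_range.startswith("W/") or current_etag.startswith("W/")
        if weak or if_range != current_etag:
            return 200  # validator mismatch: ignore Range, send everything
    return 206  # validator matches (or absent): send the requested part

print(select_status("bytes=500-1000", '"abc5"', '"abc5"'))
print(select_status("bytes=500-1000", '"abc5"', '"abc6"'))
```

On a mismatch the client gets the whole current representation instead of bytes that would not fit its stored partial response. Nothing in that exchange names the old bytes; it only refuses to combine them with new ones.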

> Thus, it appears there are two models of "versioning" in HTTP:
> 
>  1. A version is a HTTP resource
>  2. A version is a coordinate of a HTTP resource, in time
> 
> So the question for the group: What is the right model for HTTP?
> What is a version?

I believe that was decided in 1994. Everything important is a resource. If you think versions are important for HTTP, then they must be resources. That's why the HTTP-related versioning extensions always end up referring to versions as resources and using some sort of link to describe their relationships: literally everything else in HTTP (access control, auth, caching, proxies, gateways, CDNs, etc.) depends on resources.

....Roy

Received on Thursday, 12 March 2026 19:03:26 UTC