Re: WG Last Call: draft-ietf-httpbis-unencoded-digest-01 (Ends 2025-11-30) from Julian Reschke on 2025-11-18 (ietf-http-wg@w3.org from October to December 2025)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Tue, 18 Nov 2025 08:29:48 +0100
To: ietf-http-wg@w3.org
Message-ID: <3b173363-e4ef-4f57-9868-e2edbfb667ea@gmx.de>

On 17.11.2025 03:55, Martin Thomson wrote:
> This looks pretty good.
> 
> There is one thing that occurred to me reading this, which might be worth noting.  This depends on having codings that produce exactly one sequence of bytes when decoded.
> 
> RFC 9110 says that a content coding transforms the content "without losing the identity of its underlying media type and without loss of information".  I can think of any number of codings that would not always transform back into the same sequence of bytes.
> 
> If you are struggling to imagine something, consider a possible json-as-cbor coding scheme.  In this, the JSON infoset is serialized using the (more compact) CBOR encoding.  This is a reversible transform[1] that doesn't change semantics or lose information, but it might not retain things like the spaces between tokens when inverted/decoded (`{"thing":true}` vs `{ "thing": true }` or "\u0009" vs "\t", for instance).  Digests of the restored serialization won't work in that case.
> 
> Maybe that's not what RFC 9110 intended and these coding schemes are not permitted.  But that's not what it *says*, unless you take a very specific interpretation of the word "identity".
> 
> Nit: It might be worth noting that the gzip encoding in the examples is hex encoded (with added spacing for formatting purposes).
> 
> [1] OK, maybe I-JSON, to avoid questions about some of JSON's more extreme quirks.
> ...

Looking at the content coding registry...

   https://www.w3.org/TR/exi/

...seems to be based on the infoset of an XML document, and thus would not
round-trip the exact sequence of bytes.
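
To make the concern concrete, here is a minimal sketch in Python. It assumes a hypothetical infoset-level coding (the names infoset_encode and infoset_decode are made up for illustration, and compact JSON stands in for CBOR so that only the standard library is needed). It compares that coding with gzip, which restores the exact bytes. SHA-256 is used only as an example digest algorithm.

```python
import gzip
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte sequence."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical infoset-level coding (a stand-in for "json-as-cbor"):
# it keeps the JSON data model but discards the original serialization.
def infoset_encode(content: bytes) -> bytes:
    return json.dumps(json.loads(content), separators=(",", ":")).encode("utf-8")


def infoset_decode(encoded: bytes) -> bytes:
    # The decoder can only emit *a* serialization of the data model,
    # not necessarily the one the sender started with.
    return json.dumps(json.loads(encoded), separators=(",", ":")).encode("utf-8")


original = b'{ "thing": true }'

# gzip is byte-exact: decoding yields the original sequence of bytes.
gzip_restored = gzip.decompress(gzip.compress(original))

# The infoset coding preserves meaning but not the whitespace.
infoset_restored = infoset_decode(infoset_encode(original))

gzip_digest_matches = sha256_hex(gzip_restored) == sha256_hex(original)
infoset_digest_matches = sha256_hex(infoset_restored) == sha256_hex(original)
semantics_preserved = json.loads(infoset_restored) == json.loads(original)

print(gzip_digest_matches, infoset_digest_matches, semantics_preserved)
# prints: True False True
```

The decoded JSON compares equal as data, yet a digest computed over the sender's unencoded bytes no longer matches what the recipient reconstructs.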

Best regards, Julian

Received on Tuesday, 18 November 2025 07:29:56 UTC