- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Tue, 18 Nov 2025 08:29:48 +0100
- To: ietf-http-wg@w3.org
On 17.11.2025 03:55, Martin Thomson wrote:
> This looks pretty good.
>
> There is one thing that occurred to me reading this, which might be worth noting. This depends on having codings that produce exactly one sequence of bytes when decoded.
>
> RFC 9110 says that a content coding transforms the content "without losing the identity of its underlying media type and without loss of information". I can think of any number of codings that would not always transform back into the same sequence of bytes.
>
> If you are struggling to imagine something, consider a possible json-as-cbor coding scheme. In this, the JSON infoset is serialized using the (more compact) CBOR encoding. This is a reversible transform[1] that doesn't change semantics or lose information, but it might not retain things like the spaces between tokens when inverted/decoded (`{"thing":true}` vs `{ "thing": true }` or "\u0009" vs "\t", for instance). Digests of the restored serialization won't work in that case.
>
> Maybe that's not what RFC 9110 intended and these coding schemes are not permitted. But that's not what it *says*, unless you take a very specific interpretation of the word "identity".
>
> Nit: It might be worth noting that the gzip encoding in the examples is hex encoded (with added spacing for formatting purposes).
>
> [1] OK, maybe I-JSON, to avoid questions about some of JSON's more extreme quirks.
> ...
Looking at the content coding registry...
https://www.w3.org/TR/exi/
...seems to be based on the infoset of an XML document, and thus would not
round-trip the exact sequence of bytes.
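
To make the concern concrete, here is a minimal sketch of the quoted JSON example. It assumes a hypothetical coding that, like the json-as-cbor idea above, preserves only the data model; the function name lossy_roundtrip is invented for illustration and stands in for "encode, then decode" with such a coding. SHA-256 is used only as a sample digest algorithm. gzip is included as a contrast, since it restores the exact bytes.

```python
import gzip
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte sequence."""
    return hashlib.sha256(data).hexdigest()


def lossy_roundtrip(content: bytes) -> bytes:
    """Hypothetical stand-in for encoding and decoding with an
    infoset-based coding: only the parsed JSON value survives, and the
    decoder picks its own (compact) serialization."""
    value = json.loads(content.decode("utf-8"))
    return json.dumps(value, separators=(",", ":")).encode("utf-8")


original = b'{ "thing": true }'

# Byte-exact coding: decoding gzip output restores the original bytes.
gzip_restored = gzip.decompress(gzip.compress(original))
gzip_digest_matches = sha256_hex(gzip_restored) == sha256_hex(original)

# Infoset-preserving coding: same JSON value, different bytes.
restored = lossy_roundtrip(original)
same_value = json.loads(restored) == json.loads(original)
digest_matches = sha256_hex(restored) == sha256_hex(original)

print(gzip_digest_matches, same_value, digest_matches)
```

With a coding of the second kind, no information in the JSON value is lost, yet a digest computed over the original representation cannot be verified against the restored bytes.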
Best regards, Julian
Received on Tuesday, 18 November 2025 07:29:56 UTC