Re: [EXTERNAL] Proposal adaptation request from Seph Gentle on 2025-05-14 (ietf-http-wg@w3.org from April to June 2025)

From: Seph Gentle <me@josephg.com>
Date: Thu, 15 May 2025 07:16:57 +1000
To: "Guohui Deng" <guohuideng@microsoft.com>, "Roy T. Fielding" <fielding@gbiv.com>
Cc: "Yoav Weiss" <yoav.weiss@shopify.com>, "HTTP Working Group" <ietf-http-wg@w3.org>, "Anne van Kesteren" <annevk@annevk.nl>, "Mark Nottingham" <mnot@mnot.net>
Message-Id: <cb34b6bb-1653-4b69-8e94-41405e3fe974@app.fastmail.com>

All of these answers seem like a bad design to me. It sounds like what you want is this:

enum ContentEncoding {
  Known(string),
  Unknown,
  NotPresent
}

Using a sentinel value for “none” / null is Hoare’s billion dollar mistake. An unrecognised content encoding isn’t a type of content encoding at all. If you ask me, it shouldn’t be treated or stored as such.

This design uses sum types - which means it isn’t always practical with modern databases and programming languages. I think I’ll never understand why languages like SQL, C++ and Go are missing sum types.
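
For concreteness, here is a rough sketch of how that enum might look in a language that does have sum types (Rust). The RECOGNISED list and the classify / describe helpers are made up for illustration and are not part of any proposal. It also assumes a single coding per header value, which glosses over the fact that Content-Encoding can carry a list.

```rust
/// Hypothetical set of codings this sketch treats as recognised (an assumption).
const RECOGNISED: [&str; 4] = ["gzip", "deflate", "br", "zstd"];

/// The three cases from the enum above, as a Rust sum type.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ContentEncoding {
    Known(String),
    Unknown,
    NotPresent,
}

/// Classify a raw header value. `None` means the header was absent.
/// Simplification: the value is treated as one coding, not a list.
fn classify(header: Option<&str>) -> ContentEncoding {
    match header.map(str::trim) {
        None => ContentEncoding::NotPresent,
        Some(value) => {
            // Content codings are case-insensitive, so compare in lowercase.
            let lower = value.to_ascii_lowercase();
            if RECOGNISED.contains(&lower.as_str()) {
                ContentEncoding::Known(lower)
            } else {
                ContentEncoding::Unknown
            }
        }
    }
}

/// The compiler forces every caller to handle all three cases,
/// so no sentinel string can be mistaken for a real coding.
fn describe(enc: &ContentEncoding) -> String {
    match enc {
        ContentEncoding::Known(name) => format!("known coding: {}", name),
        ContentEncoding::Unknown => "unrecognised coding".to_string(),
        ContentEncoding::NotPresent => "no Content-Encoding header".to_string(),
    }
}
```

With this shape, a value like "_" arriving on the wire would simply land in Unknown rather than needing to be reserved as a marker.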

-Seph

On Thu, 15 May 2025, at 3:36 AM, Guohui Deng wrote:
> Thanks Roy! I will proceed with "_". Really appreciate the detailed information and your guidance.
> 
> Guohui
> 
> 
> *From:* Roy T. Fielding <fielding@gbiv.com>
> *Sent:* Monday, May 12, 2025 2:30 PM
> *To:* Guohui Deng <guohuideng@microsoft.com>
> *Cc:* Yoav Weiss <yoav.weiss@shopify.com>; ietf-http-wg@w3.org <ietf-http-wg@w3.org>; Anne van Kesteren <annevk@annevk.nl>
> *Subject:* Re: [EXTERNAL] Proposal adaptation request
> 
> On May 12, 2025, at 12:38 PM, Guohui Deng <guohuideng@microsoft.com> wrote:
>> 
>> Hi Roy and Yoav, 
>> 
>> Besides "_", is there something else, like "_unknown", that is a "non-HTTP value that cannot be registered"?
>> 
>> For us, "_unknown" is better than "_".  I checked "rfc8941" and I don't see any restrictions on the values in that doc. I guess it cannot be registered but I am not sure.
>> 
>> Cheers,
>> Guohui
> 
> "_" could theoretically be registered by IETF consensus. It seems unlikely given that nothing "short" is ever registered by consensus. The content-coding grammar is a token
> 
>   token          = 1*tchar
> 
>   tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
>                  / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
>                  / DIGIT / ALPHA
>                  ; any VCHAR, except delimiters
> 
> The nice thing about "_" is that quite a few dead bodies would have to be crossed for that to be registered, and yet it still remains syntactically valid as a token (in case that matters), is easy to check, and looks good.
> 
> But, as I said before, it is not an IETF concern so long as it remains inside the API as a non-protocol marker.
> If the API allows Unicode, "💩" would also be fine (and even less likely to be a conflict).
> 
> ....Roy
>

Received on Wednesday, 14 May 2025 21:17:40 UTC