- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sun, 28 May 2023 05:51:49 +0200
- To: ietf-http-wg@w3.org
On 27.05.2023 22:40, Willy Tarreau wrote:
> Hi Julian,
>
> On Sat, May 27, 2023 at 11:55:59AM +0200, Julian Reschke wrote:
>> On 27.05.2023 10:37, Willy Tarreau wrote:
>>> ...
>>
>> Without having read all details:
>>
>> +1 to consider (!) just using raw octets
>>
>> +1 not to use sf-binary
>>
>> +1 to exclude ASCII controls (but not entirely sure about CR LF HTAB)
>>
>> but
>>
>> -1 to use anything but UTF-8 (I fail to see any reason for that) - and
>> no, use of UTF-8 does not require revising things when Unicode code
>> points are added
>
> Unless I'm totally mistaken, the maximum sequence length has increased
> over time to support new code points. I remember having implemented
> decoding functions myself a long time ago in a security component where
> we were required to fail past 4 or maybe 5 bytes, and I later learned
> that they had to extend it by one or two bytes to support new code
> points. I don't remember the exact details, but my point is that we
> must not impose this absurdly insecure decoding on infrastructure
> components, or they will regularly be accused of blocking valid
> content :-/ As long as they can pass it through as-is and it is the
> recipient's job to determine whether it decodes successfully, that's
> fine by me.

AFAIU, the UTF-8 encoding/decoding function (sequence of code points to
octets and vice versa) has never changed (see
https://datatracker.ietf.org/doc/html/rfc3629#section-3). Am I missing
something here?

Best regards,
Julian
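[Editor's illustration, not part of the original message: a minimal sketch of what RFC 3629 fixes, for readers following the disagreement above. Under RFC 3629 a UTF-8 sequence is 1 to 4 octets and code points are capped at U+10FFFF; the 5- and 6-octet forms Willy recalls belong to the obsolete RFC 2279 definition and are simply invalid today. The decoder below is hypothetical example code, not anything from the thread or from HAProxy.]

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one UTF-8 sequence per RFC 3629. Returns 0 on success and
 * stores the code point in *cp and the sequence length in *seqlen;
 * returns -1 on any invalid input. Sequences are 1..4 octets only;
 * overlong forms, surrogates (U+D800..U+DFFF), and code points above
 * U+10FFFF are rejected. */
static int utf8_decode(const uint8_t *s, size_t len,
                       uint32_t *cp, size_t *seqlen)
{
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t n, i;
    uint32_t c;
    uint8_t b0;

    if (len == 0)
        return -1;

    b0 = s[0];
    if (b0 < 0x80) {                  /* 0xxxxxxx: U+0000..U+007F */
        *cp = b0; *seqlen = 1; return 0;
    } else if ((b0 & 0xE0) == 0xC0) { /* 110xxxxx: 2-octet lead */
        n = 2; c = b0 & 0x1F;
    } else if ((b0 & 0xF0) == 0xE0) { /* 1110xxxx: 3-octet lead */
        n = 3; c = b0 & 0x0F;
    } else if ((b0 & 0xF8) == 0xF0) { /* 11110xxx: 4-octet lead */
        n = 4; c = b0 & 0x07;
    } else {
        /* Stray continuation byte, or a 5/6-octet lead (0xF8..0xFD)
         * from the obsolete RFC 2279 definition: always invalid. */
        return -1;
    }

    if (len < n)
        return -1;
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)    /* continuations are 10xxxxxx */
            return -1;
        c = (c << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong encodings, surrogates, and out-of-range values. */
    if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return -1;

    *cp = c; *seqlen = n; return 0;
}

int main(void)
{
    /* U+10FFFF, the highest code point: valid as exactly 4 octets. */
    const uint8_t max[] = { 0xF4, 0x8F, 0xBF, 0xBF };
    /* A 6-octet RFC 2279 form: invalid under RFC 3629. */
    const uint8_t six[] = { 0xFD, 0x80, 0x80, 0x80, 0x80, 0x80 };
    uint32_t cp; size_t n;

    printf("max: %d\n", utf8_decode(max, sizeof(max), &cp, &n)); /* 0  */
    printf("six: %d\n", utf8_decode(six, sizeof(six), &cp, &n)); /* -1 */
    return 0;
}
```

The point the sketch makes concrete: new Unicode code points are assigned within the existing U+0000..U+10FFFF range, so a decoder like this never needs to be extended; the encoding function itself is fixed.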
Received on Sunday, 28 May 2023 03:51:55 UTC