- From: Willy Tarreau <w@1wt.eu>
- Date: Sat, 27 May 2023 22:40:00 +0200
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: ietf-http-wg@w3.org
Hi Julian,

On Sat, May 27, 2023 at 11:55:59AM +0200, Julian Reschke wrote:
> On 27.05.2023 10:37, Willy Tarreau wrote:
> > ...
>
> Without having read all details:
>
> +1 to consider (!) just using raw octets
>
> +1 not to use sf-binary
>
> +1 to exclude ASCII controls (but not entirely sure about CR LF HTAB)
>
> but
>
> -1 to use anything but UTF-8 (I fail to see any reason for that) - and
> no, use of UTF-8 does require revising things when Unicode code points
> are added

Unless I'm totally mistaken, the maximum sequence length has increased
over time to support new code points. I remember having implemented
decoding functions myself a long time ago in a security component where
we were required to fail past 4 or maybe 5 bytes, and I later learned
that they had to extend it by one or two bytes to support new code
points. I don't remember the exact details, but my point is that we must
not impose this absurdly insecure decoding on infrastructure components,
or they will regularly be accused of blocking valid content :-/

As long as they can pass it as-is and it's the recipient's job to figure
out whether it decodes successfully or not, that's fine by me.

Regards,
Willy
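
As an illustration of the split discussed in the message above, here is a minimal sketch, not taken from the thread or from any implementation, in which an intermediary only rejects ASCII control octets and forwards everything else untouched, while UTF-8 decoding is attempted only by the final recipient. All function names are hypothetical, and whether HTAB should be allowed is left as a parameter because the thread leaves that question open.

```python
from typing import Optional


def has_ascii_control(value: bytes, allow_htab: bool = False) -> bool:
    """Return True if value contains an ASCII control octet (0x00-0x1F or 0x7F)."""
    for octet in value:
        if octet == 0x09 and allow_htab:
            continue  # HTAB tolerated only when explicitly allowed (assumption)
        if octet < 0x20 or octet == 0x7F:
            return True
    return False


def forward_field_value(value: bytes) -> bytes:
    """Hypothetical intermediary step: reject controls, pass all other octets as-is."""
    if has_ascii_control(value):
        raise ValueError("ASCII control octet in field value")
    return value  # deliberately no UTF-8 decoding at this hop


def decode_at_recipient(value: bytes) -> Optional[str]:
    """Hypothetical recipient step: strict UTF-8 decoding, None if it fails."""
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        return None
```

With this split, an octet sequence that is not valid UTF-8 still passes through the intermediary unchanged, and only the recipient decides whether it can be decoded.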
Received on Saturday, 27 May 2023 20:40:07 UTC