Re: Consensus call to include Display Strings in draft-ietf-httpbis-sfbis

Hello Ilari, others,

On 2023-05-26 18:52, Ilari Liusvaara wrote:
> On Thu, May 25, 2023 at 10:21:34AM -0700, Roy T. Fielding wrote:
>>
>> If this is truly for a display string, the feature must be
>> specific about the encoding and allowed characters.
>> My suggestion would be to limit the string to non-CNTRL
>> ASCII and non-control valid UTF-8. We don't want to allow
>> anything that would twist the feature to some other ends.
> 
> I think the set of allowed characters should be the 1,111,999 non-Cc
> Unicode codepoints.
> 
> However, Unicode also has formatting control codepoints (including
> fun ones like direction overrides), and the set of those is not
> necessarily stable. Obviously, the effect of any formatting control
> should end with the string.

Bidirectional formatting characters are best left in, because they
may be needed in display strings in Arabic, Hebrew, or other
right-to-left scripts.
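
To illustrate how "the effect of any formatting control should end with
the string" could be checked while still allowing these characters, here
is a rough Python sketch. The function name is hypothetical, and the
matching logic is a simplification of the Unicode Bidirectional
Algorithm (UAX #9), not a full implementation: it only counts explicit
embeddings, overrides, and isolates that are still open at the end of
the string.

```python
import unicodedata

# Bidi classes of the explicit directional formatting characters.
EMBEDDING_OPENERS = {"LRE", "RLE", "LRO", "RLO"}  # closed by PDF (U+202C)
ISOLATE_OPENERS = {"LRI", "RLI", "FSI"}           # closed by PDI (U+2069)


def unterminated_bidi_controls(text):
    """Return how many explicit bidi controls are still open at end of text.

    Simplified model: PDF closes the innermost embedding/override if it is
    on top of the stack; PDI closes the innermost isolate together with any
    embeddings opened inside it. Stray PDF/PDI characters are ignored.
    """
    stack = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in EMBEDDING_OPENERS:
            stack.append("embedding")
        elif cls in ISOLATE_OPENERS:
            stack.append("isolate")
        elif cls == "PDF":
            if stack and stack[-1] == "embedding":
                stack.pop()
        elif cls == "PDI":
            if "isolate" in stack:
                # Pop everything up to and including the matching isolate.
                while stack.pop() != "isolate":
                    pass
    return len(stack)


if __name__ == "__main__":
    # A right-to-left override (U+202E) that is never closed.
    print(unterminated_bidi_controls("\u202Eabc"))
    # The same override closed by PDF (U+202C).
    print(unterminated_bidi_controls("\u202Eabc\u202C"))
```

A receiver could use a check like this to reject or repair strings that
would otherwise affect the text following them, rather than banning the
formatting characters outright.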

> I think it would be safer to add exactly one backslash escape sequence
> for the 1,111,904 codepoints that are neither Cc nor ASCII. The
> escape sequences should only consist of printable ASCII and should not
> contain any further backslash or double quote.
> 
> It is possible to assign the escape sequences such that worst case
> overhead over UTF-8 is 1 byte per codepoint.

It sounds to me as if you are trying to invent a new form of escaping 
(or encoding). If you really think that's the direction we should move 
in, can you be a bit more specific (maybe with a few examples)?
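
As a sanity check on the two figures quoted above (1,111,999 and
1,111,904), here is a minimal Python sketch. It assumes that
"codepoints" means Unicode scalar values (surrogates excluded) and that
"ASCII" means the 95 printable characters U+0020 through U+007E; both
are my reading of the quoted text, not something it states.

```python
import unicodedata


def count_codepoints():
    """Count non-Cc scalar values, and those that are also non-ASCII."""
    non_cc = 0
    non_cc_non_ascii = 0
    for cp in range(0x110000):
        # Surrogates are not scalar values and cannot appear in UTF-8.
        if 0xD800 <= cp <= 0xDFFF:
            continue
        # Cc covers the C0 controls, DEL, and the C1 controls.
        if unicodedata.category(chr(cp)) == "Cc":
            continue
        non_cc += 1
        # Assumption: "ASCII" here means printable ASCII, U+0020..U+007E.
        if not (0x20 <= cp <= 0x7E):
            non_cc_non_ascii += 1
    return non_cc, non_cc_non_ascii


if __name__ == "__main__":
    print(count_codepoints())
```

Under those assumptions both quoted numbers come out as stated: 65 Cc
characters are removed from the 1,112,064 scalar values, and a further
95 printable ASCII characters are removed for the second figure.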

Regards,   Martin.

Received on Sunday, 28 May 2023 07:57:20 UTC