Re: Consensus call to include Display Strings in draft-ietf-httpbis-sfbis from Ilari Liusvaara on 2023-06-29 (ietf-http-wg@w3.org from April to June 2023)

From: Ilari Liusvaara <ilariliusvaara@welho.com>
Date: Thu, 29 Jun 2023 14:50:36 +0300
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <ZJ1wDMnj9IiBbmlU@LK-Perkele-VII2.locald>

On Thu, Jun 29, 2023 at 09:19:42AM +0000, Poul-Henning Kamp wrote:
> --------
> Ilari Liusvaara writes:
> 
> > 2) I think it should be specified that any direction change characters
> > MUST NOT affect any text surrounding the displayed string. At least
> > getting this wrong causes at most some screwed up text rendering.
> 
> There is no way to make UniCode safe, because it is anyone's guess what
> UniCode decides to add later.

I did some digging about when Unicode last added some "interesting"
stuff. The last one I could find was some additional direction
overrides from 2013. All the other "interesting" stuff seems to be
from 1993 (the very first version of Unicode). And the Cc stuff seems
to be even older than that.

> I don't think it makes any sense for us to wade into that sump,
> beyond a sternly written "Security Considerations" which says
> that UniCode is by definition unsafe.
> 
> Avoiding any and all hazards related to that /at the HTTP level/, is
> why I still think we should base64 encode them, instead of the mutant
> percent-with-the-random-backslash-thrown-in currently proposed.

How would that help? Even currently, all that stuff must be escaped.
And the hazards of Unicode are associated with displaying it, and at
that point it does not matter if it was percent-encoded or base64-encoded.

The choice between percent-encoding and base64 is merely about efficiency.
However, an encoding not capable of representing Cc would be an entirely
different thing. And clearly Cc contains by far the most hazardous
stuff in the entire Unicode.
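
To illustrate the point, here is a minimal Python sketch. It assumes a
made-up sample string, and it uses urllib's quote() only as a stand-in for
percent-encoding (it is not the exact serialisation from the draft). Both
encodings decode to the same code points, so whatever is hazardous at
display time survives either way. The flagged() helper is hypothetical and
only reports the Cc and Cf general categories.

```python
import base64
import unicodedata
from urllib.parse import quote, unquote

# Hypothetical sample: U+202E (RIGHT-TO-LEFT OVERRIDE, category Cf)
# and U+0007 (BEL, category Cc) embedded in otherwise plain text.
text = "abc\u202edef\u0007"

# Two wire encodings of the same UTF-8 bytes.
# quote() is a stand-in, not the draft's exact percent-encoding rules.
percent = quote(text, safe="")
b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Decoding either one yields the identical string.
from_percent = unquote(percent)
from_b64 = base64.b64decode(b64).decode("utf-8")


def flagged(s):
    """Hypothetical helper: list code points in categories Cc or Cf."""
    return [f"U+{ord(c):04X}" for c in s
            if unicodedata.category(c) in ("Cc", "Cf")]


print(percent, len(percent))
print(b64, len(b64))
print(flagged(from_percent), flagged(from_b64))
```

The only difference between the two is the length on the wire; any
filtering of Cc or direction controls has to happen on the decoded string
before display, whichever encoding carried it.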

-Ilari

Received on Thursday, 29 June 2023 11:50:45 UTC