Another place where we may need to know about normalization is for caching.
Does the cache lookup (and key comparison) occur on the normalized form, or on the data as given?
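
To make that concrete, here is a minimal sketch (Python standard library
only; the byte-keyed cache is hypothetical and nothing here is specific
to HTTP/2) of how NFC and NFD forms of the same visible string end up as
distinct byte-wise cache keys:

    # Minimal sketch: the same visible string in NFC vs. NFD form
    # yields different code point sequences and different UTF-8 bytes.
    import unicodedata

    nfc = unicodedata.normalize("NFC", "café")   # precomposed U+00E9
    nfd = unicodedata.normalize("NFD", "café")   # 'e' + combining U+0301

    print(nfc == nfd)                                   # False
    print(nfc.encode("utf-8") == nfd.encode("utf-8"))   # False

    # A cache keyed on raw bytes stores two entries for what a user
    # perceives as one name, unless the spec says which form (if any)
    # the lookup is performed on.
    cache = {}
    cache[nfc.encode("utf-8")] = "entry A"
    cache[nfd.encode("utf-8")] = "entry B"
    print(len(cache))                                   # 2

Whether that duplication is acceptable, or whether the cache must
normalize before keying, is exactly the kind of question the protocol
would have to answer.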
All in all, UTF-8 without an accompanying normalization/comparison policy sucks for protocol work.
-=R
On Sun, Feb 10, 2013 at 2:24 AM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> --------
> In message <51176C95.1040308@gmx.de>, Julian Reschke writes:
>
> >> This is why I keep asking people where _exactly_ it is they want
> >> the unicode to go in the HTTP/2 protocol. So far I fail to detect
> >> a clear answer...
> >
> >1) Filenames in Content-Disposition
>
> These only have meaning to the ultimate destinations, and if their
> filesystems don't support UTF-8, they'll have to do $something anyway.
>
> Nobody in the HTTP/2 protocol-chain can do anything but treat this
> as an opaque bytestring.
>
> >2) non-ASCII characters in HTTP auth credentials
>
> Same.
>
> >3) title parameters in Link header fields
>
> Same.
>
> The UTF-8 question simply does not apply at the protocol layer;
> it only applies to the semantic interpretation at the ends of
> the HTTP/2 protocol connection.
>
> Or to put it more precisely: I can see no place where an
> HTTP/2 intermediate without a semantic role will ever need
> to know about normalizing UTF-8 strings.
>
> Agree ?
>
> Since we, presumably, split HTTP into a transport and semantic
> part in HTTPbis, and since HTTP/2 is not supposed to change
> the semantics, why are we even discussing "UTF-8 in HTTP/2" ?
>
>
> --
> Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG | TCP/IP since RFC 956
> FreeBSD committer | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.