Re: UTF-8 in URIs

On Fri, Jan 17, 2014 at 4:53 AM, Zhong Yu <zhong.j.yu@gmail.com> wrote:

>
> A UTF-16 option would be nice. Let's be honest, UTF-8 is
> English-centric. It may be necessary to interoperate with previous
> ASCII-based systems. But going forward, UTF-8 should not be favored
> just because it is the best option for the English language.
>

UTF-16 is a horror show, what with its surrogates, the inability to handle
it in C code as either char * or wchar_t *, and so on.  Yes, I agree that
UTF-8 is sort of bigoted.  But it has a lot of advantages, and actually I
find it hard to worry too much about the somewhat-less-than-50% overhead
(less than 50% due to ASCII markup), because when I am having trouble with
network congestion, the congestion is always due to media files... you need
a *lot* of text to match a few seconds of music or video.
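
To make the overhead arithmetic concrete, here is a rough Python sketch that
compares encoded sizes for text with and without ASCII markup. The sample
strings and the helper name encoded_sizes are made up for illustration; real
documents will have different ratios.

```python
# Illustrative sketch only: the sample strings below are invented, not
# taken from any real document.

def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count without a BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# 300 CJK characters with no markup: 3 bytes each in UTF-8, 2 in UTF-16.
plain = "統一碼" * 100

# The same characters wrapped in ASCII markup: 1 byte per ASCII character
# in UTF-8, 2 in UTF-16.
marked_up = '<p class="note">統一碼</p>' * 100

for label, sample in (("plain", plain), ("marked up", marked_up)):
    u8, u16 = encoded_sizes(sample)
    print(f"{label}: UTF-8 {u8} bytes, UTF-16 {u16} bytes, "
          f"ratio {u8 / u16:.2f}")

# A character outside the Basic Multilingual Plane needs a surrogate pair
# in UTF-16, so it takes 4 bytes in both encodings.
print(encoded_sizes("\U0001F600"))
```

With these particular samples the unmarked text comes out at a ratio of 1.50
(the full 50% overhead), while the marked-up text comes out at 0.63, that is,
smaller in UTF-8 than in UTF-16.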

Without expressing an opinion on exactly what to say about URIs, I
definitely think everything should be UTF-8 wherever possible, and note
with pleasure that the Internet is moving steadily in that direction.



>
> Zhong Yu
>
>

Received on Saturday, 18 January 2014 17:34:48 UTC