Re: UTF-8 in URIs

On Fri, Jan 17, 2014 at 7:10 AM, Michael Sweet <msweet@apple.com> wrote:
> With all due respect, all of the protocols we use on the Internet use octets as the basis of text strings, and in particular most strings passed over the Internet (headers, header values, URIs, hostnames, etc.) do not even need support beyond US ASCII.  This is a huge benefit for interoperability at the

It is interesting that a lot of non-English resources have URLs with
English words. However, technical limitations may be a contributing
factor. The sufficiency of US-ASCII probably should not be taken as a
basic assumption when designing new technical specs.

> tiny expense of expansion for some languages.
>
> The place where UTF-16 has the most benefit is in the content that is transferred, not in the fractional protocol overhead used to transfer that content.  (and even there, I would compare the size of your content encoded as UTF-8 and as UTF-16 before making a decision - web pages often are better off as UTF-8 due to the HTML markup, while email messages are better as UTF-16, for example...)
>
>
> On Jan 17, 2014, at 7:53 AM, Zhong Yu <zhong.j.yu@gmail.com> wrote:
>> ...
>> A UTF-16 option would be nice. Let's be honest, UTF-8 is
>> English-centric. It may be necessary to interoperate with previous
>> ASCII-based systems. But going forward, UTF-8 should not be favored
>> just because it is the best option for the English language.
>>
>> Zhong Yu
>>
>
> _________________________________________________________
> Michael Sweet, Senior Printing System Engineer, PWG Chair
>
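
For anyone who wants to try the size comparison suggested in the quoted
message, here is a minimal sketch in Python. The helper name
encoded_sizes and the sample strings are made up for illustration, and
it assumes UTF-16 is counted without a byte order mark (hence
"utf-16-le"); real content will of course give different numbers.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the byte length of `text` in UTF-8 and in UTF-16.

    Hypothetical helper for illustration only. UTF-16 is measured as
    little-endian without a BOM; adding a BOM would cost 2 more bytes.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# Illustrative samples (assumptions, not taken from any real page or mail).
samples = {
    "ascii": "hello",
    "cjk": "中文",
    "cjk_in_markup": "<p>中文</p>",
}

for name, text in samples.items():
    sizes = encoded_sizes(text)
    print(f"{name}: utf-8={sizes['utf-8']} bytes, utf-16={sizes['utf-16']} bytes")
```

In this toy case the bare CJK text is smaller in UTF-16, but once ASCII
markup is wrapped around it UTF-8 comes out ahead, which is the kind of
effect the quoted message attributes to HTML markup.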

Received on Saturday, 18 January 2014 01:56:03 UTC