Re: UTF-8 URL for testing

Larry Masinter (masinter@parc.xerox.com)
Thu, 10 Apr 1997 20:19:56 PDT


Message-Id: <334DADDC.5CBC@parc.xerox.com>
Date: Thu, 10 Apr 1997 20:19:56 PDT
From: Larry Masinter <masinter@parc.xerox.com>
To: Francois Yergeau <yergeau@alis.com>
Cc: uri@bunyip.com
Subject: Re: UTF-8 URL for testing

Thank you, Francois, for providing some actual data.

Martin's ad hominem attacks are infuriating. It may
seem like it should "go without saying", but I appreciate
your civility and willingness to deal with the actual
facts.

Does Alis provide its documentation online? Can you
point us to the place where the use of %-hex encoded
UTF-8 encoded Unicode in URLs is documented?

The URLs you point us to are all in your personal
area (~yergeau) on alis.com. Why aren't any of
the other URLs on the alis site internationalized,
since this encoding is compatible with current browsers?

You say:

> Click on the links
> within to see whether *your* browser handles those crazy,
> out-in-left-field, non-ASCII constructs.  I've found three that work fine,
> and two of them together represent almost the whole browser installed base.

Did you try any browsers that didn't work? Do any of the browsers
display the URLs as anything other than %xx%xx%xx in the 'location' box?

Is there any software anywhere in the world that actually generates
URLs like these? All of the examples seem to be carefully
hand-constructed.

Since these URLs are compatible with existing browsers, as you say,
there should not be any difficulty in people running their web servers
this way. Do any web servers in Japan use hex-encoded UTF-8-encoded
Unicode for URLs?

The problem with recommending this method for "Draft Standard" is
not the "six month delay" in getting to draft standard;
it's that we should not recommend something that people aren't
actually going to do. This is not some kind of nit-picky technical
objection; it's fundamental to the process of Internet standards.

I am eager to actually support internationalization.
Martin's remarks insinuating otherwise were insulting. However,
I think it is counter-productive to foist hex-encoded UTF-8-encoded
URLs (9 bytes to represent one 16-bit Kanji) on the rest of the
world merely because a western European and a Canadian like
the idea. Surely we can find a site in Japan, China, Israel, or
Russia that would support exporting its URLs as hex-encoded
UTF-8 before we believe that this isn't yet
another form of Unicode imperialism. Otherwise, we would just
have a pretend solution to a real problem.
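
To make the size cost concrete, here is a minimal sketch of the
encoding under discussion. It assumes Python and its standard
urllib.parse.quote, neither of which comes from this thread; the
helper name escape_utf8 and the sample character are illustrative
choices only. A character in the 16-bit range above U+07FF takes
three UTF-8 octets, and each octet becomes a three-character %xx
escape.

```python
from urllib.parse import quote


def escape_utf8(text: str) -> str:
    """Return text as %-hex escaped UTF-8 (hypothetical helper name).

    safe="" means no reserved characters are exempted, so every octet
    outside the unreserved ASCII set is written as %XX.
    """
    return quote(text, safe="", encoding="utf-8")


if __name__ == "__main__":
    kanji = "\u6f22"                  # one sample kanji, U+6F22
    octets = kanji.encode("utf-8")    # raw UTF-8 octets before escaping
    escaped = escape_utf8(kanji)      # what would appear in the URL
    print(len(octets), "octets ->", escaped, "->", len(escaped), "characters")
```

Plain ASCII letters pass through unchanged, so only the non-ASCII
parts of a URL grow.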

Regards,

Larry
--
http://www.parc.xerox.com/masinter