Re: UTF-8 URL for testing

Francois Yergeau
Sun, 13 Apr 1997 21:31:53 -0400

At 20:19 10-04-97 PDT, Larry Masinter wrote:
>Does Alis provide its documentation online?

Sorry, no.  Only user manuals.

> Can you
>point us to the place where the use of %-hex encoded
>UTF-8 encoded Unicode in URLs is documented?

I'm not sure I understand correctly what "%-hex encoded UTF-8 encoded
Unicode in URLs" means, but here's a pointer that could be useful:

>The URLs you point us to are all in your personal
>area (~yergeau) on

Naturally.  That's the area I control.

> Why aren't any of
>the other URLs on the alis site internationalized,
>since it is compatible with current browsers?

Perhaps it has to do with lack of standardization?  Were there that many
HTTP/1.1 implementations before there was a standard, or at least some
language in a serious draft, to point the way?

>Did you try any browsers that didn't work?

No.  I tried the 3 browsers installed on my machine, and they all worked.
I thought publishing the URL would rather quickly bring in reports of
browsers that failed.  None so far.

> Do any of the browsers
>display the URLs as anything other than %xx%xx%xx in the 'location' box?

How would they?  Lacking a standard mapping from octets to characters for
xx>127, what characters could they display?

For what it's worth, Tango correctly displays the first link destination in
the status bar when the mouse is over the anchor.
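
The ambiguity over octets above 127 can be illustrated with a minimal
sketch in Python.  The escaped string below is a hypothetical example, not
one of the URLs discussed in this thread; the point is only that the same
%xx octets yield different characters depending on the charset a browser
assumes.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical escaped path segment: "é" sent as the UTF-8 octets C3 A9.
escaped = "caf%C3%A9"

# Undo the %-hex escaping only; the result is raw octets, not characters.
octets = unquote_to_bytes(escaped)

# The same octets read under three different charset assumptions.
as_utf8 = octets.decode("utf-8")         # the intended text
as_latin1 = octets.decode("iso-8859-1")  # two unrelated characters replace "é"
as_cp850 = octets.decode("cp850")        # yet another reading

for label, text in (("UTF-8", as_utf8),
                    ("ISO 8859-1", as_latin1),
                    ("CP850", as_cp850)):
    print(f"{label:10} -> {text}")
```

Without an agreed mapping, a browser has no basis for preferring any one of
these readings, so showing the raw %xx form is the only safe display.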

>Is there any software anywhere in the world that actually generates
>URLs like these? All of the examples seem to be carefully

Valid point.  We seem to have the client side settled (works with
unmodified browsers), let's now deal with servers.  You want software, so I
wrote some.  Going to the following place:

you will find a Perl script and two mapping files for two character sets
(ISO 8859-1 and CP850).  Installing the script and one of the mappings on a
Unix server should let one create UTF-8 URLs at will.  Instructions are in
the script.
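
As a rough illustration of the kind of conversion involved (this is a
sketch in Python, not the Perl script mentioned above, and the function and
file names are hypothetical), a file name stored in a legacy character set
can be transcoded to UTF-8 and then %-hex escaped:

```python
from urllib.parse import quote

def utf8_url_path(name_bytes: bytes, charset: str) -> str:
    """Hypothetical helper: turn a file name in a legacy charset into a
    %-hex escaped UTF-8 path segment."""
    text = name_bytes.decode(charset)        # legacy octets -> characters
    utf8_octets = text.encode("utf-8")       # characters -> UTF-8 octets
    return quote(utf8_octets, safe="")       # escape every non-unreserved octet

# "é" is octet 0xE9 in ISO 8859-1 and octet 0x82 in CP850.
latin1_name = b"r\xe9sum\xe9.html"
cp850_name = b"r\x82sum\x82.html"

print(utf8_url_path(latin1_name, "iso-8859-1"))
print(utf8_url_path(cp850_name, "cp850"))
```

Both calls should yield the same escaped path, since the two inputs spell
the same characters in different character sets.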

I've recreated the URL pointed to by the document at

so that it is *no longer* a "carefully hand-constructed" URL, but one
created using software purposefully written for the task of creating UTF-8
URLs.  The script is all that's needed; the HTTP server is unmodified.  Not
that this is the only way to go, of course.  So we do have a server-side
implementation now.

François Yergeau
Alis Technologies Inc., Montréal
Tel: +1 (514) 747-2547
Fax: +1 (514) 747-2561