Re: UTF-8 URL for testing

Francois Yergeau (yergeau@alis.com)
Sun, 13 Apr 1997 21:31:53 -0400


Message-Id: <3.0.1.32.19970413213153.00780d64@genstar.alis.com>
Date: Sun, 13 Apr 1997 21:31:53 -0400
To: masinter@parc.xerox.com
From: Francois Yergeau <yergeau@alis.com>
Subject: Re: UTF-8 URL for testing
Cc: uri@bunyip.com
In-Reply-To: <334DADDC.5CBC@parc.xerox.com>

At 20:19 10-04-97 PDT, Larry Masinter wrote:
>Does Alis provide its documentation online?

Sorry, no.  Only user manuals.

> Can you
>point us to the place where the use of %-hex encoded
>UTF-8 encoded Unicode in URLs is documented?

I'm not sure I understand correctly what "%-hex encoded UTF-8 encoded
Unicode in URLs" means, but here's a pointer that could be useful:

  http://www.alis.com:8085/~yergeau/url-00.html
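
One plausible reading of "%-hex encoded UTF-8" is: take the characters,
encode them as UTF-8 octets, and write each non-ASCII octet as %XX.  The
sketch below illustrates that reading only; it is not taken from the
document above.  The function name and the path "/docs/café.html" are
hypothetical.

```python
from urllib.parse import quote, unquote


def utf8_percent_encode(path: str) -> str:
    """Encode text as UTF-8, then escape each non-ASCII octet as %XX.

    "/" is left alone so it keeps its role as the path separator.
    """
    return quote(path, safe="/", encoding="utf-8")


# Hypothetical path, used only for illustration.
encoded = utf8_percent_encode("/docs/café.html")
print(encoded)  # /docs/caf%C3%A9.html

# Decoding with the same charset recovers the original characters.
print(unquote(encoded, encoding="utf-8"))
```

The single character "é" becomes two escapes, %C3%A9, because UTF-8
represents it with two octets.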

>The URLs you point us to are all in your personal
>area (~yergeau) on alis.com.

Naturally.  That's the area I control.

> Why aren't any of
>the other URLs on the alis site internationalized,
>since it is compatible with current browsers?

Perhaps it has to do with lack of standardization?  Were there that many
HTTP/1.1 implementations before there was a standard, or at least some
language in a serious draft, to point the way?

>Did you try any browsers that didn't work?

No.  I tried the 3 installed on my machine; they all worked.  I expected
that publishing the URL would quickly bring out reports of browsers that
failed.  None so far.

> Do any of the browsers
>display the URLs as anything other than %xx%xx%xx in the 'location' box?

How would they?  Lacking a standard mapping from octets to characters for
xx>127, what characters could they display?

For what it's worth, Tango correctly displays the first link destination in
the status bar when the mouse is over the anchor.
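
To make the point about octets above 127 concrete, here is a small sketch
that decodes the same escaped string under several charsets.  The helper
name, the sample strings, and the choice of charsets are assumptions made
for illustration.

```python
from urllib.parse import unquote_to_bytes


def decode_candidates(escaped, charsets=("utf-8", "iso-8859-1", "cp850")):
    """Decode the octets behind %XX escapes under each candidate charset.

    Returns a dict mapping charset name to the decoded text, or None when
    the octets are not valid in that charset.
    """
    octets = unquote_to_bytes(escaped)
    results = {}
    for name in charsets:
        try:
            results[name] = octets.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# One octet above 127: readable as ISO 8859-1, a different character in
# CP850, and not valid UTF-8 at all.
print(decode_candidates("caf%E9"))

# Two octets above 127: valid UTF-8, but still decodable (as different
# text) under the single-byte charsets.
print(decode_candidates("caf%C3%A9"))
```

Without an agreed mapping, a browser has no basis for choosing among
these results, which is why showing the raw %XX form is the only safe
display.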

>Is there any software anywhere in the world that actually generates
>URLs like these? All of the examples seem to be carefully
>hand-constructed.

Valid point.  We seem to have the client side settled (it works with
unmodified browsers), so let's now deal with servers.  You want software,
so I wrote some.  At the following location:

  ftp://ftp.alis.com/pub/ietf/url/

you will find a Perl script and two mapping files for two character sets
(ISO 8859-1 and CP850).  Installing the script and one of the mappings on a
Unix server should let one create UTF-8 URLs at will.  Instructions are in
the script.
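
For readers who want the general idea without fetching the script, the
sketch below shows one way a name stored in a local character set could
be turned into a UTF-8 URL path segment.  It is not the Perl script and
makes no claim about how that script works; Python's built-in codecs
stand in for the mapping files, and the function and file names are
hypothetical.

```python
from urllib.parse import quote


def local_name_to_utf8_url_segment(raw_name: bytes, charset: str) -> str:
    """Convert a file name in a local charset to a %-escaped UTF-8 segment.

    Step 1: map the local octets to Unicode characters (the role a
            charset mapping table would play).
    Step 2: encode those characters as UTF-8 and escape the octets.
    """
    text = raw_name.decode(charset)
    return quote(text, safe="", encoding="utf-8")


# The same name, "résumé.html", as stored under two different charsets.
latin1_name = b"r\xe9sum\xe9.html"   # ISO 8859-1: é is octet E9
cp850_name = b"r\x82sum\x82.html"    # CP850: é is octet 82

print(local_name_to_utf8_url_segment(latin1_name, "iso-8859-1"))
# r%C3%A9sum%C3%A9.html
print(local_name_to_utf8_url_segment(cp850_name, "cp850"))
# r%C3%A9sum%C3%A9.html
```

Both local encodings end up as the same URL segment, which is the
benefit of fixing UTF-8 as the character encoding used in the URL.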

I've recreated the URL pointed to by the document at

  http://www.alis.com:8085/~yergeau/url_utf8.htm

so that it is *no longer* a "carefully hand-constructed" URL, but one
created using software written specifically for the task of creating UTF-8
URLs.  The script is all that's needed; the HTTP server is unmodified.  Not
that this is the only way to go, of course.  So we do have a server-side
implementation now.


-- 
François Yergeau <yergeau@alis.com>
Alis Technologies Inc., Montréal
Tél : +1 (514) 747-2547
Fax : +1 (514) 747-2561