Re: URL character set

Martin J. Duerst
Thu, 05 Mar 1998 20:04:47 +0900

Message-Id: <>
Date: Thu, 05 Mar 1998 20:04:47 +0900
To: Larry Masinter <>,
From: "Martin J. Duerst" <>
Cc: uri <uri@Bunyip.Com>
In-Reply-To: <023f01bd46e5$19a46060$>
Subject: Re: URL character set

At 12:44 98/03/03 PST, Larry Masinter wrote:
> Al Gilman
> # The restriction to the current RFC-822-header-safe subset of
> # ASCII is temporary under the plans as I hear them.  But it does
> # not make sense to open this up to a schemewise free-for-all or the
> # clients will choke on the necessary library.  Saying that some
> # clients will support some schemes defeats the purpose.  The point
> # of URIs is so that more clients can support more schemes.
> # I think that
> # "Character Set" Considered Harmful
> #
> # may be relevant here.
> Check out 
> I think we might want to remove the idea of a 'new kind of URL', though,
> and call it an EURI. If I get comments this week, I'll try to incorporate 
> them in a revised version.

I am very glad to hear that you are planning to do a revision.
I started work on this, but I never got very far, due to my
move from Switzerland to Japan. I hope to have more time soon.

For the revision, could you please make sure to mention every
place where UTF-8 is already suggested or required for URIs?
As far as I know, these are:

- URNs (the syntax draft)
- New URL schemes in general (the process draft)
- FTP (the ftp i18n draft)
- The HTML 4.0 W3C Recommendation, in particular,
- The XML 1.0 W3C Recommendation, in particular

And maybe others.
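
To make concrete what "UTF-8 for URIs" amounts to in practice, here is
a minimal sketch of the convention as I understand it: convert the
characters to UTF-8 bytes, then %HH-escape every byte that is not
allowed to appear literally. The function name, the `keep` parameter,
and the exact set of unescaped characters are assumptions made for
illustration only; each scheme or specification defines its own safe
set.

```python
# Bytes left unescaped in this sketch (an assumed set, for illustration only).
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-_.~"
)


def utf8_percent_encode(text: str, keep: bytes = b"/") -> str:
    """Encode text as UTF-8, then %HH-escape every byte that is neither
    in UNRESERVED nor in `keep` (hypothetical helper, not from any draft)."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in UNRESERVED or byte in keep:
            out.append(chr(byte))        # plain ASCII byte, copied as-is
        else:
            out.append(f"%{byte:02X}")   # escaped as two uppercase hex digits
    return "".join(out)


print(utf8_percent_encode("/D\u00fcrst"))
```

With this sketch, the path "/Dürst" becomes "/D%C3%BCrst": the single
character "ü" turns into the two UTF-8 bytes C3 BC, each escaped
separately.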

Regards,   Martin.