Re: URL character set

Martin J. Duerst (duerst@w3.org)
Thu, 05 Mar 1998 20:04:47 +0900


Message-Id: <199803051103.UAA08470@sh.w3.mag.keio.ac.jp>
Date: Thu, 05 Mar 1998 20:04:47 +0900
To: Larry Masinter <masinter@parc.xerox.com>
From: "Martin J. Duerst" <duerst@w3.org>
Cc: uri <uri@Bunyip.Com>
In-Reply-To: <023f01bd46e5$19a46060$e3d3000d@bronze-208.parc.xerox.com>
Subject: Re: URL character set

At 12:44 98/03/03 PST, Larry Masinter wrote:
> Al Gilman
> # The restriction to the current RFC-822-header-safe subset of
> # ASCII is temporary under the plans as I hear them.  But it does
> # not make sense to open this up to a schemewise free-for-all or the
> # clients will choke on the necessary library.  Saying that some
> # clients will support some schemes defeats the purpose.  The point
> # of URIs is so that more clients can support more schemes.
> 
> # I think that
> 
> # "Character Set" Considered Harmful
> # http://www.w3.org/MarkUp/html-spec/charset-harmful.html
> 
> # may be relevant here.
> 
> Check out ftp://ds.internic.net/draft-masinter-url-i18n-00.txt 
> 
> I think we might want to remove the idea of a 'new kind of URL', though,
> and call it an EURI. If I get comments this week, I'll try to incorporate 
> them in a revised version.

I am very glad to hear that you are planning to do a revision.
I started work on this, but I never got very far, due to my
move from Switzerland to Japan. I hope to have more time soon.

For the revision, could you please make sure to mention
all the places where UTF-8 is already suggested or required
for URIs? As far as I know, these are:

- URNs (the syntax draft)
- New URL schemes in general (the process draft)
- FTP (the ftp i18n draft)
- IMAP (the IMAP URL RFC)
- The HTML 4.0 W3C Recommendation, in particular
    http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2
- The XML 1.0 W3C Recommendation, in particular
    http://www.w3.org/TR/REC-xml#sec-external-ent

And maybe others.
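
To make concrete what "UTF-8 for URIs" amounts to in the documents
listed above, here is a minimal sketch, not taken from any of those
drafts: a non-ASCII character is first encoded as UTF-8, and each
resulting byte is then written as a %HH escape. The helper names
utf8_escape and utf8_unescape are hypothetical and chosen only for
illustration; the sketch assumes Python's standard urllib.parse module.

```python
from urllib.parse import quote, unquote


def utf8_escape(segment: str) -> str:
    """Encode a URI component as UTF-8, then %HH-escape the bytes.

    Hypothetical helper for illustration. With safe="" every byte is
    escaped except ASCII letters, digits, and "_", ".", "-", "~".
    """
    return quote(segment, safe="", encoding="utf-8")


def utf8_unescape(segment: str) -> str:
    """Reverse the escaping, rejecting byte sequences that are not valid UTF-8."""
    return unquote(segment, encoding="utf-8", errors="strict")


# "\u00fc" is LATIN SMALL LETTER U WITH DIAERESIS; its UTF-8 form is C3 BC.
escaped = utf8_escape("D\u00fcrst")
print(escaped)                 # D%C3%BCrst
print(utf8_unescape(escaped))  # prints the original string back
```

The point of fixing the encoding at UTF-8 is that the receiver can turn
the escaped bytes back into characters without any out-of-band charset
information.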

Regards,   Martin.