- From: Dan Oscarsson <Dan.Oscarsson@trab.se>
- Date: Wed, 30 Apr 1997 08:52:17 +0200 (MET DST)
- To: uri@bunyip.com, masinter@parc.xerox.com
> Since no one else has, here's a rough draft of a UTF-8 URL
> internet-draft, which I intend to submit in a few days time,
> after taking another pass on it.
>
>
> -----
> INTERNET-DRAFT                     Larry Masinter, Xerox Corporation
> draft-masinter-url-i18n-00xx                          April 27, 1997
> Expires: October 27, 1997

> 3.2 Requirements for URL generation and interpretation
>
> Systems that are offering resources through the internet
> where those resources have logical names sometimes offer
> the ability to generate URLs for the resources they offer.
> For example, some HTTP servers offer the ability to
> generate a 'directory listing' for file directories
> under their purvue, and then to respond to the generated
> URLs with the files. If the names of the files consist
> solely of US-ASCII characters, the transcription is
> simple, but other file systems offer a wider variety
> of characters. It is recommended that the generation
> of directories result in hex-encoded UTF-8 for non-USASCII
> characters in the listing, and that the interpretation
> of URLs accept both the raw UTF-8 or the hex-encoded version.
>

This is not right. A directory listing service generates an HTML document
that is sent back to the web browser. All URLs within an HTML document
should use the same character set as the document itself. That is, if
the document uses ISO 8859-1, the URLs will be in ISO 8859-1, and if the
document is in UTF-8, the URLs will be in UTF-8.

If the browser knows how to handle the character set of the HTML document,
it should also know how to translate the embedded URLs into UTF-8 when the
user follows a link.

In general, URLs used without a context that defines the characters used
should be encoded using UTF-8. URLs used within a context where the meaning
of the characters is defined should use the character encoding of that
context.

   Dan
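
The two positions above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not code from the draft or from any server or browser: the function names and the sample file name are hypothetical, and Python's standard urllib.parse module stands in for whatever a real implementation would use. The first two functions follow the draft (generate hex-encoded UTF-8, accept both raw and hex-encoded forms); the third follows the reply (a URL written in the document's character set is translated to UTF-8 when the link is followed).

```python
from urllib.parse import quote, unquote


def encode_name_for_listing(name: str) -> str:
    """Draft's recommendation: hex-encode the UTF-8 bytes of a file name.

    Hypothetical helper; safe="" means reserved characters such as "/"
    are escaped too, since this is a single path segment.
    """
    return quote(name, safe="", encoding="utf-8")


def decode_requested_name(segment: str) -> str:
    """Draft's recommendation: accept raw UTF-8 or the hex-encoded form.

    Characters that are already unescaped pass through unchanged;
    %XX escapes are decoded as UTF-8.
    """
    return unquote(segment, encoding="utf-8")


def link_bytes_to_utf8_url(raw: bytes, document_charset: str) -> str:
    """Reply's position: the URL uses the document's character set, and
    the browser translates it to UTF-8 when the link is followed.

    Hypothetical helper; the result is shown hex-encoded for transport.
    """
    text = raw.decode(document_charset)   # interpret in document context
    return quote(text, safe="/", encoding="utf-8")


if __name__ == "__main__":
    name = "räksmörgås.html"              # sample name, an assumption
    print(encode_name_for_listing(name))
    print(decode_requested_name("r%C3%A4ksm%C3%B6rg%C3%A5s.html"))
    print(link_bytes_to_utf8_url(name.encode("iso-8859-1"), "iso-8859-1"))
```

In this sketch both approaches end with the same UTF-8 bytes reaching the server; they differ only in whether the conversion happens when the listing is generated or when the browser follows the link.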
Received on Wednesday, 30 April 1997 02:53:11 UTC