- From: Dan Oscarsson <Dan.Oscarsson@trab.se>
- Date: Wed, 30 Apr 1997 10:45:20 +0200 (MET DST)
- To: masinter@parc.xerox.com
- Cc: uri@bunyip.com
> > This is not right. A directory listing service generates a html document
> > that is sent back to the web browser. All URLs within a html document
> > should use the same character set as the document uses. That is,
> > if the document uses iso 8859-1, the URLs will be in iso 8859-1, and
> > if the document is in UTF-8, the URLs will be in UTF-8.
>
> Dan, for each item in a directory listing, there are two entries.
>
> <A HREF="this-is-the-URL">this-is-what-the-user-sees</A>
>
> The URL in the 'this-is-the-URL' part should use hex-encoded-UTF8,
> no matter what the user sees.
>

If you use hex-encoding, yes. But NOT if you use the native character set
of the document. In that case, the 'this-is-the-URL' part must use the
same character set as the rest of the HTML document. Raw UTF-8 may only be
used in a UTF-8 encoded HTML document, not in an ISO 8859-1 encoded
document.

A large number of HTML documents are written by hand in a text editor. A
user cannot be expected to switch to a different encoding when typing the
URLs in a document.

But I agree that if hex-encoded characters are found in a URL, they should
be UTF-8. Otherwise it would be unclear what encoding is used for
hex-encoded URLs in an ASCII-only HTML document. But an ASCII-only document
may not contain any 8-bit characters in a URL, as there is no defined
character set for them.

Using native encoding in URLs in a known context, and hex-encoded UTF-8
everywhere else (and, if you want, in a known context as well), is what I
understand others on the list also want.

If we cannot use native encoding when typing the URLs in our HTML
documents, very little is gained.

   Dan
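
The distinction drawn in this message can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not something taken from the thread: the file name is a hypothetical example, and `urllib.parse.quote` stands in for whatever a server or browser would do to hex-encode a URL. It shows that the same name gives different hex-encoded forms depending on the character set, which is why hex-encoded octets need one agreed interpretation, while the raw (native) bytes differ per document encoding.

```python
from urllib.parse import quote, unquote

# Hypothetical file name containing non-ASCII characters (assumption).
name = "räksmörgås.html"

# Raw bytes as they would appear in a document of each encoding.
raw_latin1 = name.encode("iso-8859-1")
raw_utf8 = name.encode("utf-8")

# Hex-encoded (percent-encoded) forms of those same bytes.
hex_latin1 = quote(name, encoding="iso-8859-1")
hex_utf8 = quote(name, encoding="utf-8")

print(hex_latin1)  # r%E4ksm%F6rg%E5s.html
print(hex_utf8)    # r%C3%A4ksm%C3%B6rg%C3%A5s.html

# If hex-encoded octets are always read as UTF-8, only the UTF-8 form
# decodes back to the original name; the ISO 8859-1 form does not.
decoded_ok = unquote(hex_utf8, encoding="utf-8")
decoded_bad = unquote(hex_latin1, encoding="utf-8")  # invalid bytes become U+FFFD
```

Here `decoded_ok` equals the original name, while `decoded_bad` contains replacement characters, since the ISO 8859-1 octets are not valid UTF-8.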
Received on Wednesday, 30 April 1997 04:46:03 UTC