- From: Larry Masinter <masinter@parc.xerox.com>
- Date: Wed, 30 Apr 1997 14:31:30 PDT
- To: Francois Yergeau <yergeau@alis.com>
- CC: uri@bunyip.com
Francois,

I suggested:

><A HREF="this-is-the-URL">this-is-what-the-user-sees</A>
>
>The URL in the 'this-is-the-URL' part should use hex-encoded-UTF8,
>no matter what the user sees.

and you responded:

"That would break with current practice. Please see <http://www.alis.com/~yergeau/url-00.html>, section 4 for a discussion of this issue."

However, I'm not aware of any current practice that does what section 4 suggests, namely:

"This shows the path to be followed with non-ASCII URLs embedded in a text file: simply encode the characters of the URL in the same way as the other characters of the document, i.e. using the CCS of the document. If a character in the URL is not part of the repertoire of this CCS, use URL-encoding of the UTF-8 representation to preserve that character's identity."

You would require a different transcoding mechanism for the URL and for the rest of the document. Normally, transcoding a Unicode document in HTML into ISO-8859-1 requires converting characters outside of 0-255 into numeric character references; however, you are suggesting turning URLs into hex-encoded UTF-8 instead. Right?

Could you clarify what current practice would "break"?
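To make the contrast concrete, here is a minimal Python sketch of the two transcoding behaviours described above, assuming a Unicode source document being converted to ISO-8859-1. The function names are hypothetical, and the second function is only one reading of what section 4 of the draft proposes, not code taken from it.

```python
from urllib.parse import quote


def body_to_latin1(text: str) -> str:
    """Ordinary HTML transcoding: characters outside 0-255 become
    numeric character references (hypothetical helper name)."""
    encoded = text.encode("iso-8859-1", errors="xmlcharrefreplace")
    return encoded.decode("iso-8859-1")


def url_to_latin1(url: str) -> str:
    """One reading of section 4 (an assumption): keep characters that
    ISO-8859-1 can represent, and replace the rest with the hex-encoded
    (percent-escaped) bytes of their UTF-8 representation."""
    pieces = []
    for ch in url:
        if ord(ch) < 256:
            # Representable in the document's character set: left as-is.
            pieces.append(ch)
        else:
            # quote() percent-encodes the UTF-8 bytes of the character.
            pieces.append(quote(ch, safe=""))
    return "".join(pieces)


if __name__ == "__main__":
    sample = "/caf\u00e9/\u65e5"
    print(body_to_latin1(sample))  # body text rule
    print(url_to_latin1(sample))   # URL rule
```

The same character (U+65E5 here) ends up in two different forms depending on whether it appears in body text or inside an HREF, which is the "different transcoding mechanism" the question is about. Under the original suggestion, the URL would instead be hex-encoded UTF-8 throughout, independent of the document's character set.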
Received on Wednesday, 30 April 1997 17:33:04 UTC