Date: Mon, 21 Apr 1997 12:31:48 +0200 (MET DST)
From: "Martin J. Duerst" <firstname.lastname@example.org>
To: Larry Masinter <email@example.com>
Cc: Gary Adams - Sun Microsystems Labs BOS <Gary.Adams@east.sun.com>,
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: <3354813C.firstname.lastname@example.org>
Message-Id: <Pine.SUN.3.96.970421120702.245F-100000@enoshima>

On Wed, 16 Apr 1997, Larry Masinter wrote:

> Gary,
>
> Thanks for going through my questions and giving general answers,
> but they were still pretty generic and hand-waving. In lieu of
> an actual implementation, could you please go through a couple
> of real examples, e.g., for any one of Chinese, Japanese,
> Greek, Hebrew.

[I'll try to answer an example for Japanese, but first I'll answer
the question at the end of Larry's mail.]

> How is this supposed to work, and how does hex-encoded UTF-8 encoded
> actually help make it work?

The question of how *hex-encoded* UTF-8 actually helps is a good one.
Clearly, an advertisement showing a URL full of %HH sequences is no
better than one showing a few English letters, even if we know that
the %HH sequences are the UTF-8 encoding of characters that make a lot
of sense. *hex-encoded* UTF-8, however, is an important preparation
for really using beyond-ASCII letters in URLs. Without this defined
character<->octet conversion, using beyond-ASCII letters will never
work. Once UTF-8 is nailed down, it will work rather smoothly.

> What about Sanyoo depaarto? What do they print as the URL
> for their food shop? How would someone enter that into a browser?

Well, they print something like http://WEB.SANYO.CO.JP/FOODSHOP,
where upper case stands for Japanese characters. Of course, for this
we have to assume that DNS works with characters beyond ASCII, but
that's a separate problem that can be solved (see
draft-duerst-dns-i18n-00.txt).

This is entered as such into a browser.
We assume that the users who are the target of the Sanyoo depaato
food shop page can read Japanese and have equipment that allows them
to input Japanese. I won't go into the details of entering the
corresponding characters; it's a process that Japanese computer users
are very familiar with.

The browser would then convert the Japanese characters into UTF-8,
add %HH encoding, and pass the URL to the resolver machinery, where
the host part would be resolved with DNS, and then the machine at the
corresponding IP number would be contacted with HTTP. That machine
would of course have been set up so that the correct page is returned.

I hope this explanation is detailed enough. If you don't understand
some part of it, please tell us.

Regards,    Martin.
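
As an illustration only, the conversion step described above
(characters to UTF-8 octets, then %HH encoding of those octets) can
be sketched in Python. The path "/食品" is a hypothetical stand-in
for the upper-case FOODSHOP placeholder and is not taken from the
message; the function name is also an assumption. The host part is
left out, since it depends on the separate DNS question.

```python
from urllib.parse import quote, unquote


def to_hex_encoded_utf8(text: str) -> str:
    """Encode text as UTF-8 and write every octet outside the
    unreserved ASCII set as %HH, leaving "/" intact."""
    return quote(text, safe="/", encoding="utf-8")


# Hypothetical path; the original message only uses a placeholder.
path = "/食品"

encoded = to_hex_encoded_utf8(path)
print(encoded)

# Because the character<->octet conversion is fixed to UTF-8, the
# receiving side can recover the original characters unambiguously.
decoded = unquote(encoded, encoding="utf-8")
print(decoded == path)
```

With the conversion fixed to UTF-8, both ends agree on which octets
a given sequence of characters stands for, which is the point of the
"defined character<->octet conversion" mentioned earlier.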