- From: <Mike_Spreitzer.PARC@xerox.com>
- Date: Tue, 30 Jan 1996 22:06:19 PST
- To: keld@dkuug.dk
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com, Mike_Spreitzer.PARC@xerox.com
I don't understand something about coping with URLs printed in newspapers, on business cards, etc. In Unicode, there are multiple ways to code a given character. For example, Unicode includes Latin-1, which includes O-umlaut. Unicode also has a combining umlaut (diaeresis) mark, so the same character can be coded as the two-code sequence "O, combining umlaut". Do people who enter URLs have to be careful to do so in a certain canonical way? Does a server have to canonicalize the URLs it receives? What about the other parts of a URL (e.g., the FQDN --- does the DNS have to canonicalize lookups)? What about characters that appear similar enough that the printing quality --- and the expertise of the reader --- might not be enough to make the distinction? What about distinctions --- such as that between the Greek letter pi and the math symbol pi --- that are not manifest in a printed glyph?
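To make the first question concrete, here is a minimal sketch in Python of the two codings of O-umlaut and of what canonicalizing them would look like. It assumes the standard `unicodedata` module and normalization form NFC. The `canonical` helper and the `example.org` URLs are hypothetical and only for illustration. Nothing here says that URLs, servers, or the DNS actually do this.

```python
import unicodedata

# Two codings of the same visible character.
precomposed = "\u00d6"   # LATIN CAPITAL LETTER O WITH DIAERESIS (in Latin-1)
decomposed = "O\u0308"   # LATIN CAPITAL LETTER O + COMBINING DIAERESIS


def canonical(text: str) -> str:
    """Hypothetical helper: map canonically equivalent codings to one form (NFC)."""
    return unicodedata.normalize("NFC", text)


# Illustrative URLs only; example.org is a placeholder host.
url_a = "http://example.org/" + precomposed
url_b = "http://example.org/" + decomposed

# Compared code by code, the two URLs differ even though they print alike.
print(url_a == url_b)

# After canonicalization, the two URLs compare equal.
print(canonical(url_a) == canonical(url_b))
```

Normalization of this kind addresses only codings that Unicode defines as canonically equivalent. It does not help with characters that are distinct in Unicode but merely look alike in print.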
Received on Wednesday, 31 January 1996 15:13:12 UTC