- From: Keld Jørn Simonsen <keld@dkuug.dk>
- Date: Wed, 31 Jan 1996 17:55:50 +0100
- To: uri@bunyip.com
Glenn Adams writes:

>     The problem still
>     exist that not all characters of the world is in ISO 10646.
>
> Yes, I agree.  That is why I proposed a charset tagging syntax to be
> intrinsic to URLs.

I think there is agreement that we need some charset tagging on URLs.
The problem is how to tag it.

1. I am not sure I understand what Glenn is writing here. Would
   "intrinsic" be in the sense that MIME has for its headers, the
   =?charset?...?= thing? Something along these lines: the first part
   of the URL holds the protocol specification, userid, password,
   domain and port number, and there could be a field for a charset
   as well. This would then be an extension of the URL syntax.
   An example could be after the port number:

       http://www.dkuug.dk:80:utf-8/maits/

2. Another place is at the end of the GET/POST request line in HTTP.
   An example:

       GET http://www.dkuug.dk/maits/ HTTP/1.1 utf-8

3. Yet another place could be in the headers of the GET request:

       GET http://www.dkuug.dk/maits/ HTTP/1.1
       Url-Charset: utf-8

Discussion:

1. is general to all URL usage, so there would be no need to update
   protocols. However, a server using HTTP/1.0 would not understand
   this notation, and thus it would create havoc (I think). The other
   thing is that the URL is not the right place to specify a charset:
   it should not be necessary to specify the charsets of URLs in
   newspapers and on business cards, as we agreed that URLs are
   coding-independent information.

2. is HTTP specific. It may cause some HTTP/1.0 servers to goof, as
   there is a parameter that they do not expect.

3. should be backwards compatible, as servers may ignore headers they
   don't understand (as per the HTTP/1.0 spec), and they have a good
   chance of understanding the URL that is there, possibly in
   semi-official iso-8859-1 anyway (URLs are 7-bit, HTTP is 8-bit
   iso-8859-1 by default).

So basically there is not much difference between 2. and 3.: they are
protocol specific and do not touch URL syntax. I dislike 1. as it
implies writing the encoding in the URL.

keld
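
To make option 3 concrete, here is a minimal sketch in Python of how a client might tag a request and how a server might read the tag. Everything in it is an assumption for illustration: `Url-Charset` is only the header name proposed above and not a standardized header, the function names are hypothetical, the sample path `/maits/blåbær` is invented, and the fallback to iso-8859-1 simply mirrors the "semi-official" default mentioned in the discussion.

```python
from urllib.parse import quote, unquote_to_bytes, urlsplit


def percent_encode_path(path: str, charset: str) -> str:
    """Encode the path in the given charset, then %-escape the resulting bytes."""
    return quote(path.encode(charset), safe="/")


def build_request(host: str, path: str, charset: str) -> str:
    """Option 3: state the charset in a (hypothetical) Url-Charset header."""
    encoded = percent_encode_path(path, charset)
    return (
        f"GET http://{host}{encoded} HTTP/1.1\r\n"
        f"Url-Charset: {charset}\r\n"
        "\r\n"
    )


def decode_request_path(request: str, default_charset: str = "iso-8859-1") -> str:
    """Server side: recover the path text, using Url-Charset if it is present."""
    head = request.partition("\r\n\r\n")[0]
    lines = head.split("\r\n")
    _method, target, _version = lines[0].split(" ")

    charset = default_charset
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "url-charset":
            charset = value.strip()

    # The URL itself stays 7-bit; only the escaped bytes need a charset.
    raw_path = unquote_to_bytes(urlsplit(target).path)
    return raw_path.decode(charset)


if __name__ == "__main__":
    request = build_request("www.dkuug.dk", "/maits/blåbær", "utf-8")
    print(request)
    print(decode_request_path(request))
```

The same path escapes to different bytes under utf-8 and iso-8859-1, which is why some tag is needed. A server that ignores the unknown header still receives a syntactically valid 7-bit URL, but it would interpret the escaped bytes in its own default charset.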
Received on Wednesday, 31 January 1996 11:56:15 UTC