- From: Erik van der Poel <erikv@google.com>
- Date: Mon, 14 Sep 2009 12:52:36 -0700
- To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Cc: Anne van Kesteren <annevk@opera.com>, "public-iri@w3.org" <public-iri@w3.org>
On Sat, Sep 12, 2009 at 2:17 AM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
> On 2009/09/11 23:24, Erik van der Poel wrote:
>>
>> On Fri, Sep 11, 2009 at 2:11 AM, Anne van Kesteren <annevk@opera.com>
>> wrote:
>>>
>>> On Fri, 11 Sep 2009 10:57:40 +0200, Martin J.
>>> Dürst <duerst@it.aoyama.ac.jp> wrote:
>>>>
>>>> This essentially says that you MAY send data from a form in the document
>>>> encoding, and was followed well by browsers. It seems that some browser
>>>> implementer along the way extended that to query parts in other URIs
>>>> (which don't have anything to do with <form>), and got stuck with it.
>>
>> I don't know which browser version first did this. I don't even know
>> which vendor. It may have been Netscape or MSIE (or ...).
>>
>> But look at it from the point of view of the server. The server cannot
>> tell whether the incoming URI is from an HTML <form> or an HTML <a>.
>> So it would be nice if the browser treated both the same way.
>
> Of course. But it's not just forms (where we have accept-charset) and
> <a>. It's also stuff typed directly into an address bar, where there is not
> much choice except to assume UTF-8.
>
> So whichever way, the server may have a tough job.

Yes, that is another source. My guess is that servers will receive
these URLs from HTML <form> and <a> much more often than from the UI
address bar, since users don't like to type such long strings, and the
query part is at the end of a pretty long string.

Anyway, we seem to agree that it is too late to change this now.

Erik

>> (Unfortunately, this means that <a href="..."> with query parts that
>> have nothing to do with HTML forms (i.e. name=value[&] pairs) also get
>> converted back to the original encoding, but I agree with Anne that it
>> is too late now to change this.)
>>
>>>> This is especially unlucky as for <form>, you can just say
>>>> accept-charset='utf-8', and get all the data sent in UTF-8,
>>
>> Does this work in all major browsers? (I don't know; I haven't tested it.)
>>
>> Erik
>>
>>
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
>
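
The server-side ambiguity discussed above can be illustrated with a minimal sketch. It assumes a page encoded in ISO-8859-1 and a query value of "Dürst"; the helper name `encode_query_value` is hypothetical, and the code only models the percent-encoding step, not any particular browser's behavior.

```python
from urllib.parse import quote, unquote


def encode_query_value(value: str, charset: str) -> str:
    """Percent-encode a query value after converting it to `charset`.

    Hypothetical helper: it models a client that converts the query part
    to a given encoding (document encoding or UTF-8) before escaping it.
    """
    return quote(value, safe="", encoding=charset)


value = "Dürst"

# Same text, two different byte sequences on the wire.
as_utf8 = encode_query_value(value, "utf-8")          # address bar / accept-charset='utf-8'
as_latin1 = encode_query_value(value, "iso-8859-1")   # query converted to the document encoding

print(as_utf8)    # D%C3%BCrst
print(as_latin1)  # D%FCrst

# The server sees only the escaped bytes and has to guess the encoding.
# Decoding the ISO-8859-1 form as strict UTF-8 fails outright:
try:
    unquote(as_latin1, encoding="utf-8", errors="strict")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Decoding the UTF-8 form as ISO-8859-1 does not fail, but yields mojibake:
mojibake = unquote(as_utf8, encoding="iso-8859-1")
print(decoded_ok, mojibake)
```

Nothing in the request itself tells the server which of the two forms it received, which is why it matters that `<form>` and `<a>` are at least treated the same way.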
Received on Monday, 14 September 2009 19:53:18 UTC