- From: Erik van der Poel <erikv@google.com>
- Date: Mon, 23 Nov 2009 08:47:52 -0800
- To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Cc: Shawn Steele <Shawn.Steele@microsoft.com>, Larry Masinter <masinter@adobe.com>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>, Pete Resnick <presnick@qualcomm.com>, Ted Hardie <ted.ietf@gmail.com>
Hello Martin,

I agree that there is no need for a standards track document to prohibit moving to a cleaner state. But in the meantime, do you think we should document the "most prudent thing to do"? Would that be in a BCP? Should that BCP be updated when the most prudent thing to do changes?

I also agree that there are many different things that we do with IRIs, URIs and their components. I guess they cannot all be covered by one document. Some of these things should probably be in an HTTP spec, some should be in a mailto: spec, and so on. Maybe that's just stating the obvious.

Erik

On Mon, Nov 23, 2009 at 3:05 AM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
> Hello Erik,
>
> I agree that for HTTP proxies, and for the Host: header in the HTTP
> protocol, at the current point in time, using punycode is the most prudent
> thing to do. I don't have any problem putting this as an example into the
> new spec, but I don't want this current state of affairs to prohibit
> implementations from moving to a cleaner state.
>
> On the other hand, I also have to agree with Shawn. The various ways in
> which IRIs and URIs and their components can be used within an application
> are simply too many for us to prescribe "one true way" to handle this.
>
> In my own implementation experience, when I added IDNA support to Amaya, I
> relied on it to convert IRIs internally to use %-encoding (without trying to
> analyze the IRI further), and then caught that %-encoding deep down in
> libwww (the network library on which Amaya relies) and converted it back to
> UTF-8 and then to punycode.
>
> I expect that other applications may do similar things, or they may do
> completely different things, because they have a different structure. The
> various buggy behaviors that I got when testing %-encoding in domain names
> with Firefox and Safari seem to support Shawn's point that internally to the
> application, various different forms and conversions may exist.
>
> Regards, Martin.
>
> On 2009/11/22 13:24, Erik van der Poel wrote:
>>
>> On Sat, Nov 21, 2009 at 11:02 AM, Shawn Steele
>> <Shawn.Steele@microsoft.com> wrote:
>>>
>>> I'm still not sure that requiring punycode for URIs is helpful.
>>> [...]
>>> So saying "you MUST" do .... when converting an IRI to a URI doesn't
>>> seem very helpful to me. If IDN use doesn't currently do that already
>>> I don't think people are going to change the system, risking
>>> instability, to fix (or maybe break) a downgrade scenario for
>>> compatibility in older software.
>>
>> One scenario where an IRI is converted to a URI that contains a host
>> name is when a browser is using an HTTP proxy. (When there is no
>> proxy, the browser sends a relative URI in the GET request and puts
>> the host name in the Host header.)
>>
>> So I tried IE8 with an HTTP proxy, and it turns out that it converts
>> the host name to Punycode. Do you think IE9 should send the host name
>> in UTF-8 when using a proxy? What if the proxy is old, and doesn't
>> know how to convert from UTF-8 to Punycode?
>>
>> Erik
>>
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
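
For readers following the thread, the conversion chain Martin describes (IRI to %-encoding, then back to UTF-8, then to punycode for the host) and the proxy versus no-proxy request forms Erik mentions can be sketched roughly as below. This is only an illustration, not Amaya's or libwww's actual code: the function names and the example IRI are hypothetical, and Python's built-in "idna" codec (which follows IDNA 2003) stands in for whatever IDNA library an application would really use.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def iri_to_percent_encoded(iri: str) -> str:
    """Step 1 (hypothetical helper): %-encode every non-ASCII character
    as UTF-8 without analysing the IRI's structure any further."""
    # Keep URI delimiters and existing %-escapes untouched.
    return quote(iri, safe="/:?#[]@!$&'()*+,;=%-._~")


def host_to_punycode(uri: str) -> str:
    """Step 2 (hypothetical helper): find the %-encoded host, turn it
    back into UTF-8 text, then convert it to its punycode (ACE) form.
    Assumption: userinfo is not handled; only host and port are kept."""
    parts = urlsplit(uri)
    host = unquote(parts.hostname or "")            # %-encoding -> Unicode
    ascii_host = host.encode("idna").decode("ascii")  # Unicode -> xn--...
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


def request_head(uri: str, via_proxy: bool) -> str:
    """Build the start of an HTTP/1.1 GET request. With a proxy the
    request line carries the full URI; without one it carries only the
    path, and the host name travels in the Host header."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    target = uri if via_proxy else path
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


if __name__ == "__main__":
    iri = "http://bücher.example/päth"  # made-up example IRI
    uri = host_to_punycode(iri_to_percent_encoded(iri))
    print(uri)
    print(request_head(uri, via_proxy=True))
    print(request_head(uri, via_proxy=False))
```

Note that only the host is converted to punycode; the path keeps its %-encoded UTF-8 form, which matches the split between host handling and the rest of the URI discussed in the thread.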
Received on Monday, 23 November 2009 16:48:28 UTC