- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Thu, 21 Jul 2011 20:09:25 +0900
- To: "public-iri@w3.org" <public-iri@w3.org>
When browsing for something else, I recently ended up at http://www.w3.org/2001/08/iri-test/LinkShow.html, a test I had made up just about 10 years ago. It checks how browsers display URIs with %-encoding in them. This display happens e.g. in a status bar (e.g. Safari) or in a popup (e.g. Opera when the status bar is not active).

The idea is that %-encoding in URIs has to be interpreted as UTF-8 when converting to IRIs. In the first test, which has http://www.w3.org/People/D%FCrst in the href attribute, the only correct way to display this is therefore as http://www.w3.org/People/D%FCrst, because %FC on its own is not a valid UTF-8 sequence. If interpreted in the page encoding (iso-8859-1 in the test), this might look like http://www.w3.org/People/Dürst, but that would not be interoperable: if I copy http://www.w3.org/People/Dürst and input it again in another browser, it will go somewhere else.

On the other hand, in the second test, where the href attribute is http://www.w3.org/People/D%C3%BCrst, it's okay to show http://www.w3.org/People/Dürst, because that's using UTF-8. It's not okay to show http://www.w3.org/People/DÃ¼rst, because that would lead to double encoding (i.e. a URI of http://www.w3.org/People/D%C3%83%C2%BCrst) when used again somewhere else.

[Because of the shameless self-reference and some behind-the-scenes trickery for backwards compatibility, the actual pages referenced are:

   for http://www.w3.org/People/D%C3%BCrst:
      my (now historical) people page at W3C
   for http://www.w3.org/People/D%FCrst:
      same as above, but via a redirect with some explanations about IRIs
   for http://www.w3.org/People/D%C3%83%C2%BCrst:
      the W3C 404 page, because the actual page doesn't exist

Please keep the distinction between the first two in mind when trying things out.]
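As a rough illustration of the display rule described above, here is a minimal Python sketch. It is not what any browser actually does, and the names display_iri, to_uri and round_trips are made up for this example. It assumes a deliberately conservative policy: a run of %-escapes is shown as characters only if the whole run is valid UTF-8 and decodes to non-ASCII characters; everything else stays escaped.

```python
import re
from urllib.parse import quote

# One or more consecutive %XX escapes, e.g. "%C3%BC" or "%FC".
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _decode_run(match):
    """Decode one run of escapes, or return it unchanged if that is unsafe."""
    escaped = match.group(0)
    raw = bytes.fromhex(escaped.replace("%", ""))
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 (e.g. a lone %FC): keep the escapes as they are.
        return escaped
    # Assumption of this sketch: only show non-ASCII characters, so that
    # escapes such as %2F or %20 keep their meaning in the URI.
    if all(ord(ch) > 0x7F for ch in text):
        return text
    return escaped


def display_iri(uri):
    """Return a form of the URI suitable for display (hypothetical helper)."""
    return _ESCAPE_RUN.sub(_decode_run, uri)


def to_uri(displayed):
    """Convert a displayed address back to a URI, escaping with UTF-8."""
    # "%" is kept as safe so that escapes already present are not re-escaped.
    return quote(displayed, safe=":/?#[]@!$&'()*+,;=%")


def round_trips(uri, displayed):
    """True if typing in the displayed form leads back to the original URI."""
    return to_uri(displayed) == uri
```

With this sketch, display_iri leaves the href of the first test untouched and turns the href of the second test into http://www.w3.org/People/Dürst. round_trips returns False when the first test is displayed with an iso-8859-1 reading, which is the copy-and-paste problem described above.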
Now for how the various browsers did in my test today:

Opera (11.50, Build 1074, Win7):
   Test 1: http://www.w3.org/People/Dürst (FAIL)
   Test 2: http://www.w3.org/People/DÃ¼rst (FAIL)

Firefox (5.0, Win7):
   Test 1: http://www.w3.org/People/Dürst (FAIL)
   Test 2: http://www.w3.org/People/Dürst (PASS)

IE (8.0.7601.17514, Win7):
   Test 1: http://www.w3.org/People/D%FCrst (PASS)
   Test 2: http://www.w3.org/People/D%C3%BCrst (PASS*)

Chrome (12.0.742.122, Win7):
   Test 1: http://www.w3.org/People/D%FCrst (PASS)
   Test 2: http://www.w3.org/People/Dürst (PASS)

Safari (5.0.4 (7533.20.27)):
   Test 1: http://www.w3.org/People/D%FCrst (PASS)
   Test 2: http://www.w3.org/People/Dürst (PASS)

Chrome and Safari do the right thing from an IRI perspective. IE is okay (hence the PASS*), but from an IRI perspective, it might try harder for Test 2. Firefox gets it half wrong, and Opera gets it fully wrong.

This is a test where arguing about deployed base isn't as important as thinking about first principles (IRIs get escaped/unescaped using UTF-8), because we need interoperability via copy-paste and via write-down-to-napkin-and-input-back-into-address-bar. For the failed tests, Opera and Firefox fail on their own terms (keyboarding in the address as it was displayed leads to a different page than the original link).

I haven't yet figured out how this kind of test could be automated, but maybe somebody has an idea. If there is some JavaScript functionality that makes sure the status bar is activated and then can access its content, that should do the job.

Regards, Martin.
Received on Thursday, 21 July 2011 11:10:49 UTC