- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Thu, 21 Jul 2011 21:05:16 +0200
- To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Cc: "public-iri@w3.org" <public-iri@w3.org>
"Martin J. Dürst", Thu, 21 Jul 2011 20:09:25 +0900: You say 'display'. But you seem have in mind at least 2 different things: A) how the link appears visually (and presumably audibly as well) when one hovers above it with the pointing device. B) what gets copied if the use control-clicks (right-click, in the Windows world?) the link and copies it. So, when you say 'FAIL' below, I assume that you meant that it failed both A) and B). If either A) or B) works, then you would not say 'FAIL'. (Because otherwise, I don't understand your interpretation of the results - see below.) > [...] http://www.w3.org/People/D%FCrst. If > interpreted as the page encoding (iso-8859-1 in the test) this might > look like http://www.w3.org/People/Dürst, but this would not be > interoperable because if I copy http://www.w3.org/People/Dürst and > input it again in another browser, it will go somewhere else. Hm. It should be perfectly interoperable to both *display* and *execute* and *copy* the link as ~/Dürst ! The important matter should be the encoding of the name of the Web resource located at ~/Dürst. E.g. let us assume that the name of the resource is Unicode (UTF-8 or UTF-16) encoded. Then, whether the user copies ~/Dürst from a link in a page that is ISO-8859-1 encoded or from one a link in a page that is UTF-8 encoded, should not matter at all: Internally, in the browser, the letter 'ü' would in either case be the same letter! > On the other hand, in the second test, where the href attribute is > http://www.w3.org/People/D%C3%BCrst, it's okay to show > http://www.w3.org/People/Dürst, because that's using UTF-8, but it's > not okay to show http://www.w3.org/People/Dürst, because that would > lead to double encoding (i.e. an URI of > http://www.w3.org/People/D%C3%83%C2%BCrst) when used again somewhere > else. It follows from what I said above that it should be OK, when hovering above the link, to "display" it as 'Dürst' in either case (that is: both Test 1 and Test 2, below). 

Really, browsers should treat links that point to somewhere in the same page differently from links that point to an external page: for links to fragments on the same page, they should follow the 'use the page's internal encoding' approach. (So says HTML5, at least.) But when a link points to an external resource, it would be better to assume that the resource has a Unicode-encoded name and uses UTF-8 internally. Thus, for external links, the browser should convert from the current page's encoding to UTF-8.

> Now for how the various browsers did in my test today:
>
> Opera (11.50, Build 1074, Win7):
> Test 1: http://www.w3.org/People/Dürst (FAIL)
> Test 2: http://www.w3.org/People/Dürst (FAIL)

I don't agree that Opera fails any more than any other browser. In fact, if the focus is 'display', then its treatment of Test 1 is exemplary. For Test 1, when you hover over the link, it displays ~/Dürst, which should be perfectly all right - this is (in fact) the most "napkin-compatible" display! The actual *problem* in Opera's treatment of Test 1 is not that it displays ~/Dürst but that, when you ctrl-click/right-click (or just click) the link in order to copy it (or follow it), you get ~/D%FCrst instead of ~/Dürst.

For Test 2, Opera shows the opposite problem: the link gets copied and executed correctly, but when you hover over it, it renders meaninglessly - it doesn't even display the correct percent encoding.

> Firefox (5.0, Win7):
> Test 1: http://www.w3.org/People/Dürst (FAIL)
> Test 2: http://www.w3.org/People/Dürst (PASS)

The Test 1 behaviour is identical to that of Opera. Thus, the problem for Test 1 is not the 'hover display' but rather how the link gets copied and executed. For Test 2, it displays ~/Dürst. But it *copies* ~/D%C3%BCrst, which is OK, but how napkin-compatible is it? Why not rather copy it as ~/Dürst? That I don't really get.

> IE (8.0.7601.17514, Win7):
> Test 1: http://www.w3.org/People/D%FCrst (PASS)
> Test 2: http://www.w3.org/People/D%C3%BCrst (PASS*)

(No time to test right now.)

> Chrome (12.0.742.122, Win7):
> Test 1: http://www.w3.org/People/D%FCrst (PASS)
> Test 2: http://www.w3.org/People/Dürst (PASS)

In my book, Chrome fails Test 1 because it both renders and executes it as ~/D%FCrst, which is neither napkin-compatible nor Web-compatible.

> Safari (5.0.4 (7533.20.27)):
> Test 1: http://www.w3.org/People/D%FCrst (PASS)
> Test 2: http://www.w3.org/People/Dürst (PASS)

Same problem for Test 1 as in Chrome.

> Chrome and Safari do the right thing from an IRI perspective.

I would like to know why you think so.

> IE is
> okay, but from an IRI perspective, it might try harder for Test 2.
> Firefox gets it half wrong, and Opera gets it fully wrong.
>
> This is a test where arguing about deployed base isn't as important
> as thinking about first principles (IRIs get escaped/unescaped using
> UTF-8), because we need interoperability via copy-paste and via
> write-down-to-napkin-and-input-back-into-address-bar.

Ah, above I only mentioned ctrl-click/right-click. Is it your goal that ~/D%C3%BCrst should be copied as ~/Dürst? Chrome does the opposite: if you try to open <file:///Dürst> and then copy the URL from the URL bar again, it gets copied as <file:///D%C3%BCrst>. Chrome also shows a page saying something like "Don't find the address file:///D%C3%BCrst" - and Safari does the same. Opera and Firefox are more sensible; they say "Doesn't find the address file:///Dürst".

> For the failed
> tests, Opera and Firefox fail on their own terms (keyboarding in the
> address as it was displayed leads to different page than the original
> link).
>
> I haven't yet figured out how this kind of test could be automated,
> but maybe somebody has an idea.
> If there is some javascript
> functionality that makes sure the status bar is activated and then
> can access its content, that should do the job.

Separate tests are needed for

* display/rendering of fragment links
* display/rendering of external links
* display of href=Dürst vs href=Dürst

It is also necessary to test

* copy/paste to/from the URL bar
* how error pages/messages are displayed
* how ctrl-click (right-click) copy works
* how execution works, in comparison

For tests of external links, it is necessary to state whether one links to a resource whose file name

* is Unicode encoded
* follows NFC normalization

I also think that there should be tests, for both internal and external links, of how links that follow NFD normalization are handled, as well as how resources whose file names are NFD normalized are handled.

Idea: it might make sense to display characters that are not NFC-normalized as percent encoded. That way authors/users get a way to check whether they have in fact used a valid, napkin-compatible, NFC-normalized link or not.

For testing page-internal links (that is, #fragment links), you could create an ISO-8859-1 encoded page which contains links to directly typed fragments that begin with a non-ASCII letter from the ISO-8859-1 charset. And then you can test how that same page works if served/interpreted as another legacy 8-bit encoding, such as KOI8-R. This test should compare whether, for instance, in an ISO-8859-1 page, href="#Dürst" would hit both id="Dürst" and id="Dürst".

-- 
Leif H Silli
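The NFC idea above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: display_form is a hypothetical helper name, not anything a browser exposes, and for simplicity it percent-encodes the whole path segment (ASCII letters are left alone by quote) rather than only the offending characters.

```python
import unicodedata
from urllib.parse import quote

def display_form(segment: str) -> str:
    """Show an NFC-normalized segment as typed; otherwise show it percent-encoded."""
    if unicodedata.normalize("NFC", segment) == segment:
        return segment
    # Not NFC: fall back to UTF-8 percent encoding so the difference is visible.
    return quote(segment, encoding="utf-8")

nfc_name = "D\u00fcrst"   # precomposed u-umlaut (NFC)
nfd_name = "Du\u0308rst"  # 'u' followed by a combining diaeresis (NFD)

print(display_form(nfc_name))
print(display_form(nfd_name))
```

The two names look the same on screen but differ in bytes, which is exactly why a visible fallback could help authors notice a link that is not napkin-compatible.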
Received on Thursday, 21 July 2011 19:06:03 UTC