- From: Leif H Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Wed, 27 Jul 2011 23:53:37 +0300
- To: addison@lab126.com
- Cc: duerst@it.aoyama.ac.jp, chris@lookout.net, public-iri@w3.org
Phillips, Addison 27/7/'11, 4:13

> Making the literal sequence %FC invalid would be a Bad Thing.

Accepted.

..snip...

>> But an author which -today- inserts %FC is likely to do a mistake - or at least
>> make a bad choice, no?
>
> An author who inserts u-umlaut and expects to get %FC is making a mistake.

Yes.

> An author who inserts %FC and expects to see u-umlaut is making a mistake (or should be).

Depends on what you mean by 'expect'. But I guess we agree.

> But an author who inserts %FC because that's what her server expects? Valid.

I see the point.

> And an author who inserts u-umlaut and expects it to display as u-umlaut and send (as %C3%BC in URI form)? Also valid, IMHO.

Why did you add 'IMHO'? Shouldn't this be not only a valid expectation but *the* expected behavior? Did not Martin's test show exactly that for the directly typed IRI? Except for a bug in Opera etc.

Btw, I tested how some text browsers interpret a directly typed <a href="ü"> in an ISO-8859-1 encoded page. Results: all of them (W3M, Lynx, Links, eLinks, netrik) treated it as %FC (and not as %C3%BC). So a bad story for IRI links in legacy encodings there ... in contrast to the situation for GUI browsers.

..snip...

>> My focus is authors. And of course it could be the author meant %FC. But might
>> it not more often be simply a result of a bad %-encoder or on a misconception?
>>
>
> The problem, as I see it, is not with the sequence %FC. It is with the character U+00FC appearing in an HTML document inside a URI path.
>
> I tend to think that the interpretation of %FC using page encoding is bad because an IRI (or URI) lacks the necessary context to make that determination. I agree with Boris's earlier message on the list that showing %FC is a bad user experience. But shouldn't we be trying to close on a well-defined set of behaviors that content authors (and others) can understand?

+1, amen to dropping the display of %FC according to page encoding.
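
For reference, the two byte sequences under discussion can be reproduced with a few lines of Python. This is only a sketch, not something taken from the thread: it uses the standard library's `urllib.parse`, and the variable names are made up for illustration. It shows how U+00FC percent-encodes under UTF-8 versus ISO-8859-1, and why the literal sequence %FC cannot be turned back into a character without knowing which encoding was meant.

```python
from urllib.parse import quote, unquote

CHAR = "\u00fc"  # U+00FC, LATIN SMALL LETTER U WITH DIAERESIS

# Percent-encode the same character under two different encodings.
as_utf8 = quote(CHAR, encoding="utf-8")         # UTF-8 bytes C3 BC
as_latin1 = quote(CHAR, encoding="iso-8859-1")  # single byte FC

# Decode the literal sequence %FC under each assumption.
fc_as_latin1 = unquote("%FC", encoding="iso-8859-1")
# The lone byte FC is not valid UTF-8, so it becomes U+FFFD here.
fc_as_utf8 = unquote("%FC", encoding="utf-8", errors="replace")

print(as_utf8, as_latin1, repr(fc_as_latin1), repr(fc_as_utf8))
```

The sequence %FC only maps back to a u-umlaut if the decoder already assumes ISO-8859-1; under UTF-8 it is an invalid byte, which is the missing-context problem described above.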
> I think such an approach would include the behavior described above, even at the expense of some usability. And who looks at those really long URIs full of percent gunk anyway? :-))

Agree. But I snipped that you said that %FC should be in wide use. And if that is the case, then there could be a lot of legacy content out there which Firefox is motivated to give a fake character display for, no? But how commonly is (or was) e.g. %FC used to point to a "ü-resource"? Not often, I think. Non-ASCII is avoided, even today.

--
Leif H Silli
Received on Wednesday, 27 July 2011 20:54:30 UTC