Re: How browsers display IRI's with mixed encodings

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Thu, 28 Jul 2011 14:25:51 +0900
Message-ID: <4E30F2DF.3090603@it.aoyama.ac.jp>
To: Leif H Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: addison@lab126.com, chris@lookout.net, public-iri@w3.org

Hello Leif, others,

On 2011/07/28 5:53, Leif H Silli wrote:
> Phillips, Addison 27/7/'11,  4:13

>> And an author who inserts u-umlaut and expects it to display as
>> u-umlaut and send (as %C3%BC in URI form)? Also valid, IMHO.
> Why did you add 'IMHO'? This should not only be a valid expectation but
> *the* expected behavior? Did not Martin's test show exactly that for the
> directly typed IRI?

I agree that the 'IMHO' is unnecessary.

> Except a bug in Opera etc. Btw, I tested how some text browsers
> interprets a directly typed <a href="ü"> in a ISO-8859-1 encoded page.
> Results: all of them (W3M, Lynx, Links, eLinks, netrik) treated it as
> %FC (and not as %C3%BC)

This is what GUI browsers also did ten or more years ago. Text-based
browsers seem to be lagging behind, probably not only on this issue. I
wonder how we could contact the developers of these browsers (if they
are still under development).
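
To make the difference between the two behaviors concrete, here is a
minimal sketch in Python. It is only an illustration, not what any of
the browsers mentioned actually runs; the helper name percent_encode is
made up for this example. It percent-encodes "ü" once via the page
encoding (ISO-8859-1) and once via UTF-8:

```python
from urllib.parse import quote, unquote


def percent_encode(text: str, charset: str) -> str:
    """Convert text to bytes in the given charset, then percent-encode them.

    Hypothetical helper for illustration only.
    """
    return quote(text, safe="", encoding=charset)


# Legacy behavior: bytes come from the page encoding (ISO-8859-1 here).
legacy = percent_encode("ü", "iso-8859-1")

# Behavior expected for IRIs: bytes come from UTF-8.
modern = percent_encode("ü", "utf-8")

print(legacy)  # %FC
print(modern)  # %C3%BC

# %FC is not a valid UTF-8 sequence, so a UTF-8 decoder cannot recover "ü";
# Python substitutes U+FFFD by default. Decoding as ISO-8859-1 does work.
print(unquote(legacy, encoding="iso-8859-1"))  # ü
print(unquote(legacy, encoding="utf-8") == "\ufffd")  # True
```

The last two lines show why a %FC link is ambiguous for display: whoever
receives it has to guess the encoding to show "ü" at all.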

> But I snipped that you said that %FC should be in wide use. And if that
> is the case, then there could be a lot of legacy content out there which
> Firefox is motivated to give a fake character display for, no?
> But how commonly are -or where- e.g. %FC used to point to a
> "ü-resource"? Not often, I think. Non-ascii is avoided, even today.

In practice, it's definitely first and foremost ASCII only. Beyond that,
I don't have any statistics. Maybe somebody from Google has some?

Regards,   Martin.
Received on Thursday, 28 July 2011 05:27:17 UTC