Re: How browsers display IRI's with mixed encodings

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Mon, 25 Jul 2011 19:22:58 +0900
Message-ID: <4E2D4402.2020809@it.aoyama.ac.jp>
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: Chris Weber <chris@lookout.net>, public-iri@w3.org

On 2011/07/23 5:17, Leif Halvard Silli wrote:
> Chris Weber, Fri, 22 Jul 2011 12:00:47 -0700:
>> On 7/22/2011 4:42 AM, Leif Halvard Silli wrote:

>>> Because, if, in an ISO-8859-1 encoded page, href="D%FCrst" does not work
>>> as well as href="Dürst", then I think HTML5 validators in fact should
>>> warn against use of percent encoding that isn't UTF-8 based.
>>
>> That would probably be ideal but would not provide for raw data that
>> might need to be passed in the IRI, especially the query component.
>
> It is one thing that %FC needs to work (in some sense - like
> quirks-mode pages also have to work even if it is not valid). But if
> there is no good necessary use case for %FC, then we should help authors
> avoid problems by encouraging validators to warn against its use.

See http://www.w3.org/mid/4E2D1050.6030907@it.aoyama.ac.jp.
/People/D%FCrst is a perfectly valid URI. In order for the page author 
to be able to change it to /People/Dürst, the server would have to be 
changed (changing the file/resource name, either manually or maybe by 
installing/configuring something like mod_fileiri,..., i.e. exposing 
legacy-encoded file/resource names on the server as UTF-8, ...). This is
not something a validator can decide.
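
To make the difference concrete, here is a minimal Python sketch (an illustration only, not taken from any validator or server) that percent-encodes the same name from its ISO-8859-1 bytes and from its UTF-8 bytes. The variable names are made up for the example.

```python
from urllib.parse import quote, unquote

name = "Dürst"

# Percent-encode the same characters from two different byte encodings.
latin1_form = quote(name, encoding="iso-8859-1")
utf8_form = quote(name, encoding="utf-8")

print(latin1_form)  # D%FCrst
print(utf8_form)    # D%C3%BCrst

# The legacy form only yields the name again if it is decoded as ISO-8859-1.
# Decoded as UTF-8, the lone byte 0xFC is invalid and becomes U+FFFD.
as_latin1 = unquote(latin1_form, encoding="iso-8859-1")
as_utf8 = unquote(latin1_form, encoding="utf-8", errors="replace")

print(as_latin1)       # Dürst
print(repr(as_utf8))   # 'D\ufffdrst'
```

Both strings are valid URIs, but they name different byte sequences, so which one resolves depends on how the server stores the resource name.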

What a validator might do is recommend changing /People/D%C3%BCrst
to /People/Dürst, but only after we have made sure that IRIs are
really widely implemented.
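
A rough sketch of such a check, in Python, might look like the following. The function name `suggest_iri` and the regular expression are hypothetical, and the sketch is simplified: it only tests whether the non-ASCII percent-escapes form valid UTF-8, and does not check whether the decoded characters are actually allowed in an IRI.

```python
import re
from urllib.parse import unquote_to_bytes

# Runs of percent-escapes whose byte value is 0x80 or higher (non-ASCII).
# ASCII escapes such as %20 or %2F are deliberately left untouched.
NON_ASCII_ESCAPES = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def suggest_iri(path):
    """Return an IRI spelling of `path` if every run of non-ASCII
    percent-escapes is valid UTF-8; otherwise return None."""
    try:
        result, count = NON_ASCII_ESCAPES.subn(
            lambda m: unquote_to_bytes(m.group(0)).decode("utf-8"),
            path,
        )
    except UnicodeDecodeError:
        # Legacy-encoded escapes (for example %FC): nothing to recommend.
        return None
    return result if count else None


print(suggest_iri("/People/D%C3%BCrst"))  # /People/Dürst
print(suggest_iri("/People/D%FCrst"))     # None
```

With this approach /People/D%FCrst is left alone, since rewriting it would require knowing (and changing) the encoding used on the server.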

Regards,    Martin.
Received on Monday, 25 July 2011 10:24:23 UTC