Re: How browsers display IRI's with mixed encodings

On 2011/07/23 5:17, Leif Halvard Silli wrote:
> Chris Weber, Fri, 22 Jul 2011 12:00:47 -0700:
>> On 7/22/2011 4:42 AM, Leif Halvard Silli wrote:

>>> Because, if, in an ISO-8859-1 encoded page, href="D%FCrst" does not work
>>> as well as href="Dürst", then I think HTML5 validators in fact should
>>> warn against use of percent encoding that isn't UTF-8 based.
>>
>> That would probably be ideal but would not provide for raw data that
>> might need to be passed in the IRI, especially the query component.
>
> It is one thing that %FC needs to work (in some sense - like
> quirks-mode pages also have to work even if it is not valid). But if
> there is no good necessary use case for %FC, then we should help authors
> avoid problems by encouraging validators to warn against its use.

See http://www.w3.org/mid/4E2D1050.6030907@it.aoyama.ac.jp.
/People/D%FCrst is a perfectly valid URI. For the page author to be
able to change it to /People/Dürst, the server would have to be
changed: the file/resource name would have to be changed, either
manually or by installing and configuring something like mod_fileiri,
i.e. exposing legacy-encoded file/resource names on the server as
UTF-8. This is not something a validator can decide.
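
To make the difference concrete, here is a minimal Python sketch (my own
illustration, not something any server or validator is known to run) of
how the same name is escaped under the two encodings. It only uses the
standard urllib.parse module; the variable names are made up for the
example.

```python
from urllib.parse import quote, unquote_to_bytes

name = "Dürst"

# Percent-encode the bytes of the name under each character encoding.
latin1_path = quote(name.encode("iso-8859-1"))  # one byte for ü
utf8_path = quote(name.encode("utf-8"))         # two bytes for ü

print(latin1_path)
print(utf8_path)

# The legacy escape decodes cleanly as ISO-8859-1 ...
legacy_bytes = unquote_to_bytes(latin1_path)
print(legacy_bytes.decode("iso-8859-1"))

# ... but the same bytes are not valid UTF-8, so a server that stores the
# resource under the legacy bytes will not match a UTF-8-based request.
try:
    legacy_bytes.decode("utf-8")
    legacy_is_utf8 = True
except UnicodeDecodeError:
    legacy_is_utf8 = False
print(legacy_is_utf8)
```

The two escaped forms name different byte sequences, which is why the
change has to happen on the server side and not just in the page.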

What a validator might do is recommend changing /People/D%C3%BCrst
to /People/Dürst, but only once we have made sure that IRIs are
really widely implemented.
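
As a rough sketch of what such a check could look like (an assumption on
my side, not the behaviour of any existing validator), the hypothetical
function suggest_iri below unescapes only those runs of percent-escapes
that form valid UTF-8 for non-ASCII characters, such as %C3%BC for ü, and
leaves everything else alone. It is simplified compared to the
URI-to-IRI conversion in RFC 3987: it treats each run of high-byte
escapes as a whole and does not filter out characters that should stay
escaped.

```python
import re
from typing import Optional

# Runs of percent-escapes whose byte values are all >= 0x80.
# ASCII escapes such as %20 or %2F are deliberately not touched.
_HIGH_ESCAPES = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def suggest_iri(uri: str) -> Optional[str]:
    """Return an IRI form of `uri`, or None if there is nothing to suggest."""

    def unescape(match: "re.Match[str]") -> str:
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            # Strict decoding: legacy-encoded bytes raise and stay escaped.
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return match.group(0)

    iri = _HIGH_ESCAPES.sub(unescape, uri)
    return iri if iri != uri else None


print(suggest_iri("/People/D%C3%BCrst"))  # UTF-8-based escape
print(suggest_iri("/People/D%FCrst"))     # legacy escape, left alone
```

A legacy escape like %FC yields no suggestion, which matches the point
above: the validator cannot know what the server expects there.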

Regards,    Martin.

Received on Monday, 25 July 2011 10:24:23 UTC