Re: How browsers display IRI's with mixed encodings

Martin J. Dürst 25/7/'11,  13:55:
> On 2011/07/23 5:17, Leif Halvard Silli wrote:
> 
>> It is one thing that %FC needs to work (in some sense - like
>> quirks-mode pages also have to work even if it is not valid). But if
>> there is no good, necessary use case for %FC, then we should help authors
>> avoid problems by encouraging validators to warn against its use.
> 
> There's nothing invalid with %FC.

My suggestion was that it should *become* invalid, or at least trigger a warning, in (let's say) HTML5 documents.

> A URI that contains %FC is perfectly valid (check RFC 3986). Because it's a valid URI, it's also a valid IRI.

But an author who inserts %FC today is likely making a mistake, or at least a bad choice, no?
 
> And it's useful in some circumstances. Imagine a server where all the resource names are encoded in iso-8859-1 (or any other legacy (single-byte) encoding). What you tell http (or whatever other scheme/protocol) by using %FC is that you want the resource with the name with the <0xFC> byte in it.

How common are such servers these days? 

My focus is authors. Of course, the author may really have meant %FC. But isn't it more often simply the result of a bad %-encoder or of a misconception?
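
To make the difference concrete, here is a minimal sketch in Python, using only the standard urllib.parse module. It shows how the same character gets a different %-escape depending on the encoding the encoder assumed, and one way a checker might flag escapes that do not decode as UTF-8. The helper name escapes_are_utf8 is hypothetical, made up for illustration; it is not something any validator is known to implement.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# The same character percent-encodes differently depending on the charset.
latin1_form = quote("ü", encoding="iso-8859-1")  # one byte, 0xFC
utf8_form = quote("ü", encoding="utf-8")         # two bytes, 0xC3 0xBC

# On the wire, %FC is just the byte 0xFC; the escape carries no charset.
raw = unquote_to_bytes("%FC")

# A server with iso-8859-1 resource names would read that byte as "ü" ...
as_latin1 = unquote("%FC", encoding="iso-8859-1")
# ... while a UTF-8 decoder sees an invalid sequence (U+FFFD replacement).
as_utf8 = unquote("%FC", encoding="utf-8", errors="replace")


def escapes_are_utf8(component: str) -> bool:
    """Hypothetical check: do the percent-decoded bytes form valid UTF-8?

    A False result does not make the URI invalid; it only suggests that
    the escapes may come from a legacy, non-UTF-8 encoder.
    """
    try:
        unquote_to_bytes(component).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

With this, escapes_are_utf8("M%FCnchen") is False and escapes_are_utf8("M%C3%BCnchen") is True, which is roughly the distinction a warning (as opposed to an error) could be based on.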

Leif
 

Received on Tuesday, 26 July 2011 21:13:16 UTC