Re: Error handling in URIs

On Tue, 24 Jun 2008 14:04:05 +0200, Julian Reschke wrote:
> Or that the definition needs to be moved into a standalone spec.

I guess I don't really see how that's different from fixing the URI spec.

>>>> The second is with IRIs and character encodings other than UTF-8.  
>>>> While browsers reliably encode non-ASCII characters in the path using  
>>>> UTF-8, non-ASCII characters in the query component are encoded using  
>>>> the document's character encoding, and not UTF-8, which is  
>>>> incompatible with how the IRI spec defines things.
>>> Could you please be more specific? Any URI is an IRI, so a query  
>>> component based on an encoding other than UTF-8 still is a legal IRI.
>> It's also transmitted in an encoding other than UTF-8 (while the path  
>> component _is_ transmitted as UTF-8).
> Yes. It's still a legal URI, thus a legal IRI.

I think the problem is that no specification currently says how to  
construct a URI from a string of Unicode characters while taking into  
account that the path component always needs to be percent-encoded as  
UTF-8 and the query component in the document's character encoding.
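
To make that concrete, here is a rough sketch in Python of the kind of
rule I mean. It is not taken from any specification; build_uri and its
parameters are names I made up for illustration, and it assumes every
character in the query can be represented in the document encoding
(otherwise quote() raises UnicodeEncodeError).

```python
from urllib.parse import quote


def build_uri(origin, path, query, document_encoding):
    """Hypothetical helper: percent-encode path and query differently.

    The path is always encoded as UTF-8; the query is encoded using the
    character encoding of the document the link appeared in.
    """
    # Path: non-ASCII characters become percent-encoded UTF-8 bytes.
    encoded_path = quote(path, safe="/", encoding="utf-8")
    # Query: same characters, but bytes come from the document encoding.
    # "=" and "&" are kept literal so the key/value structure survives.
    encoded_query = quote(query, safe="=&", encoding=document_encoding)
    return f"{origin}{encoded_path}?{encoded_query}"


uri = build_uri("http://example.org", "/caf\u00e9", "q=caf\u00e9", "iso-8859-1")
print(uri)
```

For a document in ISO-8859-1 the same character ends up as %C3%A9 in the
path and %E9 in the query, which is exactly the mismatch discussed above.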

Anne van Kesteren

Received on Tuesday, 24 June 2008 13:42:39 UTC