Re: Error handling in URIs from Anne van Kesteren on 2008-06-24 (uri@w3.org from June 2008)

From: Anne van Kesteren <annevk@opera.com>
Date: Tue, 24 Jun 2008 15:41:54 +0200
To: "Julian Reschke" <julian.reschke@gmx.de>
Cc: "Ian Hickson" <ian@hixie.ch>, uri@w3.org
Message-ID: <op.uc89n4gl64w2qv@annevk-t60.oslo.opera.com>

On Tue, 24 Jun 2008 14:04:05 +0200, Julian Reschke <julian.reschke@gmx.de>  
wrote:
> Or that the definition needs to be moved into a standalone spec.

I guess I don't really see how that's different from fixing the URI spec.


>>>> The second is with IRIs and character encodings other than UTF-8.  
>>>> While browsers reliably encode non-ASCII characters in the path using  
>>>> UTF-8, non-ASCII characters in the query component are encoded using  
>>>> the document's character encoding, and not UTF-8, which is  
>>>> incompatible with how the IRI spec defines things.
>>>
>>> Could you please be more specific? Any URI is an IRI, so a query  
>>> component based on an encoding other than UTF-8 still is a legal IRI.
>> It's also transmitted in an encoding other than UTF-8 (while the path  
>> component _is_ transmitted as UTF-8).
>
> Yes. It's still a legal URI, thus a legal IRI.

I think the problem is that currently no specification says how to  
construct a URI from a bunch of Unicode characters while taking into  
account that the path component always needs to be in UTF-8 and the query  
component in the document encoding.
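
To make the gap concrete, here is a rough Python sketch of the kind of
construction rule I mean. It is only an illustration, not text from any
specification: build_uri and its parameter names are hypothetical, and it
assumes every character in the query can be represented in the document
encoding (quote() raises UnicodeEncodeError otherwise; what browsers do in
that case is not covered here).

```python
from urllib.parse import quote


def build_uri(origin: str, path: str, query: str, doc_encoding: str) -> str:
    """Hypothetical helper: build a URI from Unicode path and query strings.

    The path is percent-encoded from its UTF-8 bytes, while the query is
    percent-encoded from its bytes in the document's character encoding.
    """
    # Path component: always UTF-8, regardless of the document encoding.
    encoded_path = quote(path, safe="/", encoding="utf-8")
    # Query component: bytes come from the document encoding instead.
    # "=" and "&" are left alone so key=value pairs stay readable.
    encoded_query = quote(query, safe="=&", encoding=doc_encoding)
    return f"{origin}{encoded_path}?{encoded_query}"


# The same character ends up as different bytes in the path and the query
# when the document is not UTF-8.
print(build_uri("http://example.org", "/caf\u00e9", "q=caf\u00e9", "iso-8859-1"))
# prints http://example.org/caf%C3%A9?q=caf%E9
```

With a UTF-8 document the two components agree, which is why the mismatch
only shows up for documents in other encodings.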


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>

Received on Tuesday, 24 June 2008 13:42:39 UTC