Re: Error handling in URIs from Julian Reschke on 2008-06-24 (uri@w3.org from June 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Tue, 24 Jun 2008 21:32:50 +0200
To: Ian Hickson <ian@hixie.ch>
CC: uri@w3.org
Message-ID: <48614BE2.6030608@gmx.de>

Ian Hickson wrote:
>> Could you please be more specific? Any URI is an IRI, so a query 
>> component based on an encoding other than UTF-8 still is a legal IRI.
> 
> The IRI spec would have the query component always encoded as UTF-8, as I 
> understand it.

IRIs consist of Unicode characters. UTF-8 only enters the picture when 
an IRI is converted to a URI.
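
As a rough illustration of that conversion step, here is a minimal Python 
sketch. The helper name iri_to_uri and the example.org address are made up 
for this example, and it is deliberately simplified: it ignores any special 
handling of internationalized host names and only percent-escapes the UTF-8 
bytes of characters that are not allowed in a URI.

```python
from urllib.parse import quote

# ASCII characters that must pass through untouched: URI reserved
# characters plus "%" so that existing percent-escapes are preserved.
# (Letters, digits and "-._~" are never escaped by quote().)
_KEEP = "!#$%&'()*+,/:;=?@[]"

def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: map an IRI (a string of Unicode characters)
    to a URI by encoding non-ASCII characters as UTF-8 and
    percent-escaping the resulting bytes."""
    return quote(iri, safe=_KEEP, encoding="utf-8")

iri = "http://example.org/résumé?q=é"
uri = iri_to_uri(iri)
print(uri)  # http://example.org/r%C3%A9sum%C3%A9?q=%C3%A9
```

Note that a string which is already a plain ASCII URI comes out unchanged, 
which matches the point that any URI is also an IRI.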

If you start with a URI *or* IRI, and then append query parameters, you 
always have the choice not to use non-ASCII characters, and to decide 
yourself what character encoding to use before percent-escaping.

Sure, that's not pretty, but it yields both a legal URI and IRI.
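
To make that concrete, a small Python sketch under stated assumptions: the 
helper name escape_query_value, the parameter name q, and the example.org 
address are invented for illustration. It shows the same value being 
percent-escaped from two different character encodings; both results 
contain only ASCII characters.

```python
from urllib.parse import quote

def escape_query_value(value: str, encoding: str) -> str:
    """Hypothetical helper: encode the value in the chosen character
    encoding first, then percent-escape every byte that is not an
    unreserved ASCII character."""
    return quote(value, safe="", encoding=encoding)

base = "http://example.org/search"  # assumed example address
value = "café"

# Same characters, two different byte sequences before escaping.
uri_utf8 = base + "?q=" + escape_query_value(value, "utf-8")
uri_latin1 = base + "?q=" + escape_query_value(value, "iso-8859-1")

print(uri_utf8)    # http://example.org/search?q=caf%C3%A9
print(uri_latin1)  # http://example.org/search?q=caf%E9
```

Nothing in either result tells the recipient which encoding was used, so 
the server has to know it by some other means.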

Now, that being said, is there anything HTML5 could do so we can get 
closer to a strict UTF-8 world in the future? Such as allowing servers 
to serve documents in an encoding other than UTF-8, but still get query 
parameters to be consistently encoded in UTF-8?

BR, Julian

Received on Tuesday, 24 June 2008 19:33:41 UTC