Re: Confusing use of "URI" to refer to IRIs, and IRI handling in the DOM

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 29 Jun 2008 11:42:49 +0200
Message-ID: <48675919.2060807@gmx.de>
To: Ian Hickson <ian@hixie.ch>
CC: 'HTML WG' <public-html@w3.org>

Ian Hickson wrote:
> On Sun, 29 Jun 2008, Julian Reschke wrote:
>> Ian Hickson wrote:
>>> On Sun, 29 Jun 2008, Julian Reschke wrote:
>>>> 3. The distinction between HTML5-URL and RFC3987-IRI *is* important,
>>>> because
>>>> - it affects how identifiers can be delimited; HTML5-URLs 
>>>> can contain spaces
>>> No, they're not allowed to contain spaces.
>> *Valid* URLs aren't, but the spec spends a considerable amount of space
>> dealing with invalid ones.
> Well if you're willing to consider invalid ones, what about invalid URIs? 
> They can contain spaces too. What's the distinction between an invalid URL 
> and an invalid URI?

None? I guess I don't understand the question.

>>>> - mapping of non-ASCII characters in query parts differs from 
>>>> RFC3987-IRI.
>>> Only in non-conforming documents.
>> (In which case documents with valid IRIs become non-conforming when using 
>> the wrong document encoding...)
> Right, otherwise documents with valid IRIs but non-UTF-8 encodings 
> wouldn't be treated as per the IRI spec, which is bad (presumably) and 
> shouldn't be encouraged, and should be brought to the author's attention.

Understood. The alternative (which I think should be seriously 
considered) is to break those pages, and to always use UTF-8 for encoding.
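
To make the difference concrete, here is a minimal Python sketch of the two
query mappings discussed above. It assumes a page encoded as ISO-8859-1
(picked only for illustration) and uses a hypothetical helper,
encode_query_value, that is not taken from either spec. It percent-encodes
the same query value once via UTF-8, as RFC 3987 does, and once via the
document encoding.

```python
from urllib.parse import quote


def encode_query_value(value: str, encoding: str) -> str:
    """Percent-encode a query value after converting it to bytes in `encoding`.

    Hypothetical helper for illustration only; real URL processing involves
    more steps than this single call.
    """
    # safe="" means every reserved character is escaped as well.
    return quote(value, safe="", encoding=encoding)


value = "café"

# RFC 3987 style mapping: always go through UTF-8.
print(encode_query_value(value, "utf-8"))       # caf%C3%A9

# Document-encoding mapping, assuming an ISO-8859-1 page.
print(encode_query_value(value, "iso-8859-1"))  # caf%E9
```

The two results name different byte sequences, so a server expecting one
form will not match the other. Always using UTF-8 would remove the
ambiguity, at the cost of changing what existing non-UTF-8 pages send.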

BR, Julian
Received on Sunday, 29 June 2008 09:43:32 UTC