Re: Confusing use of "URI" to refer to IRIs, and IRI handling in the DOM

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 29 Jun 2008 12:03:01 +0200
Message-ID: <48675DD5.1000100@gmx.de>
To: Ian Hickson <ian@hixie.ch>
CC: 'HTML WG' <public-html@w3.org>

Ian Hickson wrote:
> On Sun, 29 Jun 2008, Julian Reschke wrote:
>>>>>> 3. The distinction between HTML5-URL and RFC3987-IRI *is* 
>>>>>> important, because [...] HTML5-URLs can contain spaces
>>>>> No, they're not allowed to contain spaces.
>>>> *Valid* URLs aren't, but the spec spends a considerable amount of 
>>>> space dealing with invalid ones.
>>> Well if you're willing to consider invalid ones, what about invalid 
>>> URIs? They can contain spaces too. What's the distinction between an 
>>> invalid URL and an invalid URI?
>> None? I guess I don't understand the question.
> If valid HTML5 URLs and valid IRIs are equivalent, and invalid HTML5 URLs 
> and invalid IRIs are indistinguishable, then what's the problem?

Valid HTML5 URLs are IRIs.

Invalid HTML5 URLs get special treatment in the spec (note I'm not 
arguing against that treatment). The confusion comes from the fact that 
when the spec says "URL" it really means any URL, not only valid ones.
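
A minimal sketch of that distinction, assuming a deliberately simplified
check: it only looks for literal spaces, which RFC 3987 does not allow in
an IRI, and it is not the parsing algorithm from the HTML5 draft. The
function names and the example.org address are hypothetical.

```python
def has_literal_space(url: str) -> bool:
    """Return True if the string contains a literal space.

    A literal space is enough to make the string an invalid IRI,
    although a spec that handles invalid URLs may still process it.
    """
    return " " in url


def escape_spaces(url: str) -> str:
    """Percent-encode literal spaces so they no longer appear raw."""
    return url.replace(" ", "%20")


candidate = "http://example.org/a b"   # hypothetical invalid URL
repaired = escape_spaces(candidate)    # "http://example.org/a%20b"
```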

>> Understood. The alternative (which I think should be seriously 
>> considered) is to break those pages, and to always use UTF-8 for 
>> encoding.
> I can only spec that if browsers are willing to do it. So far, my 
> understanding is that they are not. (There's no point writing a spec that 
> isn't followed, the whole point of the spec is to define what should 
> happen to get interoperability.)
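
To make the trade-off concrete, here is a small sketch of how the same
non-ASCII character percent-encodes differently depending on the encoding
chosen. It assumes Python's urllib.parse.quote as a stand-in for a
browser's encoder and cp1252 as an example legacy page encoding; neither
assumption comes from the spec, and the helper name is hypothetical.

```python
from urllib.parse import quote


def encode_component(value: str, encoding: str) -> str:
    """Percent-encode a URL component using the given character encoding."""
    return quote(value, safe="", encoding=encoding)


# Always-UTF-8 behaviour: two bytes, two escapes.
utf8_form = encode_component("é", "utf-8")     # "%C3%A9"

# Page-encoding behaviour, assuming a cp1252 page: one byte, one escape.
legacy_form = encode_component("é", "cp1252")  # "%E9"
```

A server that expects the legacy form would misread the UTF-8 form, which
is the breakage discussed above.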

Understood again, but maybe it makes sense to ask the question again, 
now that all browser vendors are actually part of the same specification 

BR, Julian
Received on Sunday, 29 June 2008 10:03:45 UTC
