Re: Confusing use of "URI" to refer to IRIs, and IRI handling in the DOM

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 29 Jun 2008 11:32:49 +0200
Message-ID: <486756C1.2080304@gmx.de>
To: Ian Hickson <ian@hixie.ch>
CC: 'HTML WG' <public-html@w3.org>

Ian Hickson wrote:
> On Sun, 29 Jun 2008, Julian Reschke wrote:
>> 3. The distinction between HTML5-URL and RFC3987-IRI *is* important, because
>>
>> - it affects how identifiers can be delimited; HTML5-URLs can contain
>> spaces
> 
> No, they're not allowed to contain spaces.

*Valid* URLs aren't, but the spec spends a considerable amount of space 
dealing with invalid ones. I understand the intention and the 
difference, but for all practical purposes, people already put spaces 
into URLs, and this "works" in UAs (in that they convert them to %20).
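
To make the "%20" point concrete, here is a minimal sketch of the kind of repair a UA performs on such an invalid URL. The helper name escape_spaces and the example.org address are mine, for illustration only; this is not the algorithm from the spec, just the observable effect for literal spaces.

```python
def escape_spaces(url):
    """Replace each literal space with %20 and leave everything else untouched.

    Hypothetical helper: it only mimics the visible result for spaces,
    not the full error handling a real UA applies.
    """
    return url.replace(" ", "%20")


# An invalid URL of the kind people write in practice (made-up address).
raw = "http://example.org/my file.html"
fixed = escape_spaces(raw)
print(fixed)
```

The result no longer contains a space, which is why such links appear to "work" even though the original string was not a valid URL.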

>> thus you can't use spaces to delimit them (consider detection of URLs in 
>> plain text, such as email),
> 
> HTML5 uses spaces to delimit URLs in at least two places (ping="" and the 
> cache manifest fallback lines).

Good to know.
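
For what it's worth, this is the interaction I had in mind, as a rough sketch. The function split_url_list is hypothetical and simply splits on whitespace, which is my assumption about how a space-delimited URL list would be tokenized; the addresses are made up.

```python
def split_url_list(value):
    """Split a space-delimited attribute value into URL tokens.

    Hypothetical helper: str.split() with no argument splits on any run
    of whitespace, which is assumed here to stand in for the real rules.
    """
    return value.split()


# Two URLs without spaces come back as two tokens, as intended.
clean = split_url_list("http://example.org/a http://example.org/b")

# One (invalid) URL containing a space is cut into two unrelated tokens.
broken = split_url_list("http://example.org/my file.html")

print(clean)
print(broken)
```

So space delimiting is only safe where the URLs are known not to contain spaces; an invalid-but-tolerated URL cannot be carried through such a list unchanged.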

>> - mapping of non-ASCII characters in query parts differs from RFC3987-IRI.
> 
> Only in non-conforming documents.

(In which case documents with valid IRIs become non-conforming when they 
use the wrong document encoding...)
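
A small sketch of why the document encoding matters here. It assumes, as I read the draft, that the query part is percent-encoded using the document's character encoding, whereas the RFC 3987 mapping always goes through UTF-8. The function encode_query_value is a hypothetical helper; urllib.parse.quote is the standard Python library call.

```python
from urllib.parse import quote


def encode_query_value(value, document_encoding):
    """Percent-encode a query value using the given character encoding.

    Hypothetical helper illustrating the assumption that the document
    encoding decides which bytes end up percent-encoded in the query.
    """
    return quote(value, safe="", encoding=document_encoding)


value = "caf\u00e9"  # "café", written with an escape to keep the source ASCII

as_utf8 = encode_query_value(value, "utf-8")         # what RFC 3987 produces
as_latin1 = encode_query_value(value, "iso-8859-1")  # a Latin-1 document

print(as_utf8)
print(as_latin1)
```

The same characters yield "caf%C3%A9" in one case and "caf%E9" in the other, so the identical IRI text names different resources depending on the encoding of the document that contains it.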

Anyway. It's nice to know (and a good thing) that "valid URLs" in HTML5 
are valid IRIs, but the biggest part of HTML5's new URL section is the 
definition of handling invalid URLs, so I don't think it makes sense to 
argue that all these differences do not exist.

BR, Julian
Received on Sunday, 29 June 2008 09:33:34 GMT
