Re: Confusing use of "URI" to refer to IRIs, and IRI handling in the DOM

Ian Hickson wrote:
> On Sun, 29 Jun 2008, Julian Reschke wrote:
>> 3. The distinction between HTML5-URL and RFC3987-IRI *is* important, because
>>
>> - it affects how identifiers can be delimited; HTML5-URLs can contain
>> spaces
> 
> No, they're not allowed to contain spaces.

*Valid* URLs aren't, but the spec spends a considerable amount of space 
dealing with invalid ones. I understand the intention and the 
difference, but in practice people already put spaces into URLs, and 
this "works" in UAs (in that they convert the spaces to %20).
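
To illustrate the kind of fix-up meant here, a minimal sketch in Python. 
The function name fix_up_spaces is made up for this example, and it only 
covers the literal-space case, not the full error handling the spec 
describes:

```python
def fix_up_spaces(url: str) -> str:
    """Percent-encode literal spaces and leave every other character alone."""
    return url.replace(" ", "%20")


# Technically invalid because of the literal space, yet it "works".
print(fix_up_spaces("http://example.org/my file.html"))
# prints: http://example.org/my%20file.html
```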

>> thus you can't use spaces to delimit them (consider detection of URLs in 
>> plain text, such as email),
> 
> HTML5 uses spaces to delimit URLs in at least two places (ping="" and the 
> cache manifest fallback lines).

Good to know.
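
For the record, a rough sketch of why this matters for the invalid case. 
It assumes a space-delimited attribute such as ping="" is tokenized by 
splitting on whitespace; split_url_list is a hypothetical helper, and 
Python's str.split() splits on a somewhat wider set of whitespace than 
HTML's space characters:

```python
def split_url_list(value: str) -> list:
    """Split a space-delimited list of URLs into individual tokens."""
    return value.split()


# The second entry contains a literal space, so it is cut in two.
ping = "http://a.example/log http://b.example/my file"
print(split_url_list(ping))
# prints: ['http://a.example/log', 'http://b.example/my', 'file']
```

A URL containing a literal space cannot survive this kind of 
delimiting; it would have to be written with %20 first.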

>> - mapping of non-ASCII characters in query parts differs from RFC3987-IRI.
> 
> Only in non-conforming documents.

(In which case documents with valid IRIs become non-conforming when 
they use the wrong document encoding...)
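
A small sketch of the difference, assuming the query part is 
percent-encoded using the document's character encoding, whereas 
RFC 3987 always maps through UTF-8. The helper encode_query_value is 
made up for this example and uses Python's urllib.parse.quote:

```python
from urllib.parse import quote


def encode_query_value(value: str, document_encoding: str) -> str:
    """Percent-encode a query value using the given character encoding."""
    return quote(value, safe="", encoding=document_encoding)


# RFC 3987 mapping: always UTF-8.
print(encode_query_value("é", "utf-8"))       # prints: %C3%A9
# Same character in a document encoded as ISO-8859-1.
print(encode_query_value("é", "iso-8859-1"))  # prints: %E9
```

The same characters in the source thus end up as different octets on 
the wire, depending only on the encoding of the containing document.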

Anyway. It's nice to know (and a good thing) that "valid URLs" in HTML5 
are valid IRIs, but the largest part of HTML5's new URL section defines 
how to handle invalid URLs, so I don't think it makes sense to argue 
that these differences do not exist.

BR, Julian

Received on Sunday, 29 June 2008 09:33:34 UTC