Re: Confusing use of "URI" to refer to IRIs, and IRI handling in the DOM

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 29 Jun 2008 11:03:44 +0200
Message-ID: <48674FF0.7050306@gmx.de>
To: Justin James <j_james@mindspring.com>
CC: 'Smylers' <Smylers@stripey.com>, 'HTML WG' <public-html@w3.org>

Justin James wrote:
> I have not seen a *single* person on this list say, "hey, this is an

OK, here's one.

> important distinction at a functional level". Every person involved here
> (Brian Smith, Julian Reschke, Smylers, Mark Baker, Phillip Taylor, myself)
> all agree that the distinction is meaningless except in one extremely narrow
> use case: people with an intimate knowledge of the URL/IRI/URI spec(s) who
> are implementing something in which the distinction is important.

1. The distinction between RFC3986-URL and RFC3986-URI is not important.

2. The distinction between RFC3986-URI and RFC3987-IRI *is* important, 
because it affects the allowable characters, and introduces dependencies 
on IDNA.

3. The distinction between HTML5-URL and RFC3987-IRI *is* important, because

- it affects how identifiers can be delimited; HTML5-URLs can 
contain spaces, thus you can't use spaces to delimit them (consider 
detection of URLs in plain text, such as email),

- mapping of non-ASCII characters in query parts differs from RFC3987-IRI.
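
Both points under 3. can be illustrated with a small Python sketch. It is 
only an illustration, not text from any spec: the regular expression, the 
helper names find_urls and encode_query, and the choice of ISO-8859-1 as 
the document encoding are assumptions made for the example.

```python
import re
from urllib.parse import quote

# Naive plain-text URL detector: assumes whitespace ends the identifier,
# which holds for RFC 3986 URIs and RFC 3987 IRIs but not for HTML5-URLs.
URL_IN_TEXT = re.compile(r"https?://\S+")

def find_urls(text):
    """Return every whitespace-delimited http(s) reference found in text."""
    return URL_IN_TEXT.findall(text)

def encode_query(query, encoding):
    """Percent-encode the non-ASCII characters of a query string using the
    given character encoding (hypothetical helper for this example)."""
    return quote(query, safe="=&%", encoding=encoding)

# An HTML5-URL containing a space is cut off at the space.
truncated = find_urls("see http://example.org/my file.html for details")

# RFC 3987 maps non-ASCII characters through UTF-8; a consumer that uses
# the document encoding instead produces different octets.
as_iri = encode_query("q=\u00e9", "utf-8")
as_legacy = encode_query("q=\u00e9", "iso-8859-1")
```

Here truncated holds only "http://example.org/my", and as_iri and 
as_legacy differ ("q=%C3%A9" versus "q=%E9") even though both start from 
the same characters.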

> I posit that this use case is irrelevantly small; it only seems to apply to
> people attempting to write applications that implement a particular spec, or
> maybe people writing an "URIBuilder" type library component or something.

It affects anybody who consumes HTML. Because HTML5-URLs are 
something different, you can't use out-of-the-box URI/IRI 
libraries to process them, and reminding readers of the spec of this 
by *not* using the term URL would be helpful.
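
For comparison, here is roughly what a generic IRI library does when it 
maps an IRI to a URI: IDNA for the host, UTF-8 percent-encoding for 
everything else. This is a minimal sketch using only the Python standard 
library; the function name iri_to_uri is made up for the example, userinfo 
is dropped for brevity, and Python's "idna" codec implements IDNA 2003 only.

```python
from urllib.parse import urlsplit, urlunsplit, quote

# Characters left untouched when percent-encoding; "%" is included so that
# escapes already present in the input are not encoded a second time.
_SAFE = "/?%:@!$&'()*+,;=~-._"

def iri_to_uri(iri):
    """Map an IRI to a URI along the lines of RFC 3987, section 3.1.
    Simplified sketch: userinfo is dropped and no validation is done."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Non-ASCII host labels go through IDNA (ToASCII).
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += ":%d" % parts.port
    # All other components are percent-encoded from their UTF-8 octets,
    # regardless of the encoding of the document the IRI came from.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))

uri = iri_to_uri("http://b\u00fccher.example/caf\u00e9?q=\u00e9")
```

A function like this always encodes the query as UTF-8, so applying it 
unchanged to references taken from an HTML document can give results that 
differ from what browsers do.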

> To "real world" people, this is Yet Another Spec That Shall Be Ignored. By
> trying to find some way to have all of these slightly different items play
> nicely with each other, we're dancing around the elephant in the room (I
> know, Managerial Speak) which is that there should only be one *RI/L spec.
> So let's stop this silly dance, get with the *RI/L group, and tell them,
> "this is broken, please provide us with 1 unified spec that makes sense."
> But for us to keep trying to Band-Aid the broken *RI/L situation within the
> HTML spec itself is pretty pointless. *RI/L is meta to HTML, and not within
> our purview.

The URI/IRI specs aren't broken. Lots of software implements URI/IRI 
processing, and browsers are only one part of it. You simply can't break 
all the other software by making incompatible changes to these specs.

Browsers do not treat URLs as specified, so the best thing is to write 
down what they do, and try to discourage the incompatible processing.

Best regards, Julian
Received on Sunday, 29 June 2008 09:04:30 UTC