Re: resolving the URL mess

From: Sam Ruby <rubys@intertwingly.net>
Date: Fri, 03 Oct 2014 07:18:54 -0400
Message-ID: <542E861E.8090906@intertwingly.net>
To: David Sheets <sheets@alum.mit.edu>, Austin William Wright <aaa@bzfx.net>
CC: Larry Masinter <masinter@adobe.com>, "public-urispec@w3.org" <public-urispec@w3.org>, Anne van Kesteren <annevk@annevk.nl>, John Klensin <klensin@jck.com>
On 10/03/2014 06:39 AM, David Sheets wrote:
> On Fri, Oct 3, 2014 at 2:02 AM, Austin William Wright <aaa@bzfx.net> wrote:
>> On Thu, Oct 2, 2014 at 11:07 AM, Sam Ruby <rubys@intertwingly.net> wrote:
>>> On 10/02/2014 06:05 AM, David Sheets wrote:
>>>>> Anne, Dave Thayer, Sam Ruby, John Klensin come to mind…
>>>> I believe that when we have something to show, we should entice them to
>>>> join us.
>>> +1
>>> At the moment, a non-existent spec doesn't solve any problem that I
>>> currently have.  That's not meant to discourage or encourage you to produce
>>> a spec, but merely an agreement that the time that I would get interested is
>>> when you have something to show.
>>> For what it is worth, examples of problems I do have:
>>> 1) Neither RFC 3986 nor RFC 3987 defines the content you will find here:
>>> https://url.spec.whatwg.org/#api
>> I'm fully behind a standard IRI/URI parsing interface. Let's just release a
>> spec that's dedicated to a WebIDL interface for representing and performing
>> operations on URIs/IRIs, though. There's no reason it needs to be combined
>> into RFC3986/7, any more than XML/HTML and the DOM API need to be combined.
> A WebIDL interface for URI manipulation would certainly be desirable
> but I'm uncertain if it should be within the scope of this group.
> Specifically, I worry that attempting to define a WebIDL interface
> before having a specification covering the existing and aspirational
> behavior of deployed systems may lead us astray. I think a WebIDL
> interface need not be bundled with a specification of the functions on
> URI strings and their interpretation, as you say.

Understood, but be aware that unless that work is done, this effort is
less relevant to me.

The HTML specification is going to reference a specification for link 
and anchor href manipulation, and manipulation includes parsing and 
serialization.  Those rules, in turn, need to be consistent with how user
agents actually parse things purported to be links in the wild.

I don't care if those links are called URIs or URLs.  I do care that the 
spec that the HTML specification references defines the interface in 
terms that web browsers implement, and frameworks like node.js emulate.
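
To make "parsing and serialization" concrete, here is a minimal sketch in
TypeScript of the kind of interface in question. It relies only on the
global URL class that browsers expose and that node.js also provides. The
LinkParts type and the parseLink helper are hypothetical names invented
for this illustration; they come from neither specification.

```typescript
// Hypothetical record of the pieces a link is split into.
interface LinkParts {
  scheme: string;
  host: string;
  path: string;
  query: string;
  fragment: string;
  serialized: string;
}

// Hypothetical helper: parse an href (optionally against a base) with the
// URL class, then read back the components and the serialized form.
function parseLink(href: string, base?: string): LinkParts {
  const url = new URL(href, base); // throws TypeError if parsing fails
  return {
    scheme: url.protocol,
    host: url.hostname,
    path: url.pathname,
    query: url.search,
    fragment: url.hash,
    serialized: url.href,
  };
}

// Scheme and host are lowercased and dot segments are removed.
const normalized = parseLink("HTTP://EXAMPLE.com/a/../b?x=1#frag");
// normalized.serialized === "http://example.com/b?x=1#frag"

// A relative reference is resolved against a base URL.
const relative = parseLink("../c", "http://example.com/a/b");
// relative.serialized === "http://example.com/c"

// Backslashes are accepted as slashes for http, which the generic
// RFC 3986 grammar does not permit.
const lenient = parseLink("http:\\\\example.com\\path");
// lenient.serialized === "http://example.com/path"

// Manipulation: assigning to a component re-serializes the whole URL.
const edited = new URL("http://example.com/");
edited.pathname = "/x y";
// edited.href === "http://example.com/x%20y"
```

The backslash case is one small instance of input that shows up in links
in the wild and that the URL class accepts, while a parser written
strictly from the RFC 3986 grammar would not treat it the same way.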

Certainly, that work could be layered on top of the work envisioned
here.  But only if the rules defined here are compatible.

Here's an analysis of the current incompatibilities between the WHATWG 
definition of URL parsing and the RFC 3986/7 definition of URI parsing:


And here are test results for the WHATWG URL definition:


There is plenty of room for improvement.

- Sam Ruby
Received on Friday, 3 October 2014 11:19:23 UTC
