Re: resolving the URL mess

From: Austin William Wright <aaa@bzfx.net>
Date: Thu, 2 Oct 2014 18:02:52 -0700
Message-ID: <CANkuk-U4-9Q35AqJNEDn6U7SY5Y0NPe=GF5BZoFgu1Ka=e5nKQ@mail.gmail.com>
To: Sam Ruby <rubys@intertwingly.net>
Cc: David Sheets <sheets@alum.mit.edu>, Larry Masinter <masinter@adobe.com>, "public-urispec@w3.org" <public-urispec@w3.org>, Anne van Kesteren <annevk@annevk.nl>, John Klensin <klensin@jck.com>
On Thu, Oct 2, 2014 at 11:07 AM, Sam Ruby <rubys@intertwingly.net> wrote:

> On 10/02/2014 06:05 AM, David Sheets wrote:
>
>>
>>  Anne, Dave Thayer, Sam Ruby, John Klensin come to mind…
>>>
>>
>> I believe that when we have something to show, we should entice them to
>> join us.
>>
>
> +1
>
> At the moment, a non-existent spec doesn't solve any problem that I
> currently have.  That's not meant to discourage or encourage you to produce
> a spec, but merely an agreement that the time that I would get interested
> is when you have something to show.
>
> For what it is worth, examples of problems I do have:
>
> 1) Neither RFC 3986 nor RFC 3987 define the content you will find here:
> https://url.spec.whatwg.org/#api


I'm fully behind a standard IRI/URI parsing interface. Let's just release a
spec that's dedicated to a WebIDL interface for representing and performing
operations on URIs/IRIs, though. There's no reason it needs to be combined
into RFC3986/7, any more than XML/HTML and the DOM API need to be combined.

Not to mention, the referenced document misuses several well-defined terms.
I'm not sure what a "pathname" or "hash" is; it probably intends "path"
(or possibly "hier-part") and "fragment", the terms RFC 3986 defines.
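
To make the RFC 3986 vocabulary concrete, here is a minimal sketch that
splits a URI reference into the five components the RFC names, using the
regular expression given in RFC 3986, Appendix B. The function name
split_uri_reference is a hypothetical one chosen for illustration; it is
not part of any spec or library.

```python
import re

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so it matches any string that contains no line breaks.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(ref):
    """Return the RFC 3986 components of ref (hypothetical helper).

    A component that is absent is reported as None, which keeps it
    distinct from a component that is present but empty.
    """
    m = URI_REFERENCE.match(ref)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/?q=1#Related")
```

In these terms, what the referenced API calls "pathname" corresponds to
"path", and "hash" corresponds to "fragment" with the leading "#" kept.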


>
>
> 2) The WHATWG URL Living Standard makes a large number of normative
> statements, particularly concerning parsing, that do not reflect current
> implementations.
>
>
Most libraries, including mine, implement RFC3986 and RFC3987 to the
letter. And many programs _depend_ on this behavior.

My proposal for this CG was to do a formal survey of implementations and
determine where they are compatible and where they are not. By and large,
I suspect that violations of RFC3986 are the exception rather than the
rule, that those violations tend to be isolated occurrences, and that
convergence of behavior is easy to implement, especially for well-formed
URIs and URI References.

The survey would be started by examining the actual code or logic of all
the known parsers, and crafting a test suite that covers all of their
branches. The survey would be value-free: For instance, one application
might want to raise an error on an invalid character; another might want to
split URIs by whitespace; another might find it desirable to encode to "+"
or "%20" or "_" depending on the context; we can't really say, and it
doesn't really matter for the purpose of conducting the survey.
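
As a rough sketch of what such a value-free harness might look like, the
following runs each test input through every registered parser and records
what came back, without labelling any result as right or wrong. The two
parsers here (Python's urllib.parse.urlsplit and the RFC 3986 Appendix B
regular expression) are only stand-ins; the names PARSERS and survey are
hypothetical, and a real survey would cover many more implementations.

```python
import re
from urllib.parse import urlsplit

# Regular expression from RFC 3986, Appendix B.
APPENDIX_B = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def parse_with_regex(ref):
    """Split ref with the Appendix B regex; absent parts become ''."""
    m = APPENDIX_B.match(ref)
    return (m.group(2) or "", m.group(4) or "", m.group(5),
            m.group(7) or "", m.group(9) or "")


def parse_with_urlsplit(ref):
    """Split ref with the Python standard library."""
    s = urlsplit(ref)
    return (s.scheme, s.netloc, s.path, s.query, s.fragment)


# Hypothetical registry: parser name -> callable returning
# (scheme, authority, path, query, fragment).
PARSERS = {
    "appendix-b-regex": parse_with_regex,
    "urlsplit": parse_with_urlsplit,
}


def survey(inputs, parsers=PARSERS):
    """Record every parser's output for every input, without judging it."""
    rows = []
    for ref in inputs:
        results = {}
        for name, parse in parsers.items():
            try:
                results[name] = parse(ref)
            except Exception as exc:
                # An error is just another observed behavior.
                results[name] = "error: " + type(exc).__name__
        rows.append({
            "input": ref,
            "results": results,
            "agree": len(set(results.values())) == 1,
        })
    return rows


for row in survey(["http://example.com/a?b#c", "HTTP://example.com/", "a b"]):
    print(row["input"], row["agree"], row["results"])
```

The "agree" column only says whether the parsers returned the same thing;
deciding which behavior is desirable is left out on purpose.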

So here are my proposals for first deliverables:

1. URI/IRI API
2. Survey of implementations

Austin Wright.
Received on Friday, 3 October 2014 01:03:20 UTC