Re: Non-hierarchical base URLs (was Re: draft-abarth-url-01 uploaded) from Roy T. Fielding on 2011-04-28 (public-iri@w3.org from April 2011)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Thu, 28 Apr 2011 00:12:33 -0700
To: Adam Barth <ietf@adambarth.com>
Cc: public-iri@w3.org
Message-Id: <A246AD1A-42B9-4B32-A427-52E184259D2D@gbiv.com>

On Apr 27, 2011, at 11:29 PM, Adam Barth wrote:
> On Wed, Apr 27, 2011 at 10:12 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
>> However, for the subset of possible references that do happen
>> to match what are called valid URI references by RFC3986, then
>> we have already tested consensus and deployed many implementations
>> that conform exactly to the results given in RFC3986.  If you
>> find a difference between that and a single browser's behavior,
>> then that browser has a bug and should be fixed.
> 
> We've had implementors state on this list that they aren't going to do
> that.  In fact, they have stated that they're going to make their
> implementations less conformant with RFC 3986 because the requirements
> in RFC 3986 don't match the requirements they face in the real world.

If what they have is running code, then they are welcome to explain
why it is that their running code needs to differ from everyone else's.
What I have heard so far is just conjecture based on invalid test
cases, imagined implementations, and a few point examples wherein
a particular implementation chose to be broken for reasons unknown.

> We can ignore what these implementors want, but that just means
> they'll ignore us.

I do not ignore implementations.  RFC 3986 only defined what
*everyone* agreed to implement, including all of the major
browser vendors at the time it was written.  It has been my
experience so far that every case in which an implementation
differs from RFC 3986 in URI parsing or relative resolution
has been shown to be a bug in that implementation and fixed
by that implementor shortly after it was pointed out.  The
reason for such a fix is that 3986 actually does define an
interoperable subset of references, not that the IETF has
any real power to make them conform.
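
For illustration only, the kind of check implied here can be sketched
by running a resolver against a few of the normal examples listed in
RFC 3986, Section 5.4.1.  The sketch assumes Python's standard
urllib.parse.urljoin as the implementation under test; the names
BASE, EXPECTED and mismatches are hypothetical and not part of any
specification.

```python
from urllib.parse import urljoin

# Base URI used by the reference resolution examples in RFC 3986, Section 5.4.1.
BASE = "http://a/b/c/d;p?q"

# A handful of the "normal examples" from that section: reference -> target.
EXPECTED = {
    "g": "http://a/b/c/g",
    "./g": "http://a/b/c/g",
    "../g": "http://a/b/g",
    "../../g": "http://a/g",
    "//g": "http://g",
    "?y": "http://a/b/c/d;p?y",
    "#s": "http://a/b/c/d;p?q#s",
    "g;x": "http://a/b/c/g;x",
}


def mismatches(resolve=urljoin):
    """Return {reference: actual result} for every case where the
    resolver under test disagrees with the RFC 3986 example."""
    found = {}
    for ref, want in EXPECTED.items():
        got = resolve(BASE, ref)
        if got != want:
            found[ref] = got
    return found


if __name__ == "__main__":
    # An empty dict means the resolver agrees with all sampled examples.
    print(mismatches())
```

Any other resolver can be passed as the resolve argument, provided
it takes (base, reference) and returns a string.  Only valid
references are sampled, so the check says nothing about how invalid
input is handled.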

The issues that folks do have with 3986-based parsing and
resolution are all about the handling of invalid references
or the processing of non-URI characters and embedded whitespace.
Those were not defined by the standard because there was no
consistency in implementation whatsoever, and hence no consensus
upon which to declare a standard.  Instead, IRI was produced
as a separate proposal on the standards track to address those
issues, but somewhere along the line it changed from a
presentation/data-entry format for i18n resource references
into an entirely separate protocol element that no protocol
wants to implement.

If we could focus on the non-URI character-handling issues
for processing reference strings into URI references, avoid
redefining everything associated with identifiers just to
support a few non-interoperable edge cases, and fold in the
work on IDNA host processing, then I think we can make some
progress.
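
As a rough sketch of where that work would start, the function below
lists the characters in a reference string that fall outside the
character set RFC 3986 allows (unreserved, reserved, and "%").  It is
an assumption-laden illustration: the name non_uri_chars is
hypothetical, and the check looks only at individual characters, not
at whether the string matches the URI grammar.

```python
import string

# Characters RFC 3986 permits in a URI reference: unreserved characters,
# reserved characters (gen-delims and sub-delims), and "%" for
# percent-encoded octets.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@" + "!$&'()*+,;=")
URI_CHARS = UNRESERVED | RESERVED | {"%"}


def non_uri_chars(reference):
    """Return, in order, each character of the reference string that is
    not in the RFC 3986 character set (whitespace, non-ASCII, and so on)."""
    return [ch for ch in reference if ch not in URI_CHARS]


if __name__ == "__main__":
    for ref in ("http://a/b/c?q#s", "http://a/b c", "http://a/\u00e9"):
        print(repr(ref), non_uri_chars(ref))
```

A string for which this returns an empty list may still be an invalid
reference; what to do with the characters it does report is exactly
the open question.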

....Roy

Received on Thursday, 28 April 2011 07:14:44 UTC