Re: parsing URI (references) according to RFC 3986

From: Adam Barth <ietf@adambarth.com>
Date: Sun, 19 Jun 2011 17:10:17 -0700
Message-ID: <BANLkTi=S=wdAi4xFhFxAh2Y6ajwSyeP1hA@mail.gmail.com>
To: Chris Weber <chris@lookout.net>
Cc: Julian Reschke <julian.reschke@gmx.de>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>

On Sun, Jun 19, 2011 at 4:18 PM, Chris Weber <chris@lookout.net> wrote:
> On 6/18/2011 6:09 AM, Adam Barth wrote:
>> How does your implementation compare to existing browsers on this test
>> suite:
>>
>> http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/
>>
>> In particular, it would be helpful to add entries for your
>> implementation to the following table so that we can see whether it
>> makes desirable trade-offs in situations where browsers differ in
>> behavior:
>>
>> https://raw.github.com/abarth/url-spec/master/tests/gurl-results/by-browser.txt
>
> The WebKit test suite seems very valuable for its portability and black-box
> testing capability.  It does have some limitations, though, in that it only
> considers the DOM, and sometimes only certain properties of it.
>
> I still have a ways to go with my own test suite, but wanted to expand on
> some of the test results.  I've used some of your same test cases where I can.
>
> IE canonicalize('http://example.com\\foo\\bar') is
> 'http://example.com/foo/bar'
> KR canonicalize('http://example.com\\foo\\bar') is
> 'http://example.com/foo/bar'
> SA canonicalize('http://example.com\\foo\\bar') is
> 'http://example.com/foo/bar'
> FF canonicalize('http://example.com\\foo\\bar') should be
> http://example.com/foo/bar. Was http://example.com\foo\bar/.
>
> In the above test results, you're comparing against the .href property of
> the DOM element, which is fine and may be all you want.  It may be
> interesting to note some more detail here though.
>
> FF's hostname property for this test is "example.com\foo\bar".  Because that
> is an invalid hostname, FF fails to initiate an HTTP request for this URI and
> doesn't even try to make a DNS request (good).
>
> In a similar test case, "http://example.com/foo\bar", the DOM path property
> in both FF and Opera percent-encodes the "\", giving "/foo%5Cbar", and the
> corresponding HTTP request matches: "GET /foo%5Cbar HTTP/1.1".  IE, Chrome,
> and Safari instead convert the "\" to a "/".  Their DOM path property shows
> "/foo/bar" and the HTTP request matches: "GET /foo/bar HTTP/1.1".

Indeed.  The point is that IE, Chrome, and Safari treat \ as if it
were / in parsing URLs whereas Firefox does not.  I suspect we'll want
the spec to say that \ should be treated like / when parsing URLs.
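
To make the difference concrete, here is a rough sketch of the two
behaviors in Python.  It is only an illustration, not any browser's
actual parser: the function names are made up for this message, and
leaving everything after the first "?" or "#" untouched is my
assumption, since the results above only cover the authority and path.

```python
def _end_of_path(url: str) -> int:
    """Index of the first "?" or "#", or len(url) if there is none."""
    hits = [i for i in (url.find("?"), url.find("#")) if i != -1]
    return min(hits) if hits else len(url)


def slash_style(url: str) -> str:
    """Treat "\\" as "/" (the IE/Chrome/Safari results quoted above).

    Assumption: only the part before the query/fragment is rewritten.
    """
    cut = _end_of_path(url)
    return url[:cut].replace("\\", "/") + url[cut:]


def percent_style(url: str) -> str:
    """Percent-encode "\\" in the path as %5C (the FF/Opera path result).

    Assumption: the path starts at the first "/" after "://"; a "\\"
    before that point is left alone, so it stays in the hostname.
    """
    scheme, sep, rest = url.partition("://")
    authority, slash, tail = rest.partition("/")
    cut = _end_of_path(tail)
    path = tail[:cut].replace("\\", "%5C")
    return scheme + sep + authority + slash + path + tail[cut:]


if __name__ == "__main__":
    for u in ("http://example.com\\foo\\bar", "http://example.com/foo\\bar"):
        print(repr(u), "->", repr(slash_style(u)), "|", repr(percent_style(u)))
```

With slash_style, both test URLs from this thread come out as
http://example.com/foo/bar.  With percent_style, the second one keeps
its single path segment and the request path becomes /foo%5Cbar.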

Adam
Received on Monday, 20 June 2011 00:11:15 GMT
