Re: parsing URI (references) according to RFC 3986 from Chris Weber on 2011-06-20 (public-iri@w3.org from June 2011)

From: Chris Weber <chris@lookout.net>
Date: Mon, 20 Jun 2011 00:49:47 -0700
To: Adam Barth <ietf@adambarth.com>
CC: Boris Zbarsky <bzbarsky@mit.edu>, public-iri@w3.org
Message-ID: <4DFEFB9B.5060003@lookout.net>

On 6/20/2011 12:21 AM, Adam Barth wrote:
> On Mon, Jun 20, 2011 at 12:11 AM, Chris Weber <chris@lookout.net> wrote:
>> I can understand being liberal in accepting "|" characters in the path
>> segment, even though 3986 and 3987bis would have you percent-encode it to
>> "%7C".  But I didn't realize that IE and Chrome would actually perform a
>> transformation on the input in this way.
>
> I wouldn't worry about file URLs for a while.  They're vastly more
> complex than all the other kinds of URLs put together.  If we could
> get interoperability for even just http URLs, I'd be happy.
>
> Adam

I hear you :) I was also thinking of the general differences in how a
"|" in the path segment of an http URL gets handled.  Check out the DOM
parsing results of the following test case:

http://0152.iris.test.ing/foo|bar/

Path		Browser
/foo%7Cbar/	Chrome/12
foo%7Cbar/	MSIE 7.0
/foo|bar/	Opera/9.80
/foo|bar/	Safari/5.0.5
/foo|bar/	Firefox/4.0.1
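
For reference, here is a minimal Python sketch of the percent-encoding that 3986 would expect for this path. It is only an illustration, not what any of the browsers above actually run; the helper name rfc3986_path and the PCHAR_SAFE constant are my own, and the safe set is my reading of the "pchar" rule plus "/" and "%".

```python
from urllib.parse import quote, urlsplit

URL = "http://0152.iris.test.ing/foo|bar/"

# Assumption: sub-delims, ":" and "@" from RFC 3986 "pchar", plus "/" as the
# segment separator and "%" so existing escapes are not encoded twice.
# quote() already leaves letters, digits and "-._~" alone.
PCHAR_SAFE = "/%!$&'()*+,;=:@"


def rfc3986_path(url: str) -> str:
    """Return the path of `url` with disallowed characters percent-encoded."""
    raw_path = urlsplit(url).path  # urlsplit leaves "|" untouched
    return quote(raw_path, safe=PCHAR_SAFE)


if __name__ == "__main__":
    print(urlsplit(URL).path)   # path as written in the test case
    print(rfc3986_path(URL))    # percent-encoded form
```

Under those assumptions the "|" comes out as "%7C", which is the form Chrome reports.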

But the more interesting thing here is that the path in the raw HTTP 
request doesn't always match the DOM result:

Path		Browser
/foo%7Cbar/	Chrome/12
/foo%7Cbar/	MSIE 7.0
/foo|bar/	Opera/9.80
/foo%7Cbar/	Safari/5.0.5
/foo|bar/	Firefox/4.0.1

In this case Safari's DOM 'path' property differs from the 'path' in 
the raw HTTP request it generates to fetch the resource.  Who's doing 
the right thing here?
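
To make the comparison concrete, here is a small Python sketch that lines up the two tables above and lists the browsers whose DOM path differs from the request path. The values are copied from the tables; the names RESULTS and differing, and the ignore_leading_slash option, are only for illustration.

```python
# (DOM path, raw HTTP request path) per browser, copied from the tables above.
RESULTS = {
    "Chrome/12": ("/foo%7Cbar/", "/foo%7Cbar/"),
    "MSIE 7.0": ("foo%7Cbar/", "/foo%7Cbar/"),
    "Opera/9.80": ("/foo|bar/", "/foo|bar/"),
    "Safari/5.0.5": ("/foo|bar/", "/foo%7Cbar/"),
    "Firefox/4.0.1": ("/foo|bar/", "/foo|bar/"),
}


def differing(results, ignore_leading_slash=False):
    """Return the browsers whose DOM path is not equal to the request path."""
    out = []
    for browser, (dom_path, request_path) in results.items():
        if ignore_leading_slash:
            # Hypothetical normalization: drop leading "/" on both sides.
            dom_path = dom_path.lstrip("/")
            request_path = request_path.lstrip("/")
        if dom_path != request_path:
            out.append(browser)
    return out


if __name__ == "__main__":
    print(differing(RESULTS))
    print(differing(RESULTS, ignore_leading_slash=True))
```

A strict comparison also flags MSIE 7.0, but only because its DOM value has no leading "/"; ignoring that leaves Safari as the one browser that changes the "|" between the DOM and the wire.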

-Chris

Received on Monday, 20 June 2011 07:50:20 UTC