Re: Soliciting feedback on draft-abarth-url

On 4/20/11 4:40 AM, Julian Reschke wrote:
> On 20.04.2011 07:56, Maciej Stachowiak wrote:
>> (1) Parsing into components as exposed by the <a> element and the
>> Location object, among other things.
>> (2) Resolving a possibly-relative reference, relative to a base URL.
>>
>> It's theoretically possible that (2) can be described partly using a
>> component splitting algorithm that is inconsistent with (1). I don't
>> believe this is known to be the case for any existing browser.
>
> OK, let's have a look at the FF4 outcome for the very first test in
> <http://trac.webkit.org/export/HEAD/trunk/LayoutTests/fast/url/segments.html>:
>
> FAIL segments('http://user:pass@foo:21/bar;par?b#c') should be
> ["http:","foo","21","/bar;par","?b","#c"]. Was
> ["http:","foo","21","/bar","?b","#c"].
>
> So apparently FF doesn't report ";par" as part of the path component.

Indeed.  Per RFC 1808 (which is what Gecko implements, pretty much) the 
";par" is not part of the path.  Section 2.1 says:

     <scheme>://<net_loc>/<path>;<params>?<query>#<fragment>

RFC 2396 changes this, moving params into individual path components so 
that they are part of the path.
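
To make the two models concrete, here is a rough sketch using Python's 
urllib.parse, which happens to expose both views: urlparse() splits out a 
separate params field (and only from the last path segment), while 
urlsplit() leaves ";par" inside the path.  This illustrates the two 
splittings only; it is not a model of what Gecko or WebKit actually do 
internally.

```python
from urllib.parse import urlparse, urlsplit

URL = "http://user:pass@foo:21/bar;par?b#c"

# Six-part view in the style of RFC 1808: params are split off the path.
old_style = urlparse(URL)

# Five-part view in the spirit of RFC 2396: ";par" stays inside the path.
new_style = urlsplit(URL)

print("RFC 1808 style:", old_style.path, old_style.params, old_style.query)
print("RFC 2396 style:", new_style.path, new_style.query)
```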

> But we *do* know that it processes it correctly when following a link.

Sure; when following a link what's sent to the server is the 
concatenation of the path, params and query in the RFC 1808 case and the 
concatenation of the path and query in the RFC 2396 case.
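
As a minimal sketch of that concatenation, again leaning on Python's 
urllib.parse as a stand-in for the two parsing models (the function names 
target_rfc1808 and target_rfc2396 are made up for illustration, and empty 
but present components such as a trailing ";" are not preserved):

```python
from urllib.parse import urlparse, urlsplit

def target_rfc1808(url: str) -> str:
    """Build the request target from path + params + query (six-part parse)."""
    parts = urlparse(url)
    target = parts.path or "/"
    if parts.params:
        target += ";" + parts.params
    if parts.query:
        target += "?" + parts.query
    return target

def target_rfc2396(url: str) -> str:
    """Build the request target from path + query (params live in the path)."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target

URL = "http://user:pass@foo:21/bar;par?b#c"
print(target_rfc1808(URL))
print(target_rfc2396(URL))
```

The two functions disagree about where ";par" lives in the parsed form, 
but they produce the same string for the request line.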

(More precisely, RFC 2616 calls for the thing in the GET request line to 
be either an absoluteURI, which would include the query, or an abs_path 
in RFC 2396 terms or the terms of RFC 2616 section 3.2.2, which would 
not.  This technically means that if you don't send the scheme+host you 
must not send the query either.  But I don't think anyone takes that 
part of RFC 2616 seriously; the query is sent as part of the request 
line in all the UAs I know of even when the scheme+host is not, and the 
web sort of depends on it.  httpbis part 1 section 4.1 is a lot more 
sane here.)

> So apparently, the DOM API shows a behavior that doesn't apply to that
> URI's handling in general.

The way I would put it is that the DOM APIs show information about a 
parsed representation that is different in different browsers, but the 
serialization algorithms are also different so the final serialized form 
is the same in this case.

-Boris

Received on Wednesday, 20 April 2011 15:09:56 UTC