Re: Progress on URL spec

On Sep 3, 2010, at 9:21 PM, Adam Barth wrote:

> 
> 
> At the URL below, you can find a snapshot of the document.  I believe
> this document accurately describes how browsers parse "hierarchal"
> URLs, such as those with the http, https, and ftp schemes:
> 
> http://github.com/abarth/url-spec/raw/830fe35e0db8db30b5bd43a24a802ab3f4eec8b6/drafts/url.txt
> 
> If you believe the document is inaccurate, your feedback will be more
> influential if you provide an example URL and an example browser which
> you believe behaves differently than what the document describes.
> Also helpful are pointers to test suites that I can run on various
> browsers to learn about their behavior.

It's hard to tell if the document is inaccurate by that standard because:

A) "Parsing" an arbitrary string is not an observable facet of the Web platform. The only parse operation that's actually exposed is the DOM API on the Location object and <a> elements, which implicitly operates on a string that has already been resolved and canonicalized. This algorithm seems to describe an invisible parse step that happens before URLs are resolved and canonicalized (given that it handles some invalid inputs that the resolve+canonicalize operation would already have cleaned up).

As a result, I don't see how to make tests that would determine if browsers match the behavior of the algorithm.

B) In cases where browsers have different behavior, there's no documentation of why one or the other was chosen.
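
To make point A concrete, here is a minimal sketch in TypeScript. It assumes the standard URL constructor as a stand-in for the component accessors on the Location object and <a> elements; it is not the algorithm in the draft, and observeComponents is a hypothetical helper name. The point is only that script gets to read components of the resolved, canonicalized string, never of the raw input.

```typescript
// Shape of what script can observe: components of the RESOLVED URL only.
interface ObservedUrl {
  href: string;
  protocol: string;
  hostname: string;
  port: string;
  pathname: string;
  search: string;
  hash: string;
}

// Hypothetical helper: resolve `input` against `base` (as a browser would
// before exposing anything), then read the components back.
// Assumption: the URL constructor stands in for <a>.href and Location.
function observeComponents(input: string, base: string): ObservedUrl {
  const resolved = new URL(input, base); // resolve + canonicalize in one step
  return {
    href: resolved.href,
    protocol: resolved.protocol,
    hostname: resolved.hostname,
    port: resolved.port,
    pathname: resolved.pathname,
    search: resolved.search,
    hash: resolved.hash,
  };
}

// Input with dot segments, against a base with an uppercase scheme and host
// and an explicit default port.
const seen = observeComponents("../b/./c?x=1#f", "HTTP://Example.COM:80/a/d/e");
console.log(seen.href);     // "http://example.com/a/b/c?x=1#f"
console.log(seen.pathname); // "/a/b/c"
console.log(seen.port);     // "" (default port is dropped)
```

Nothing in the returned components reveals that the input contained "..", ".", an uppercase host, or an explicit default port, which is why a parse step that runs before resolution is hard to test from script.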

Perhaps it would be more fruitful to review once there is enough here to relate to an observable behavior of the Web platform.

Regards,
Maciej

Received on Sunday, 5 September 2010 01:20:15 UTC