Progress on URL spec

Hi IRI folks,

Peter is encouraging me to coordinate my URL work with this working
group.  I'm a bit skeptical, but I'm willing to give it a try.
Currently, the document I'm editing is available on GitHub.  If
coordination with this working group seems to be going well, I'll move
it to an Internet-Draft.

As background, my goal with the work is to produce a precise
specification that describes how browsers ought to process URLs they
find in HTML documents.  In particular, the document will describe how
to parse an absolute URL and how to resolve a string relative to a
base URL, including canonicalization.
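
For readers less familiar with the two operations, here is a minimal
sketch using Python's standard urllib.parse module.  It is an
illustration only: urllib.parse follows RFC-style rules rather than
browser behavior, so it is not what the document specifies, and the
example.com URLs are placeholders chosen for this sketch.

```python
from urllib.parse import urljoin, urlsplit

# Placeholder base URL, assumed only for this illustration.
base = "http://example.com/a/b/c?q=1"

# Operation 1: parse an absolute URL into its components.
parts = urlsplit(base)
scheme, host, path, query = parts.scheme, parts.netloc, parts.path, parts.query

# Operation 2: resolve strings relative to the base URL.
references = ["d", "../d", "/d", "//other.example/d", "?x=2"]
resolved = {ref: urljoin(base, ref) for ref in references}

for ref, absolute in resolved.items():
    print(f"{ref!r:22} -> {absolute}")
```

Browsers may well disagree with this output for unusual inputs, which
is exactly the kind of difference the document is meant to capture.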

The way browsers process URLs is largely constrained by compatibility
with existing web content.  You might find some of the things they do
gross and disgusting, but editorializing about the relative merits of
that behavior is not particularly helpful at this time.

At the URL below, you can find a snapshot of the document.  I believe
this document accurately describes how browsers parse "hierarchical"
URLs, such as those with the http, https, and ftp schemes:

http://github.com/abarth/url-spec/raw/830fe35e0db8db30b5bd43a24a802ab3f4eec8b6/drafts/url.txt

If you believe the document is inaccurate, your feedback will be more
influential if you provide an example URL and an example browser that
you believe behaves differently from what the document describes.
Also helpful are pointers to test suites that I can run on various
browsers to learn about their behavior.

At this point, I'm not accepting editorial feedback on this document.
There's a mountain of editorial work to do, but I'd like to get the
nuts and bolts down first.  In particular, discussion of whether to
present the requirements in terms of an algorithm or a set of
declarative rules is not particularly helpful at this time.

Kind regards,
Adam

Received on Saturday, 4 September 2010 04:22:33 UTC