Re: URL comparison

On Thu, Apr 25, 2013 at 4:34 AM, Anne van Kesteren <annevk@annevk.nl> wrote:
> Background reading: http://dev.w3.org/csswg/selectors/#local-pseudo
> and http://url.spec.whatwg.org/
>
> :local-link() seems like a special case API for doing URL comparison
> within the context of selectors. It seems like a great feature, but
> I'd like it if we could agree on common comparison rules so that when
> we eventually introduce the JavaScript equivalent they're not wildly
> divergent.

My plan is to lean *entirely* on your URL spec for all parsing,
terminology, and equality notions.  The faster you can get these
things written, the faster I can edit Selectors to depend on them. ^_^

> Requests I've heard before I looked at :local-link():
>
> * Simple equality
> * Ignore fragment
> * Ignore fragment and query
> * Compare query, but ignore order (e.g. ?x&y will be identical to
> ?y&x, which is normally not the case)
> * Origin equality (ignores username/password/path/query/fragment)
> * Further normalization (browsers don't normalize as much as they
> could during parsing, but maybe this should be an operation to modify
> the URL object rather than a comparison option)
>
> :local-link() seems to ask for: Ignore fragment and query and only
> look at a subset of path segments. However, :local-link() also ignores
> port/scheme which is not typical. We try to keep everything
> origin-scoped (ignoring username/password probably makes sense).

Yes.

> Furthermore, :local-link() ignores a final empty path segment, which
> seems to mimic some popular server architectures (although those
> ignore most empty path segments, not just the final), but does not
> match URL architecture.

Yeah, upon further discussion with you and Simon, I agree we shouldn't
do this.  The big convincers for me were Simon pointing out that /foo
and /foo/ behave differently wrt relative links, and you pointing out
that the URL spec still makes example.com and example.com/ identical.
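
To make both points concrete, here's a quick sketch.  It assumes a
URL constructor along the lines of the one the URL spec describes
(the kind of thing you can run in Node or a browser console); the
variable names are just for illustration.

```javascript
// Relative resolution depends on whether the base ends in a slash.
const fromFoo = new URL("bar", "http://example.com/foo").href;
const fromFooSlash = new URL("bar", "http://example.com/foo/").href;
console.log(fromFoo);      // http://example.com/bar
console.log(fromFooSlash); // http://example.com/foo/bar

// An empty path on a host serializes the same as a lone "/".
const bare = new URL("http://example.com").href;
const slashed = new URL("http://example.com/").href;
console.log(bare === slashed); // true
```

So treating /foo and /foo/ as the same thing would paper over a real
difference, while example.com vs example.com/ is already handled by
parsing.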

> For JavaScript I think the basic API will have to be something like:
>
> url.equals(url2, {query:"ignore-order"})
> url.equals(url2, {query:"ignore-order", upto:"fragment"}) // ignores fragment
> url.equals(url2, {upto:"path"}) // compares everything before path,
>                                 // including username/password
> url.origin == url2.origin // ignores username/password
> url.equals(url2, {pathSegments:2}) // implies ignoring query/fragment
>
> or some such. Better ideas more than welcome.

Looks pretty reasonable.  The only problem I have is that your "upto"
key implies a fixed ordering of the URL components, and there are
times I would want to ignore a component in the middle while still
comparing the ones after it.

For example, sometimes the query is just used for incidental
information, and changing it doesn't actually result in a "different
page".  So, you'd like to ignore it when comparing, but pay attention
to everything else.

So, perhaps in addition to "upto", an "ignore" key that takes a string
or array of strings naming components that should be ignored?
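
Roughly what I have in mind, as a sketch only.  The urlEquals()
function, the COMPONENTS table, and the component names ("scheme",
"userinfo", etc.) are all made up here for illustration; none of this
is in the URL spec.  It assumes a URL object exposing protocol,
username, password, hostname, port, pathname, search, and hash.

```javascript
// Hypothetical component names, each mapped to URL object properties.
const COMPONENTS = {
  scheme: ["protocol"],
  userinfo: ["username", "password"],
  host: ["hostname"],
  port: ["port"],
  path: ["pathname"],
  query: ["search"],
  fragment: ["hash"],
};

// Sketch of equals() with an "ignore" option that accepts either a
// single component name or an array of names.
function urlEquals(a, b, { ignore = [] } = {}) {
  const ignored = new Set([].concat(ignore)); // normalize string -> array
  const urlA = new URL(a);
  const urlB = new URL(b);
  return Object.entries(COMPONENTS)
    .filter(([name]) => !ignored.has(name))
    .every(([, props]) => props.every((p) => urlA[p] === urlB[p]));
}

const page = "http://example.com/article/42?ref=home#top";
const same = "http://example.com/article/42?ref=feed#top";
console.log(urlEquals(page, same));                      // false
console.log(urlEquals(page, same, { ignore: "query" })); // true
```

Note that the fragment is still compared in the second call, which is
exactly the out-of-order case that "upto" can't express.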

This way, :local-link(n) would be equivalent to:
linkurl.equals(docurl, {pathSegments:n, ignore:"userinfo"})

:local-link would be equivalent to:
linkurl.equals(docurl, {upto:"fragment"})  (Or {ignore:"fragment"})
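
For the :local-link(n) case, the comparison I'm picturing looks
something like the sketch below.  localLinkMatches() is a made-up
helper name, and the details are my assumptions: it keeps the
comparison origin-scoped, doesn't drop a final empty path segment,
and relies on a URL object with origin and pathname.

```javascript
// Sketch of :local-link(n): same origin, and the first n path segments
// of the link match those of the document.  Userinfo, query, and
// fragment are never consulted.
function localLinkMatches(linkHref, docHref, n) {
  const link = new URL(linkHref);
  const doc = new URL(docHref);
  if (link.origin !== doc.origin) return false;

  // "/2013/04/post" -> ["2013", "04", "post"]; "/foo/" -> ["foo", ""]
  const linkSegs = link.pathname.split("/").slice(1);
  const docSegs = doc.pathname.split("/").slice(1);
  if (linkSegs.length < n || docSegs.length < n) return false;

  return linkSegs.slice(0, n).every((seg, i) => seg === docSegs[i]);
}

const doc = "http://example.com/2013/04/post";
const link = "http://user:pw@example.com/2013/05/other?x=1#y";
console.log(localLinkMatches(link, doc, 1)); // true
console.log(localLinkMatches(link, doc, 2)); // false
```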

~TJ

Received on Thursday, 25 April 2013 17:38:53 UTC