Re: When are two URIs equivalent?

> Char-by-char equivalence is too weak for URIs.
> RFC 2396 resolution tells us how to convert relative URIs to absolute,
> which can then be compared char-by-char.

You're confusing URIs and URI References.

Absolutizing converts a URI Reference into a URI.
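
To make the distinction concrete, here is a minimal sketch of absolutizing using Python's standard urllib.parse.urljoin. The base URI and the relative reference are made-up values for illustration, not anything from the spec.

```python
from urllib.parse import urljoin

# Hypothetical base URI and relative URI Reference, for illustration only.
base = "http://example.org/a/b/c"
reference = "../d"

# Resolving the reference against the base yields an absolute URI,
# which is the thing you would then compare char-by-char.
absolute = urljoin(base, reference)
print(absolute)
```

Note that the reference "../d" has no meaning on its own; it only becomes a URI once a base is supplied.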

Char-by-char is fine for URIs, _if_ you ignore embedded-relative and
character-escaping issues.

If you want to deal with those additional points, you need to canonicalize
the URI. This starts to go beyond the definition of URIs: the URI spec
mentions canonicalization but says the process is specific to each URI
scheme. Since there's no bound on how many schemes can be invented,
canonicalization is generally handled on the server side. As far as I know,
there's no way to ask a server how it would canonicalize a URI, even if you
are willing to do a network transaction.
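
As a rough illustration of what a client-side canonicalizer might look like for one scheme, here is a sketch that assumes http-style hierarchical URIs. The function names and the choice of steps (lowercase the scheme and host, decode escapes of a conservative set of unreserved characters, uppercase the remaining escapes, remove "." and ".." segments) are my assumptions, not something the URI spec mandates for every scheme, and a real server may apply different rules.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: a conservative set of characters that never need escaping.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def _normalize_escape(match):
    """Decode %XX if it encodes an unreserved character, else uppercase it."""
    ch = chr(int(match.group(1), 16))
    return ch if ch in UNRESERVED else "%" + match.group(1).upper()


def remove_dot_segments(path):
    """Drop '.' segments and resolve '..' segments in a slash-separated path."""
    segments = path.split("/")
    out = []
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            if len(out) > 1:  # never pop the leading empty (root) segment
                out.pop()
            continue
        out.append(seg)
    if segments[-1] in (".", ".."):
        out.append("")  # keep the trailing slash, e.g. "/a/b/.." -> "/a/"
    return "/".join(out)


def canonicalize_http(uri):
    """Hypothetical canonical form for http-style URIs (a sketch only)."""
    parts = urlsplit(uri)
    # Only the host is case-insensitive; leave any userinfo untouched.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    path = re.sub(r"%([0-9A-Fa-f]{2})", _normalize_escape, parts.path)
    path = remove_dot_segments(path)
    return urlunsplit(
        (parts.scheme.lower(), netloc, path, parts.query, parts.fragment)
    )


a = "HTTP://Example.ORG/a/./b/../%7Euser/%3f"
b = "http://example.org/a/~user/%3F"
print(a == b)                                        # raw char-by-char
print(canonicalize_http(a) == canonicalize_http(b))  # after canonicalizing
```

The two strings differ char-by-char but compare equal after this particular canonicalization. Whether a given server actually treats them as the same resource is a separate question that this sketch cannot answer.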

The question of how the server maps canonicalized URIs into responses is
yet another layer of interpretation, of course. But that really is beyond
the scope of the URI spec.

Joe Kesselman / IBM Research

Received on Tuesday, 23 May 2000 11:29:34 UTC