Comparing URI references as strings

> If you don't agree that there is a problem with allowing relative
> URIs but comparing them either as strings...
> please continue this under a different subject line!


Consider two HTML documents (well formed, so they can be read as XML)
both containing <a href="foo.html">...</a>

An SGML parser such as nsgmls will return the same result for
<a href="foo.html">...</a>
irrespective of the base URI.

The XSL test

<xsl:if
test="document('http://www.example.com/x/document1.html')//a/@href
       =
      document('http://www.example.com/y/document2.html')//a/@href">

will compare the href attributes as strings, returning true even
though following the links will retrieve different resources.
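
To make the difference concrete, here is a minimal sketch in Python
rather than XSL. It assumes that urllib.parse.urljoin is an acceptable
stand-in for the resolution algorithm in the RFC; the base URIs and
foo.html are the ones from the test above, and the variable names are
only illustrative.

```python
from urllib.parse import urljoin

# Base URIs of the two documents used in the XSL test above.
base1 = "http://www.example.com/x/document1.html"
base2 = "http://www.example.com/y/document2.html"

# The href attribute value as it appears in each document.
href1 = "foo.html"
href2 = "foo.html"

# Compared as strings, the two references are equal.
equal_as_strings = href1 == href2

# Resolved against their own base URIs, they identify different resources.
resolved1 = urljoin(base1, href1)
resolved2 = urljoin(base2, href2)
equal_when_resolved = resolved1 == resolved2

print(equal_as_strings, equal_when_resolved)
print(resolved1)
print(resolved2)
```

The string comparison only sees the attribute values; the base URI
enters the picture when something resolves the reference.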

Neither of these things is claimed to break web architecture.
But the behaviour here is exactly the same as the current behaviour
specified for namespace names.

URI references are _always_ compared as strings until such time as
an application decides to take the URI reference and the current base
URI and construct an absolute URI via the algorithm in the RFC.

If you use a namespace name to refer to a resource, it works exactly
the same way as HTML <a href="...">. That is, if the URI reference is
relative, the resource identified will depend on the base URI of the
document.

In both cases, if you need to ensure that two URI references that are
string equal refer to the same resource, then you had better arrange
that the references are made absolute with respect to the same base
URI. In the case of XSL this is particularly easy, as the document()
function takes a second argument that allows the effective base URI
to be controlled.

This comes down to Jonathan Marsh's suggestion:

If you want to retrieve the resource whose URI is used as the
namespace name in such a way that you always get the same resource
from the same namespace name, just do this in XSL or an equivalent
thing in whatever system you are using.


document(<namespace name>, document('http://www.w3.org/'))

If <namespace name> is an absolute URI then the second argument
has no effect.
If <namespace name> is relative then you get the resource identified
by making the namespace name absolute with that base.
(There may not be any resource at that absolute URI, but that is the
same situation as in the absolute case. Even if the namespace name
is absolute there is no guarantee that any resource may be accessed
using the URI.)
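
The same idea can be sketched outside XSL. The following Python
fragment is only an illustration: resolve_namespace_name and FIXED_BASE
are hypothetical names, the namespace names passed in are made up, and
urllib.parse.urljoin is assumed to approximate the resolution algorithm
in the RFC. It resolves every namespace name against one fixed base, so
an absolute name comes back unchanged and a relative name always maps to
the same absolute URI.

```python
from urllib.parse import urljoin

# Fixed base URI, playing the role of the second argument to document().
FIXED_BASE = "http://www.w3.org/"


def resolve_namespace_name(name: str, base: str = FIXED_BASE) -> str:
    """Make a namespace name absolute with respect to a fixed base URI."""
    return urljoin(base, name)


# An absolute namespace name is unaffected by the base.
absolute = resolve_namespace_name("http://www.example.com/ns")

# A relative namespace name is resolved against the fixed base,
# whatever the base URI of the document it appeared in.
relative = resolve_namespace_name("foo")

print(absolute)
print(relative)
```

Whether anything can actually be retrieved from the resulting URI is a
separate question, in both the relative and the absolute case.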

Thus the various examples posted of stylesheets where things are
claimed to go wrong with the current literal interpretation of
namespace names are in fact just examples of stylesheets that are
incorrect.

The only problem with using a relative URI is that the namespace
name is then clearly not globally unique, which is why the namespace
spec could probably suggest more strongly that you don't do this,
rather than just saying that "to serve its intended purpose" it should
be unique.

David

Received on Thursday, 1 June 2000 07:21:31 UTC