RE: Inclusions and other gotchas (was:Re: inclusion) from Larry Masinter on 2000-05-25 (xml-uri@w3.org from May 2000)

From: Larry Masinter <masinter@attlabs.att.com>
Date: Thu, 25 May 2000 16:37:13 -0700
To: <keshlam@us.ibm.com>, <xml-uri@w3.org>
Message-ID: <NDBBKEBDLFENBJCGFOIJOEDJCMAA.masinter@attlabs.att.com>

> In other words: as far as I can tell from a much-too-brief scan, the
> official test for "does URI1 equal URI2" only returns "yes" and "maybe" --
> there seems to be no official way to answer "definitely not" without
> attempting to dereference them, and even that doesn't seem to be completely
> reliable.

Let me suggest you refine your terminology, just to avoid some confusion
here. A "URI" is, after all, a resource identifier. The question is whether
two different resource identifiers identify the same resource. Asking
"does URI1 equal URI2" is a convenient shorthand, but the word 'equal'
makes it hard to discuss the relationship between different kinds
of equivalence relationships. So I'd suggest being a bit more careful
with the terminology.

For example, let's define a relationship 'sra' meaning "same resource as",
where sra(URI1, URI2) is true if the resources identified by them are the
same. We know that for absolute URI references, string-equal(URI1, URI2)
implies sra(URI1, URI2); for relative URI references, string-equal isn't
enough, unless the base is the same.
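
To illustrate the point about relative references, here is a rough Python
sketch. It assumes that resolving each reference against its base with the
standard library's urljoin is an acceptable stand-in for proper resolution;
the function name and the example.org URIs are made up for illustration.

```python
from urllib.parse import urljoin


def resolved_string_equal(ref1: str, base1: str, ref2: str, base2: str) -> bool:
    """Compare two URI references only after resolving each against its own base.

    Hypothetical helper: a True result suggests 'same resource as',
    a False result tells us nothing either way.
    """
    return urljoin(base1, ref1) == urljoin(base2, ref2)


# The same relative reference against the same base: string-equal once resolved.
same_base = resolved_string_equal(
    "schema.xml", "http://example.org/a/",
    "schema.xml", "http://example.org/a/",
)

# The same relative reference against different bases: the references are
# string-equal, but the resolved URIs are not.
different_base = resolved_string_equal(
    "schema.xml", "http://example.org/a/",
    "schema.xml", "http://example.org/b/",
)
```

So string equality of the unresolved references alone would have been
misleading in the second case.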

Unfortunately, there is no practical way of computing 'sra' precisely,
although there are many cases where true or false can be predicted,
e.g., sra(URI1, URI2) is true if they are both http URLs where
the host names differ only by case.

There are a few areas where we can predict that sra(URI1, URI2) is false,
but not many. HTTP URLs can be aliased, so two different http URLs may
still be sra. On the other hand, it would be hard to claim that
you could create a system where a 'mailto' URI was sra an 'http' URL.
Likewise, two data URLs whose data isn't equal after undoing the
content-encoding are not sra, etc.
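
As a rough sketch of what such a partial predictor might look like, the
Python below returns True, False, or None (for "unknown"). It only covers
the cases mentioned above (string-equal absolute URIs, http URLs differing
only in host case, and mailto versus http); the function names are
hypothetical, and it assumes both inputs are absolute URIs. Because
urlsplit lowercases the scheme, differences in scheme case are also
ignored here.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def _lower_host(uri: str) -> str:
    """Rebuild the URI with the host part lowercased (userinfo left untouched)."""
    parts = urlsplit(uri)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    return urlunsplit(parts._replace(netloc=userinfo + at + hostport.lower()))


def sra_estimate(uri1: str, uri2: str) -> Optional[bool]:
    """Approximate 'same resource as': True, False, or None when unknown."""
    if uri1 == uri2:
        return True  # string-equal absolute URIs identify the same resource
    scheme1, scheme2 = urlsplit(uri1).scheme, urlsplit(uri2).scheme
    if scheme1 == scheme2 == "http" and _lower_host(uri1) == _lower_host(uri2):
        return True  # http URLs whose host names differ only by case
    if {scheme1, scheme2} == {"mailto", "http"}:
        return False  # a mailto URI and an http URL are predicted not to be sra
    return None  # everything else: cannot be predicted here


host_case = sra_estimate("http://WWW.Example.org/ns", "http://www.example.org/ns")
cross_scheme = sra_estimate("mailto:someone@example.org", "http://example.org/")
path_case = sra_estimate("http://example.org/ns", "http://example.org/NS")
```

Note that the last case comes back as None rather than False: the two
http URLs might well be aliases of each other, and nothing in the strings
alone can rule that out.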

> [I think I'm finally starting to understand what Tim is driving at, in
> terms of the namespace declaration only declaring a _reference_ to a
> (family of) point(s) in URI-space, rather than stating the actual "name" of
> a specific Namespace. I'm still trying to convince myself that it's
> actually practical (as opposed to possible or even desirable) to define
> Namespaces that way. No answers yet, but some new questions...]

The point is to create a computationally practical way of predicting
whether a receiver 'understands' the XML document it's given by
examining the namespace it's sent. In practice, senders shouldn't
rely on receivers having an effective approximation to 'sra' other than
the one that returns 'true' when the URIs are string-equal and returns
'unknown' when they're not string equal.

If the question they're asking is "do I know this namespace?" and their
namespace equivalence function is returning "unknown", then the answer
to "do I know?" is "no".
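
A minimal Python sketch of that receiver-side logic, assuming the receiver
keeps a set of namespace names it understands. The names KNOWN_NAMESPACES,
namespace_equivalent, and knows_namespace are invented for illustration;
the XHTML namespace name is used only as a familiar example.

```python
from typing import Optional

# Namespace names this hypothetical receiver understands.
KNOWN_NAMESPACES = frozenset({"http://www.w3.org/1999/xhtml"})


def namespace_equivalent(uri1: str, uri2: str) -> Optional[bool]:
    """True when the URIs are string-equal, None ("unknown") otherwise."""
    return True if uri1 == uri2 else None


def knows_namespace(namespace_uri: str) -> bool:
    """'Do I know this namespace?' An 'unknown' comparison counts as 'no'."""
    return any(
        namespace_equivalent(namespace_uri, known) is True
        for known in KNOWN_NAMESPACES
    )


exact_match = knows_namespace("http://www.w3.org/1999/xhtml")
host_case_variant = knows_namespace("http://WWW.W3.ORG/1999/xhtml")
```

Here the second lookup answers "no" even though the host differs only by
case, which is exactly the behavior senders should expect from receivers.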

One practical way out is for those who make up namespace
names to avoid using sra-equivalent but non-string-equal URIs, and
to choose, as namespace names, URIs which will not be used by anyone
else. W3C recommendations that define namespaces typically contain
the URI used for the namespace name, and the form of the URI used
in the recommendation is the one that should be used by everyone,
even if there might be sra-equivalent forms.

Received on Thursday, 25 May 2000 19:51:13 UTC