Mistaken identity?

I'm seeking a little clarification of the nature of identity on the web, 
as there are several related issues involved. In short, here are three 
little puzzles:

1. Can a resource have multiple different representations of the same type?

2. If two resources have different sets of representations, can they 
ever be considered the same?

3. How can I ever assert:

<http://mydomain.org/a> owl:sameAs <http://yourdomain.org/a> .

when I can't ever be sure that you aren't going to change 
http://yourdomain.org/a ?

Ok, here is why I think these might be puzzles. Any statement involving 
URI(ref)s presupposes the coolness assumption: that the mapping between the 
resource and its identifier won't change. Generally this seems good 
enough in practice to be workable, particularly as the URI will only be 
used as a name. But again in practice it seems to me that there is 
potential for problems when we start talking in terms of two URIs 
identifying the same resource, as in the owl:sameAs above.

Fielding says of URIs [1]:
The naming authority that assigned the resource identifier, making it 
possible to reference the resource, is responsible for maintaining the 
semantic validity of the mapping over time (i.e., ensuring that the 
membership function does not change).

The OWL Reference says [2]:
The built-in OWL property owl:sameAs links an individual to an 
individual. Such an owl:sameAs statement indicates that two URI 
references actually refer to the same thing: the individuals have the 
same "identity".

So I can say, hand on heart:

<http://mydomain.org/a> owl:sameAs <http://mydomain.org/b> .

but unless I have total authority over you and your minions (or at least 
some permanent binding agreement), how can I say:

<http://mydomain.org/a> owl:sameAs <http://yourdomain.org/a> .
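
The difference between those two statements comes down to who controls each URI. A rough sketch in Python, assuming (crudely) that a matching host name stands in for "same naming authority"; the function name same_authority is made up for illustration and real authority is of course a social matter, not a string comparison:

```python
from urllib.parse import urlsplit


def same_authority(uri_a: str, uri_b: str) -> bool:
    """Crude proxy: two URIs share a naming authority if their hosts match."""
    host_a = urlsplit(uri_a).netloc.lower()
    host_b = urlsplit(uri_b).netloc.lower()
    return host_a == host_b


# I can vouch for both ends of the first pair, but only one end of the second.
within_my_domain = same_authority("http://mydomain.org/a", "http://mydomain.org/b")
across_domains = same_authority("http://mydomain.org/a", "http://yourdomain.org/a")
```

Under that assumption, within_my_domain is the case where I can keep the owl:sameAs true myself, and across_domains is the case where I depend on you.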

Ok, the same problem applies to practically anything I might want to say 
about URIs for which I don't have authority, and a bit of handwaving and 
the implication that it was true when I said it is enough to enable 
reasoning on names defined in different domains. However, I could be 
wrong, but it seems to me that equivalence/shared identity is a special 
case due to the way URIs work in practice, because of the notion of 
representations. If two URIs have the same identity, then surely they 
will have the same set of representations. Which, I would suggest, is 
likely to be unusual on the real web.
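
To make the strict reading concrete, here is a minimal sketch in Python. It assumes a resource's representations can be modelled as a mapping from media type to body, which is a simplification of my own; the names and sample bodies are invented and nothing here touches the network:

```python
from typing import Dict

# Hypothetical model: media type -> representation body.
Representations = Dict[str, str]


def same_representations(a: Representations, b: Representations) -> bool:
    """True only if both URIs offer identical bodies for identical media types."""
    return a == b


# Invented sample data: one URI serves HTML and RDF/XML, the other only HTML.
mine = {"text/html": "<p>hello</p>", "application/rdf+xml": "<rdf:RDF/>"}
yours = {"text/html": "<p>hello</p>"}

identical = same_representations(mine, mine)
differing = same_representations(mine, yours)
```

On this reading the two URIs fail the test as soon as one of them offers a media type the other lacks, even if the HTML is byte-for-byte the same, which is why I suspect the condition is rarely met on the real web.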

This has come up in the context of the Atom syndication language, where 
a suggested approach to dealing with multiple occurrences of the same 
entry across the web is to use a non-resolvable URI to identify items, 
with another URI used to provide a locator for representations of the 
items, i.e. a URN and a URL (a completely non-normative description of 
this approach can be found at [3]). It does seem to be bending the 
notion of URIs somewhat (as a workaround for uncoolness): it's saying 
that resource A has no representations except the representations of 
resource B. But perhaps there is some consistent way of expressing URI 
equivalence that can capture this.
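
As a rough illustration of the dual-identifier idea, here is a sketch in Python. The Entry class, the group_by_id helper and the urn:example:... identifiers are all made up for this example and are not taken from any Atom draft:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entry:
    entry_id: str  # non-resolvable identifier (a URN), used only as a name
    link: str      # resolvable locator (a URL) for a representation


def group_by_id(entries: List[Entry]) -> Dict[str, List[str]]:
    """Collect every known locator under the identifier it claims to carry."""
    grouped: Dict[str, List[str]] = {}
    for entry in entries:
        grouped.setdefault(entry.entry_id, []).append(entry.link)
    return grouped


# Hypothetical feed data: the same entry seen at two locations.
entries = [
    Entry("urn:example:entry-1", "http://mydomain.org/a"),
    Entry("urn:example:entry-1", "http://yourdomain.org/a"),
    Entry("urn:example:entry-2", "http://mydomain.org/b"),
]
locations = group_by_id(entries)
```

Here the URN does the identifying and the URLs only say where a representation might be fetched, so two copies are treated as the same entry without anyone asserting that the two URLs are equivalent.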

For a use case, consider a homepage at http://example.org/1994 - the 
original owner is long gone, and the URI no longer resolves to anything. But 
the material is still available at http://archive.org/example.org/1994.

As it happens, it's now looking like the dual URN+URL approach in Atom 
doesn't actually bring much benefit anyhow (e.g. threading 
between entries will have to refer back to the URN, so a reliable URL 
still needs to be discovered).


[2] http://www.w3.org/TR/2004/REC-owl-ref-20040210/#sameAs-def
[3] http://diveintomark.org/archives/2004/05/28/howto-atom-id



Received on Tuesday, 22 June 2004 07:22:55 UTC