Resource ambiguity [was Re: Can "http://danbri.org" and "http://danbri.org/" URIs represent different things?]

On Thu, 2009-07-02 at 15:50 -0500, Pat Hayes wrote:
> On Jul 2, 2009, at 3:42 PM, Alan Ruttenberg wrote:
> 
> > On Tue, Jun 30, 2009 at 8:54 PM, Dan Brickley <danbri@danbri.org> wrote:
> >> Hello TAG,
> >>
> >> Talking with some SW folk about OpenID, and whether my "me-the-person" URI
> >> could be practically usable as my OpenID, I came up with this corner-case:
> >>
> >> Could http://danbri.org be a URI for "me the person", and http://danbri.org/
> >> be a document about me (and also serve as my OpenID)?
> >>
> >> As I understand HTTP, any client must request something, so the former isn't
> >> directly de-referencable. The client has to decide to ask for / from
> >> danbri.org instead. But they're still different URIs, aren't they?
> >>
> >> Is...
> >>
> >> <Person  xmlns:foaf="http://xmlns.com/foaf/0.1/"
> >>         rdf:about="http://danbri.org">
> >>  <openid>
> >>    <Document rdf:about="http://danbri.org/"/>
> >>  </openid>
> >> </Person>
> >>
> >> ...at all feasible? I guess it depends on how exactly we think about the
> >> "add a / to the end" step...
> >
> >
> > From an RDF point of view, the fact that the URI strings are different
> > means that they can denote different things.
> >
> > I guess the question I have about this is: Why be so "clever"?
> 
> I think I can answer that. Because people are. In fact, people use the  
> same name for a person and the person's website and the person's name,  
> etc., often without even noticing that they are doing it, and  
> certainly without falling into instant incoherence or having their  
> brains catch fire. But our inference engines can't handle this kind of  
> ambiguity, at present. So it would be handy if a notational convention  
> could be adopted that allowed the dumb machinery to keep its prissy  
> distinctions distinct, while allowing human readers to be sloppy  
> without even noticing that they are being sloppy. This idea is an  
> elegant step in that direction, if it can be made to work.

I agree that a *clear* notational convention would be helpful.  But I
do *not* think that using subtly different URIs to distinguish between
Dan and his web page is a wise design choice.  It just invites
confusion and error.  The likely result is that *both* URIs would be
used for both purposes, without the intended distinction.  I think it
would be better to "ambiguously" use the same URI for both than to use
two URIs that differ so subtly that even HTTP cannot distinguish them:
a request for either one asks the same server for the same path, "/".
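
To make the HTTP point concrete, here is a minimal Python sketch.  It is
only an illustration: the helper name request_target is hypothetical, and
the rule it applies (an empty path is sent as "/") is my reading of the
HTTP/1.1 specification, not something specific to any one client.

```python
from urllib.parse import urlsplit


def request_target(uri):
    """Return the (host, path) pair a client would use for an http URI.

    Assumption for this sketch: an empty path is replaced by "/", because
    an HTTP request always has to name some path on the server.
    """
    parts = urlsplit(uri)
    path = parts.path or "/"
    return parts.netloc.lower(), path


person_uri = "http://danbri.org"     # intended to denote Dan, the person
document_uri = "http://danbri.org/"  # intended to denote Dan's web page

# As strings (which is what RDF compares) the two URIs differ ...
different_as_strings = person_uri != document_uri
# ... but they lead to the same request on the wire.
same_on_the_wire = request_target(person_uri) == request_target(document_uri)

print(different_as_strings, same_on_the_wire)
```

Under those assumptions the two URIs are distinct names that can only
ever be dereferenced in exactly the same way, which is why I expect them
to be used interchangeably in practice.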

The semantic web community needs to learn to deal with resource
ambiguity, and this is a good example.  The ambiguity that is created
when the same URI is used both to denote Dan Brickley the person and
Dan's web page is not fundamentally different from ambiguity that is
inescapable in the semantic web world at large.  (See Pat Hayes' "In
Defence of Ambiguity":
http://www.ibiblio.org/hhalpin/irw2006/presentations/HayesSlides.pdf )

The essential problem is that ambiguity is in the eye of the beholder.
Or perhaps I should say: ambiguity is in the *application* of the
beholder.  What one application views as a single resource having
multiple aspects -- and hence needing only a single URI to denote it --
another application requiring finer distinctions may view as multiple
resources, each deserving of its own URI.

This is exactly what happens when Mark Baker uses http://markbaker.ca/
to denote both himself and his blog.  Some applications will see no
ambiguity in such usage because they don't need to distinguish between
Mark and his blog.  Others will see this as an ambiguity that causes
problems.  And still others will recognize the ambiguity, but will be
able to distinguish between cases where the URI is used to denote the
person and those where it denotes the blog.  This process of "splitting"
the identity of an ambiguous resource is described in
http://dbooth.org/2007/splitting/
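
As a rough illustration of what such splitting might look like in code,
here is a small Python sketch.  It is one possible reading, not
necessarily the procedure given in the document cited above.  The two
replacement URIs, the sample triples and the choice of which predicates
count as "about a person" or "about a blog" are all hypothetical,
invented only for this example.

```python
AMBIGUOUS = "http://markbaker.ca/"
# Hypothetical replacement URIs, invented for this sketch.
PERSON = "http://example.org/split/person"
BLOG = "http://example.org/split/blog"

# Assumption: the predicate tells us which sense of the URI was meant.
PERSON_PREDICATES = {"foaf:birthday", "foaf:knows"}
BLOG_PREDICATES = {"dcterms:title", "dcterms:modified"}


def split_subject(triples):
    """Rewrite the ambiguous subject to a more specific URI where possible.

    Triples whose predicate gives no hint keep the ambiguous URI.
    """
    result = []
    for subject, predicate, obj in triples:
        if subject == AMBIGUOUS and predicate in PERSON_PREDICATES:
            subject = PERSON
        elif subject == AMBIGUOUS and predicate in BLOG_PREDICATES:
            subject = BLOG
        result.append((subject, predicate, obj))
    return result


# Made-up sample data, for illustration only.
sample = [
    (AMBIGUOUS, "foaf:knows", "http://example.org/someone"),
    (AMBIGUOUS, "dcterms:title", "An example blog title"),
    (AMBIGUOUS, "rdfs:seeAlso", "http://example.org/more"),
]

for triple in split_subject(sample):
    print(triple)
```

An application that never needs the distinction can skip this step
entirely; one that does can apply it before using the data.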

There is no escaping this problem.  No matter how fine the distinctions
or how carefully a resource is described, there will always be
applications that require finer distinctions.  The best we can do is ask
people to consider the future users of the URIs they mint, and try to
make choices that will best benefit the range of applications they wish
to support, minting distinct URIs if a single URI is likely to cause
confusion.

Finally, there is a tension between precision and reusability.  The more
precisely a resource is described -- the more tightly constrained it is
-- the less *reusable* it is.  For example, in figure 2 of
http://dbooth.org/2009/denotation/#rdfsem
a certain set of interpretations is possible.  If additional
constraints are added, this set of possible interpretations can only
shrink.  As two RDF graphs are merged, the resulting set of possible
interpretations is limited to the intersection of the sets of
interpretations possible for each graph individually.  If the
intersection is empty, the graphs are incompatible: they cannot be used
together without first "splitting" the ambiguous resource.
This issue is further described here:
http://lists.w3.org/Archives/Public/www-tag/2009Jun/0087.html
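
The set arithmetic can be sketched in a few lines of Python.  This is a
deliberately crude toy model, not real RDF semantics: an
"interpretation" is reduced to a single candidate referent for one URI,
and the candidates and constraints below are hypothetical.

```python
# Toy universe: the things one ambiguous URI might denote (an assumption).
CANDIDATES = {"person", "web_page", "blog"}


def interpretations(constraints):
    """Return the candidates that satisfy every constraint in the list."""
    return {c for c in CANDIDATES if all(allows(c) for allows in constraints)}


# Hypothetical constraints that a graph's statements might impose.
def has_birthday(candidate):
    return candidate == "person"


def has_content_type(candidate):
    return candidate in {"web_page", "blog"}


def has_title(candidate):
    return True  # in this toy model anything may have a title


graph_a = [has_birthday]
graph_b = [has_content_type]
graph_c = [has_title]

# Merging two graphs means satisfying the constraints of both, so the
# result is the intersection of the individual sets.
merged_ac = interpretations(graph_a + graph_c)
merged_ab = interpretations(graph_a + graph_b)

print(sorted(merged_ac))  # ['person']
print(sorted(merged_ab))  # []
```

In this sketch graphs A and C can be used together, while A and B leave
no interpretation at all and so would need the resource to be split
first.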

This does *not* mean that it is okay to be sloppy in our descriptions.
Rather, it means that we must accept the inherent limitations and
trade-offs involved when dealing with resource identity, that we should
not expect someone else's resource description to always match our own
needs, and that we should learn how to work around the ambiguity when we
still want to use their data.


-- 
David Booth, Ph.D.
Cleveland Clinic (contractor)

Opinions expressed herein are those of the author and do not necessarily
reflect those of Cleveland Clinic.

Received on Monday, 6 July 2009 01:52:04 UTC