URI Identity Crisis Redux [was: Re: Representing things RE: RDF and speech acts] from jon@hackcraft.net on 2003-11-26 (www-rdf-interest@w3.org from November 2003)

From: <jon@hackcraft.net>
Date: Wed, 26 Nov 2003 12:47:19 +0000
To: Charles McCathieNevile <charles@w3.org>
Cc: "www-rdf-interest@w3.org" <www-rdf-interest@w3.org>
Message-ID: <1069850839.3fc4a0d77acfd@82.195.128.192>
Quoting Charles McCathieNevile <charles@w3.org>:

> I hope this isn't much about angels dancing on pinheads.

Oh you can certainly get to there from here! I'm trying to avoid that though. 
(All I remember from studying philosophy is that tautology is good, all I 
remember from studying English is that tautology is bad; so much for a liberal 
arts education :)

> I realise it isn't a new topic. Maybe the Wiki is a better place for
> continuing?

I'd prefer to formally split the thread if you don't mind; I tend to forget 
about wikis I'm engaged in, whereas mails stay marked "unread" until I read or 
delete them.

> >> actually FOAF doesn't use the same mechanism as WordNet.
> >>
> >> My understanding of best practice is that a bare URI will often be
> >> understood to refer to the thing that gets returned - i.e. the page.
> >
> >That couldn't possibly work since the thing that gets returned (the page,
> >although some representations wouldn't be called "pages") depends on factors
> >other than the URI.
> 
> "The page" is indeed a slightly nebulous concept in a world of content
> negotiation and so on. The point is that overloading the possible things that
> are actually at http://example.com/apage.html (by using it in an http GET
> or POST) with what that content might be talking about is adding to the
> confusion, whereas the RDF spec clearly defines the meaning of
> http://www.example.com/someTerm#foo as being the thing identified by the RDF
> version of that page (OK so that isn't a watertight definition either, but it
> reduces the problem a bit).

Yes, but what is "the thing identified by the RDF version of that page"? Is it 
a page? Could it be something else?

> Otherwise we need to have some way of talking about "the resource at this
> URI", and since lots of RDF stuff is about Web resources, it seems like an
> unnecessary complication.

Well, is a resource "at" a URI or identified by it? Just at the level of normal 
dictionary definitions of the words the acronyms expand to, "at" seems more 
appropriate with "URL" than with "URI" (and completely inappropriate 
with "URN").

[snip]
> >There is nothing in this to say that the node has to be blank, the following
> >is valid:
> >
> ><rdf:RDF
> >  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> >  xmlns="http://xmlns.com/foaf/0.1/">
> >  <Person rdf:about="http://www.hackcraft.net/jon/">
> >    <mbox rdf:resource="mailto:jon@hackcraft.net"/>
> >  </Person>
> ></rdf:RDF>
> 
> The "spec" says
> 
>   The foaf:Person class represents people. Something is a  foaf:Person if it
>   is a person. We don't nitpic about whether they're  alive, dead, real, or
>   imaginary. The foaf:Person class is a sub-class of the  foaf:Agent class,
>   since all people are considered 'agents' in FOAF.
>   -- http://xmlns.com/foaf/0.1/#term_Person
> 
> I realise it is nitpicking, but since a URI is not generally understood as
> being a person, I think the example is in fact invalid FOAF.

I dispute "generally understood" here, though I wouldn't go so far as to claim 
it for my use of URIs either (though mine is by no means idiosyncratic). I also 
don't see how a node can be innately blank - any node can be given a URI.

> >This is true as long as <http://www.hackcraft.net/jon/> is a URI representing
> >the person that uses <mailto:jon@hackcraft.net> and nobody else uses it,
> >which is true. Further there is no way that the creators of FOAF can prohibit
> >me from using that URI to represent myself (it's my domain I can do what I
> >want) and they can't prohibit my using it with FOAF except through defining a
> >document format that is not fully-featured RDF, and even then they can't stop
> >me using their vocabulary in a different document format.
> 
> Actually, there is nothing to stop me using the same URI to represent a cat.
> Insofar as a URI is an opaque string, it seems hard to claim ownership of it
> - unlike when it is treated as a real thing, representing precisely
> "whatever is returned by treating it as a thing to GET via HTTP".

Yes, authority and URIs are another tricky matter. Even if it's resolved there 
would still be nothing to stop you using the URI to represent a cat, it would 
just mean that either
a. You are wrong.
b. I am wrong.
c. We are both wrong.
d. I am a cat. (On the Internet nobody knows you're a dog.)

However I would question the value of not allowing you to claim I was a cat - 
philosophically you have every right to claim I'm a cat (so tempting to 
paraphrase Voltaire here); in practice disallowing valid but false statements 
can easily have edge cases where one finds oneself unable to state a fact.

> Best practice seems to me to involve large chunks of figuring out ways to
> help people use less ambiguous ways to describe things...
> 
> >But the page you get from dereferencing <http://xmlns.com/wordnet/1.6/Love-4>
> >says that:
> >
> ><http://xmlns.com/wordnet/1.6/Love-4>
> >  <http://www.w3.org/2000/01/rdf-schema#description>
> >  "a deep feeling of sexual desire and attraction; \"their love left them indifferent to their surroundings\"; \"she was his first love\"" .
> ><http://xmlns.com/wordnet/1.6/Love-4>
> >  <http://www.w3.org/2000/01/rdf-schema#subClassOf>
> >  <http://xmlns.com/wordnet/1.6/Sexual_desire> .
> >
> >That is clearly a poor description of the page. It sounds closer to a
> >description of erotic love to me.
> 
> I think you are saying:
> 
> <Page rdf:about="http://xmlns.com/wordnet/1.6/Love-4">
>   <describes>
>     love*
>   </...>

No, the page at <http://xmlns.com/wordnet/1.6/Love-4> is stating, quite 
unambiguously, that <http://xmlns.com/wordnet/1.6/Love-4> is called "Love [4]", 
is a subclass of whatever <http://xmlns.com/wordnet/1.6/Sexual_desire> is (it 
goes on to give us some information about that) and can be described as:

a deep feeling of sexual desire and attraction; "their love left them 
indifferent to their surroundings"; "she was his first love"

The page isn't playing ball with your use of URIs, but with mine. Actually the 
author of that page, or at least of the software that produced it, is on this 
list. Dan, do you have any comment on this?

> - it is a property of the page that its content includes a description of
> what love is.
> 
> It makes more sense to me that we can say
> 
> <http://xmlns.com/wordnet/1.6/Love-4> <describes> "love".
> 
> than
> 
> <Page>
>   <availableAt http:URI="http://xmlns.com/wordnet/1.6/Love-4">
>   <describes>
>     http://xmlns.com/wordnet/1.6/Love-4
>   </...>
> 

Leaving philosophical views aside (we've stated our positions) there are three 
practical issues I have with this.

The first is that pages are, for the most part, not what we want to talk about 
(in life, if not currently on the web); they are tools we use to talk about 
something else. If one of the two has to be long-winded, I would rather it were 
talking about a page that describes something than talking about the thing it 
describes (especially since there is nothing in my way of looking at this to 
prevent one putting a given page on the same level as everything else - so 
talking about a page need not be convoluted at all). I think this is going to 
become increasingly true as we progress: we are going to move away from RDF 
"metadata" about pre-existing "data" and towards data that may or may not 
be "meta-".
Even in the case where "pages" are our primary area of interest we would not 
want a convoluted way to refer to those non-pages (people, organisations, 
physical locations, subject matter) that have a relationship with the page 
(authorship, publication, etc.).

The second is that forcing nodes to be blank, as you are doing with any node 
that is the concept of love, makes recognising identity harder. FOAF was lucky 
in that it could make use of the common existence of single-user mailboxes 
amongst people who are likely to want to use FOAF to make it reasonably easy to 
fold in those blank nodes that are actually the same node. This is going to be 
much trickier with the concept of love (arguably that should be the case with 
love, but if I'm trying to avoid philosophy about URIs I'm definitely going to 
avoid philosophy concerning love).
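
To illustrate the folding-in of blank nodes mentioned above, here is a minimal 
sketch in Python. It is not how any particular FOAF tool does it. Triples are 
plain tuples, "_:a" and "_:b" are made-up blank node labels, the helper name 
`smush` is an assumption, and the sketch assumes each node carries at most one 
mailbox.

```python
# Minimal sketch of merging ("smushing") blank nodes that share a mailbox.
# Triples are plain (subject, predicate, object) tuples; "_:x" marks a blank node.
# The sample data and the helper name `smush` are illustrative assumptions.

FOAF = "http://xmlns.com/foaf/0.1/"
MBOX = FOAF + "mbox"


def smush(triples, key=MBOX):
    """Rewrite triples so nodes sharing a value for `key` become one node.

    Assumes each node has at most one value for `key`; a fuller version
    would need union-find to handle chains of shared values.
    """
    first_seen = {}  # key value -> first node seen with that value
    rename = {}      # node -> node that replaces it
    for s, p, o in triples:
        if p == key:
            rename[s] = first_seen.setdefault(o, s)
    return {(rename.get(s, s), p, rename.get(o, o)) for s, p, o in triples}


triples = [
    ("_:a", MBOX, "mailto:jon@hackcraft.net"),
    ("_:a", FOAF + "name", "Jon Hanna"),
    ("_:b", MBOX, "mailto:jon@hackcraft.net"),
    ("_:b", FOAF + "homepage", "http://www.hackcraft.net/"),
]

merged = smush(triples)
for triple in sorted(merged):
    print(triple)
```

The merge only works because the mailbox acts as a key that identifies one 
person. Nothing comparable is available for a blank node standing for a 
concept, which is the difficulty described above.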

The third is that even if we take URIs as identifying pages we are still stuck 
with the issues related to the theoretically limitless (and in practice often 
more than one, which is enough to raise practical issues) variety of pages we 
might get, even with just using GET. We are forced towards an "abstract page" 
notion, where we can only reasonably talk about the lowest common denominator 
of these pages. What can we rely on being common to these pages except what 
they represent? It's this final practical point that led to the views I have.
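
As a rough illustration of the point about GET, the following toy model in 
Python shows one URI yielding different "pages" depending on what the client 
asks for. No real server or HTTP library is involved; the table contents and 
the function name `get` are assumptions made up for the sketch.

```python
# Toy model of content negotiation: one URI, several representations.
# The table contents and the function name `get` are illustrative assumptions;
# no real server or HTTP library is involved.

URI = "http://example.com/apage.html"

REPRESENTATIONS = {
    URI: {
        "text/html": "<html><body><p>Love, sense 4</p></body></html>",
        "text/plain": "Love, sense 4",
    }
}


def get(uri, accept):
    """Return (media_type, body) for the first media type in `accept`
    that the server holds for `uri`, roughly as a naive GET would."""
    available = REPRESENTATIONS[uri]
    for media_type in accept:
        if media_type in available:
            return media_type, available[media_type]
    raise LookupError("406 Not Acceptable")


html = get(URI, ["text/html", "text/plain"])
text = get(URI, ["text/plain"])
print(html != text)
```

The two results share neither media type nor body, so a statement about "the 
page at this URI" has to be about something other than either of them.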

> Which is where I get my understanding of "best practice". But it is a
> convention - if we all do something the same, it works.

Too late! There is no agreed convention; there is no best practice. There is a 
debate, and this thread is only a small part of it.

> >The indirection is needed not if you want to identify love, but if you want to
> >identify the page (about which we receive no triples at all). Which reminds me
> >to put looking at <http://www.hackcraft.net/rep/rep.html> more seriously back
> >onto my to do list. Anyone want to help?
> 
> yep. I'd love to help.
> 
Well so far the only help I think you can give is to argue that I'm completely 
wrong in the endeavour. Of course maybe I am completely wrong; and therefore 
that might be the most useful thing anyone could do, but you haven't yet 
convinced me.

--
Jon Hanna
<http://www.hackcraft.net/>
*Thought provoking quote goes here*
Received on Wednesday, 26 November 2003 07:47:21 UTC