
Re: homonym URIs (Re: What if an URI also is a URL)

From: Pierre-Antoine Champin <swlists-040405@champin.net>
Date: Wed, 13 Jun 2007 15:21:14 +0200
Message-ID: <466FEF4A.6060205@champin.net>
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Pat Hayes <phayes@ihmc.us>, Sandro Hawke <sandro@w3.org>, semantic-web@w3.org

I've been following this thread with interest. I mostly agreed with
Sandro's position, but Pat's position made me think a lot... and I
quite like it (though Richard also has a point against it, IMHO).

I think (like Pat, if I read him correctly) that punning/overloading
cannot be avoided. I would add that it can be deliberate, for practical
reasons (e.g. e-mail address / person, predicate / function), but it can
also be *unintentional*. Let me explain:

We keep using the same word for slightly different things (e.g. a city
as an administrative entity or as a populated location), as long as the
difference between them is not relevant to us. The same will be true of
URIs that we will create and put in RDF. We cannot expect everybody on
the web to require the same level of detail on every part of the world
about which they make RDF statements.

So I'm getting more and more convinced that we cannot afford to rule out
ambiguity as "weak design", and should devise means to cope with it.

What I'm not sure about is whether deliberate and unintentional
overloading can be handled by the same means -- they seem to be opposite
in nature (in the first case, the objects are too different to require
different words; in the second case, they are too similar), but they
also come from the same requirement of efficiency / practicality.

Another problem is that the relation between two objects denoted by the
same term may not be functional, which leads us to the problem raised by
Richard. My intuition is that owl:sameAs may be too strong a statement
in a context where URIs can be ambiguous.

Would

<http://richard.cyganiak.de/>
   :samePerson <http://ontoworld.org/wiki/Richard_Cyganiak> .

be a workaround? (just a thought)
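
To make the question concrete, here is a minimal Python sketch (not a real
OWL reasoner) of why the weaker link might help. It assumes that owl:sameAs
licenses substituting one URI for the other in every statement, and that the
hypothetical :samePerson property entails nothing of the kind. Triples are
plain tuples, and the function names are illustrative only.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples.
DE = "http://richard.cyganiak.de/"
EN = "http://ontoworld.org/wiki/Richard_Cyganiak"

base = {
    (DE, "dc:language", "de"),
    (EN, "dc:language", "en"),
}


def sameas_closure(triples):
    """Naively apply owl:sameAs substitution until nothing new is derived."""
    closed = set(triples)
    while True:
        # Collect sameAs pairs in both directions (sameAs is symmetric).
        pairs = {(s, o) for (s, p, o) in closed if p == "owl:sameAs"}
        pairs |= {(o, s) for (s, o) in pairs}
        derived = set()
        for a, b in pairs:
            for s, p, o in closed:
                if p == "owl:sameAs":
                    continue
                # Replace a by b in subject and object position.
                derived.add((b if s == a else s, p, b if o == a else o))
        if derived <= closed:
            return closed
        closed |= derived


def languages(triples, uri):
    """Return the sorted dc:language values stated about uri."""
    return sorted(o for (s, p, o) in triples if s == uri and p == "dc:language")


# Strong link: both pages end up with both languages (Richard's problem).
strong = sameas_closure(base | {(DE, "owl:sameAs", EN)})
# Weak link: :samePerson is assumed to carry no substitution semantics.
weak = sameas_closure(base | {(DE, ":samePerson", EN)})

print(languages(strong, DE))
print(languages(weak, DE))
```

With the owl:sameAs link the German page picks up the English page's
language as well; with :samePerson each page keeps only its own statements,
at the price of not propagating anything about the person either.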

  pa


Richard Cyganiak wrote:
> 
> 
> On 12 Jun 2007, at 22:07, Pat Hayes wrote:
>>> To pick up just one point: Where do you draw the line between harmful
>>> punning and efficiency-increasing punning? Any rules of thumb for
>>> when it is OK? Why is it OK to pun with email addresses, but not with
>>> wives?
>>
>> Because people and email addresses are so different that almost
>> nothing you ever want to say about or do to one is ever said about or
>> done to the other. If you email to PatHayes, you must have meant to
>> PatHayes' email address. If you assert that my email address has two
>> children, you must have meant me. With two people (or two mailboxes)
>> however, things are different. There really is no way to tell then
>> which is meant: you can't locally disambiguate the punning.
> 
> Here are two web pages about me:
> 
>    <http://richard.cyganiak.de/>
>    <http://ontoworld.org/wiki/Richard_Cyganiak>
> 
> One is in German, the other in English:
> 
>    <http://richard.cyganiak.de/> dc:language "de" .
>    <http://ontoworld.org/wiki/Richard_Cyganiak> dc:language "en" .
> 
> You say it's OK to use a web page URL to denote the person it's about, so:
> 
>    <http://richard.cyganiak.de/> a foaf:Person .
>    <http://ontoworld.org/wiki/Richard_Cyganiak> a foaf:Person .
> 
> Both clearly denote the same person, so we can confidently state:
> 
>    <http://richard.cyganiak.de/>
>       owl:sameAs <http://ontoworld.org/wiki/Richard_Cyganiak> .
> 
> This allows us to conclude:
> 
>    <http://richard.cyganiak.de/> dc:language "de" .
>    <http://richard.cyganiak.de/> dc:language "en" .
> 
> Which is obviously wrong. So what did I do?
> 
> 1. I used the DC, FOAF, and OWL vocabularies, which are used in exactly
> this way all over the Semantic Web.
> 2. I used an inference rule sanctioned by the OWL specifications, which
> is used all over the Semantic Web.
> 3. I used your claim that punning is OK.
> 
> And I arrived at an incorrect conclusion. Why, Pat?
> 
>> So the rule of thumb, which can be made operationally quite precise,
>> is that punning is OK if (there is a very high probability that) there
>> is enough contextual information available at the point of use to
>> figure out which of the various meanings is intended.
> 
> I think on the open Semantic Web, there is a very high probability that
> your URI will end up in places where that contextual information is not
> available and thus the information consumer cannot figure out which of
> the various meanings was intended. It seems to me that, following your
> own guideline, we'd have to conclude that punning on the Semantic Web is
> almost never OK.
> 
> Richard
> 
> 
>>
>> Pat
>>
>>>
>>> Cheers,
>>> Richard
>>>
>>>
>>>> But the appropriate thing to say is not to denigrate punning, but to
>>>> explain what is wrong with doing it badly.
>>>>
>>>>>
>>>>>>  And what about a URI
>>>>>>  that I own and wish it to denote, say, the planet
>>>>>>  Venus, or my pet cat? What do I do, to attach the
>>>>>>  URI to my intended referent for it?
>>>>>
>>>>> You publish a document (an ontology) so it's available through that
>>>>> URI.
>>>>> If it's a hash URI, you publish the ontology at the non-hash version.
>>>>> If it's a slash URI, you publish the ontology at the far end of a 303
>>>>> redirect.  And you content-negotiate HTML and RDF.
>>>>>
>>>>> So when users paste that URI into their browser, they get the official
>>>>> documentation about it.
>>>>
>>>> None of that attaches a URI to my cat (though see below)
>>>>
>>>>> And when RDF software dereferences that URI, it gets some logical
>>>>> formulas which should be understood (like the HTML) to be asserted
>>>>> by the URI's owner/host/publisher.  Those formulas constrain the
>>>>> possible meanings of that URI, relative to other URIs.
>>>>
>>>> Neither does any of that (and in this case, I can *prove* it, using
>>>> Herbrand's theorem.)
>>>>
>>>>>  They can't nail a URI to
>>>>> Venus
>>>>
>>>> Quite. In fact, none of this can nail a URI to ANYTHING other than
>>>> something accessible using a transfer protocol.
>>>>
>>>>> , but they can use other ontologies to provide useful (and possibly
>>>>> very constraining) information, like that it's an astronomical body
>>>>> with a mass of about 5e+24kg.
>>>>
>>>> You are begging the question. Suppose an ontology asserts
>>>>
>>>> ex:Venus rdf:type ex:AstronomicalBody .
>>>>
>>>> Now, what ties that object URI to the actual concept of being an
>>>> astronomical body? And so on for all the other URIs in all the other
>>>> OWL/RDF ontologies. The best you can do is to appeal to the power of
>>>> model theory to sufficiently constrain the interpretations of the
>>>> entire global Web of formalized information. But that argument from
>>>> Herbrand's theorem (basically, if it has a model at all then it has
>>>> one made entirely of symbols) applies just as well no matter how
>>>> large the ontology is.
>>>>
>>>> The only way out of this is to somewhere appeal to a use of the
>>>> symbolic names - in this case, the IRIs or URIrefs - outside the
>>>> formalism itself, a use that somehow 'anchors' or 'grounds'  them to
>>>> the real world they are supposed to refer to. If we all assume that
>>>> English words are so grounded (not a bad assumption) then this can
>>>> be done in principle by using the URI in English sentences or to
>>>> other kinds of representation which are widely accepted as
>>>> real-world identifiers, like SS numbers or facial images. I did all
>>>> three in
>>>>
>>>> http://www.ihmc.us/users/phayes/PatHayes.html
>>>>
>>>> If the TAG said this somewhere, and recommended how to do it, that
>>>> would be great.
>>>>
>>>>>
>>>>> My advice here is, I confess, not widely followed.  But I hear more
>>>>> and more people converging on the idea that this is both practical
>>>>> and likely to be sufficiently effective.
>>>>
>>>> I agree. Still, it's important to describe it properly. It doesn't
>>>> mean that URIs have a unique denotation.
>>>>
>>>>>
>>>>>>  The point surely is that URIs used to refer (not
>>>>>>  as in HTTP, but as in OWL) do *not* have a
>>>>>>  standardized meaning. Standards are certainly a
>>>>>>  chore to create, but they only go so far. OWL
>>>>>>  defines the meanings of the OWL namespace, but it
>>>>>>  does not define the meanings of the FOAF
>>>>>>  vocabulary,
>>>>>
>>>>> No, that's up to the owner(s) of the FOAF terms.
>>>>>
>>>>>>  or the URIrefs used in, say,
>>>>>>  ontologies published by the NIH or by JPL.
>>>>>
>>>>> And that's up to the NIH and JPL, respectively.
>>>>
>>>> I understand that. I was reacting to Tim's comments, which seemed to
>>>> suggest that all this should be determined by standards-setting groups.
>>>>
>>>>>
>>>>>>  The
>>>>>>  only way those meanings can be specified is by
>>>>>>  writing ontologies: and finite ontologies do not
>>>>>>  - cannot possibly - nail down referents
>>>>>>  *uniquely*.
>>>>>
>>>>> Ah -- there we go.  There must be a long history of this subject in
>>>>> philosophy.  Can things ever be nailed down uniquely?  I haven't a
>>>>> clue.  But that's the wrong question.
>>>>
>>>> Surely this is exactly the question. I didn't raise the issue, Tim
>>>> did. There is a claim, often repeated and sometimes cited as
>>>> doctrine, that a URI *must* identify a *single* referent. To do this
>>>> requires that things are nailed down uniquely (isn't that EXACTLY
>>>> what it says?) but they can't be.
>>>>
>>>>>  In this thread, I don't think we're
>>>>> talking about whether we can really be sure what we mean when we say
>>>>> such a URI denotes Venus.
>>>>
>>>> Well then don't SAY that is what you are concerned with, for
>>>> goodness's sake. That is what is implied by "the URI for Venus has a
>>>> unique denotation".
>>>>
>>>>>  Instead, we're talking about whether it's a
>>>>> good practice to use a single URI to denote clearly distinct things
>>>>
>>>> Aaaaargh. What do you think is 'clearly' distinct?
>>>>
>>>> The second rock from the sun might be a continuant or an occurrent.
>>>> Those are as clearly distinct as a rock and a Roman goddess. I know
>>>> people are a lot more familiar with the second kind of clearly
>>>> distinct, but ontologies aren't people. And the first kind of
>>>> difference is more important, if anything, than the second, for an
>>>> ontology. The second kind of muddle is easily resolved. The first
>>>> kind can be fatal.
>>>>
>>>>> ,
>>>>> such as:
>>>>>    (1) the second rock from the sun
>>>>>    (2) the Roman goddess of love
>>>>>    (3) a star tennis player
>>>>>    (4) ... etc
>>>>> The term "ambiguity" covers both these issues, but we don't need to
>>>>> combine them.
>>>>
>>>> Well, you tell me how to distinguish them, then.
>>>>
>>>>>  The first is a kind of imprecision, a fuzziness
>>>>
>>>> No, it's worse than that. It's like the distinction between an object
>>>> and a process. Fuzziness/imprecision is what gives you the 'Everest'
>>>> kind of examples.
>>>>
>>>>> , while
>>>>> the second is the re-use of a word for a second meaning, a homonym.
>>>>> (Homonyms seem to be called "overloading" in computer programming.)
>>>>>
>>>>> I think we know how to work with homonyms, but since we're
>>>>> engineering a new system, it seems like a good design decision to
>>>>> forbid them, doesn't it?
>>>>
>>>> Well, actually, no. Overloading is widely used for good engineering
>>>> reasons. And on an open system like the Web, we aren't going to be
>>>> able to prevent it happening, so we will need to have methods of
>>>> dealing with it. Once those are deployed, one might as well take
>>>> advantage of them. Making grand statements about what should be done
>>>> seems to me like trying to tell evolution what it ought to be doing.
>>>>
>>>> Pat
>>>> -- 
>>>> ---------------------------------------------------------------------
>>>> IHMC        (850)434 8903 or (650)494 3973   home
>>>> 40 South Alcaniz St.    (850)202 4416   office
>>>> Pensacola            (850)202 4440   fax
>>>> FL 32502            (850)291 0667    cell
>>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>
>>
>>
>>
> 
> 
Received on Wednesday, 13 June 2007 13:21:24 UTC
