Re: Meaning of URIRefs from Sandro Hawke on 2002-10-25 (www-rdf-interest@w3.org from October 2002)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 25 Oct 2002 16:51:30 -0400
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
cc: phayes@ai.uwf.edu, www-rdf-interest@w3.org
Message-Id: <200210252051.g9PKpUQ25871@wadimousa.hawke.org>
> From: Sandro Hawke <sandro@w3.org>
> Subject: Re: Meaning of URIRefs 
> Date: Fri, 25 Oct 2002 15:11:49 -0400
> 
> > > >The way around these problems is to split definitional content and
> > > >general content into different documents.
> > > 
> > > The trouble with this kind of reply is that it requires all web users 
> > > to obey unspoken and hard-to-define rules of good behavior.  Note 
> > > that you, Sandro, were at pains to explain in your earlier message 
> > > that 'definitional' didn't have any sharp meaning. A very good point; 
> > > but then it is hardly reasonable to immediately require that all web 
> > > users segregate their content into documents according to whether or 
> > > not they satisfy this meaningless distinction.
> > >
> > > > But more seriously, we can SAY things like this all we want, but 
> > > people will not in fact do it. Are you going to try to tell, say, 
> > > Nokia, that they must segregate all their RDF content into different 
> > > documents according to whether or not they are considered 
> > > 'definitional'? They would (correctly) laugh at you.
> > 
> > (There's some lag in this message chain, so perhaps I've already
> > addressed this to your satisfaction.  But I'll err on the verbose
> > side.)
> > 
> > I can suggest to Nokia that if they segregate their data carefully,
> > they will be facilitating use and reuse of the data.  It's not unlike
> > HTML web sites, where there is ongoing tension between
> > lazy/short-sighted design and good design, where good design has all
> > sorts of benefits which may not be immediately obvious.
> 
> 
> > The segregation into definitional and non-definitional triples is
> > certainly an art, but perhaps it's not as black as it seems.  DAML
> > users don't seem to have much trouble distinguishing between ontologies
> > and instance data.  
> 
> The distinction between ontologies and instance data is very different from
> the definitional vs non-definitional distinction.  In any case, drawing the
> line between ontologies and instance data is not easy at all.
> 
> > I haven't seen a lot of URIRefs pointing into
> > instance data from other instance data, and I'm not sure what exactly
> > people intend when they do that.  
> 
> Well, all my examples about George-Bush-the-lesser require the use of a
> well-known identifier for that George Bush.  It sure seems to me that these
> are all examples of pointers from instance data to other instance data.
> Just about any application of the semantic web that I can think of, from
> travel planning to phone routing to fully automated assistants, requires a
> network of instance data on multiple sites with many references between
> them.

I'm not forbidding multiple instance data files from using the same
URIRef for something.  I'm just trying to help make sure they're using
it for the same thing by providing a common bit of information about
it which they all share, which they all agree to in order to use the
term.

Isn't there a social convention that you don't use a word except in
accordance with its generally-accepted meaning?  And if you don't know
what meaning is generally accepted, then you don't use the word.  (Of
course there are exceptions, ways to indicate use-without-confidence,
etc, but I believe this convention is the norm.)

On the Semantic Web, in the next few years, how is one to know the
generally-accepted meanings of terms?   

The proposal I'm advocating is that we should consider the
"generally-accepted meaning" of a URI-Ref in RDF to be exactly the
meaning that is conveyed by the text at the URI.  
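
As a minimal sketch of what "the text at the URI" could mean in
practice, assuming the defining document is simply the URIRef with its
fragment removed (the example.org URIs and both function names are
hypothetical, made up for illustration):

```python
from urllib.parse import urldefrag

def defining_document(uriref):
    """Split a URIRef into (document URI, fragment naming the term)."""
    doc, frag = urldefrag(uriref)
    return doc, frag

def same_defining_document(a, b):
    """True when two URIRefs point into the same document."""
    return defining_document(a)[0] == defining_document(b)[0]

# Hypothetical terms: two from one vocabulary document, one from another.
tiger = "http://example.org/terms#Tiger"
mammal = "http://example.org/terms#Mammal"
other_tiger = "http://example.net/zoo#Tiger"
```

Under that assumption, anyone using tiger or mammal would be agreeing
to whatever http://example.org/terms says, while other_tiger carries
the content of a different document.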

I'm not sure what you'd consider the generally-accepted meaning of
"George-Bush-the-lesser".  I guess something like "the one of two
famous people named 'George Bush' who is in some overwhelming sense
less than the other."  It's a wonderful example of an English term
which has the same kind of tainted definition you're worried about my
proposal somehow forcing you to use.

Why is it not sufficient that my proposal gives you freedom to pick
whatever taint you want, trading it off against the ease and/or
ability of recognizing when you're talking about the same thing as
what someone else defines differently?    I don't see natural language
doing any better, and that makes me think we can't do any better.


> > Does someone have some real use cases where this segregation does NOT
> > seem like good information architecture?  
> 
> Which segregation?  It certainly does not always make sense to require that
> ontologies do not contain any instance data.  To pick a slightly outdated
> example, how can you refer to ``friends of Bill'' without mentioning Bill
> Clinton?  Yet ``friends of Bill'' sure seems ontological to me, and,
> although this claim might be disputed by many in the far right, Bill
> Clinton is an individual, not a class or category.
> 
> It also is extraordinarily difficult to make a distinction between
> definitional and non-definitional information.  For example, is it
> definitional that tigers are mammals?  Is it definitional that tigers come
> from India?  Is it definitional that tigers are an endangered species?  Is
> it definitional that tigers are a symbol of royalty?  Is it definitional
> that tigers are to be revered?  Is it definitional that Tigger is a tiger?

When I put myself in the shoes of someone actually trying to publish
an ontology, and concerned that I don't want to say anything which
might offend or otherwise put off possible users, most of these issues
become clear.  For your specific questions, the answers depend largely
on what kind of ontology I'm trying to publish.  Any sort of general
definition of tiger, except perhaps involving reference to the
extensive scientific literature about Panthera tigris, is of course
daunting.

My understanding of the practice of Ontology (correct me if I'm wrong)
is that it allows you to bite off this kind of problem in a chunk size
suitable to your application.  Whatever features serve to distinguish
tigers in your applications are the ones you should put in your
definitions.  No more, and no less.  If those features work for other
people, they can happily use your ontology and your URIRefs.  If they
don't, they'll have to develop something different.

Down the road, if someone's interested in both tiger ontologies, they
can feed them both to their DL reasoner and ask it which tiger
instances belong to a:Tiger and b:Tiger, based on the various instance
features available.   
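
A rough sketch of that comparison, with plain set inclusion standing in
for a real DL reasoner (it is not one).  The feature names, the two
definitions, and the instance data are all invented for illustration:

```python
# Hypothetical definitions: the features each ontology requires of a tiger.
A_TIGER = {"striped", "feline", "carnivore"}
B_TIGER = {"feline", "carnivore", "native_to_asia"}

# Hypothetical instance data: the features asserted about each individual.
INSTANCES = {
    "ex:shere_khan": {"striped", "feline", "carnivore", "native_to_asia"},
    "ex:tigger": {"striped", "feline", "carnivore", "bouncy"},
    "ex:snow_leopard": {"feline", "carnivore", "native_to_asia"},
    "ex:housecat": {"feline", "carnivore"},
}

def members(definition, instances):
    """Names of instances that have every feature the definition requires."""
    return {name for name, feats in instances.items() if definition <= feats}

def compare(def_a, def_b, instances):
    """Sort instances by which of the two definitions they satisfy."""
    a = members(def_a, instances)
    b = members(def_b, instances)
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }

result = compare(A_TIGER, B_TIGER, INSTANCES)
```

Instances in "both" are where the two ontologies agree; the other two
lists show where a:Tiger and b:Tiger part ways.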

> >    -- sandro
> 
> Peter F. Patel-Schneider
> Bell Labs Research

  -- sandro
Received on Friday, 25 October 2002 16:52:13 UTC