Re: Explicit Disambiguation Via RDF bNodes, more Process from Murray Spork on 2002-04-29 (www-rdf-interest@w3.org from April 2002)

From: Murray Spork <m.spork@qut.edu.au>
Date: Mon, 29 Apr 2002 10:59:58 +1000
To: www-rdf-interest@w3.org
Message-id: <3CCC9B0E.5040704@qut.edu.au>

Hi all - hope you don't mind me jumping in here.

Bill de hÓra wrote:

[...]

 > Speaking for myself, I'm not trying to be gratuitous. You're
 > arguing from an extreme point ("rare cases where a word will be
 > used with [a] connotation that is opposite its normal
 > connotation"), but really we're on a sliding scale. Away from the
 > extremes, people get their meanings mixed every day (i.e., "violent
 > agreement"). I don't see how we expect things to be different on a
 > web sprinkled with RDF.
 >
 > But I simply cannot take RDF assertions off the web and merge them
 > with my local data unless I have some confidence that each URI,
 > http or otherwise, is referring to only one thing. But in an open
 > system I can never be certain. We need to get over this.

I agree. Joshua's phrase "gratuitous ambiguity" suggests that he
thinks there is an attempt to "build in" ambiguity into the model.
I'd rather think of it as simply accepting that the representation of
identity is, in reality, ambiguous - attempting to abstract that
ambiguity away is not likely to succeed IMO.

 > What I can do is calculate a probability that a URI seen in two
 > different data sets refers to the same thing (that's my instinctive
 > reaction to this problem). Or I can possibly determine sameness by
 > comparing the properties hanging off the URIs, or doing some type
 > inference, which is what I understand Danny's suggesting.

Again I agree - but is this really a problem that needs solving in RDF
itself, or is it (as I suspect) a problem to be solved with
appropriate tools?

If doing some sort of probabilistic plausibility analysis to resolve
ambiguous identifiers is considered "voodoo", then the idea of a model
in which ambiguous identifiers cannot exist at all is (IMO) far more
far-fetched. If we can't write the tools to substantially solve this
problem then IMO the SW is a non-starter, and no attempt to engineer
out ambiguity can save it.
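
To make the "plausibility analysis" idea concrete, here is a minimal
sketch of the sort of tool I have in mind. It is only an illustration:
triples are plain Python tuples, the example.org URIs are made up, and
the score is a crude Jaccard overlap of the properties hanging off a
URI in each data set - a stand-in for a real probability, not something
anyone in this thread has specified.

```python
# Sketch only: triples are (subject, predicate, object) tuples and every
# example.org URI below is hypothetical.

def description(triples, uri):
    """Collect the (predicate, object) pairs hanging off `uri`."""
    return {(p, o) for s, p, o in triples if s == uri}


def same_thing_score(triples_a, triples_b, uri):
    """Jaccard overlap of the two descriptions of `uri`.

    1.0 means both data sets say exactly the same things about the URI,
    0.0 means they share nothing. This is a heuristic, not a probability.
    """
    a = description(triples_a, uri)
    b = description(triples_b, uri)
    if not a and not b:
        return 0.0  # no evidence either way
    return len(a & b) / len(a | b)


URI = "http://example.org/people#murray"

local = {
    (URI, "http://example.org/terms#type", "http://example.org/terms#Person"),
    (URI, "http://example.org/terms#name", "Murray"),
    (URI, "http://example.org/terms#mbox", "mailto:someone@example.org"),
}
remote = {
    (URI, "http://example.org/terms#type", "http://example.org/terms#Person"),
    (URI, "http://example.org/terms#name", "Murray"),
    (URI, "http://example.org/terms#homepage", "http://example.org/home"),
}

score = same_thing_score(local, remote, URI)
print(score)  # 0.5: two shared statements out of four distinct ones
```

A real tool would presumably weight properties differently (a shared
mailbox says more than a shared type), but the point is that this lives
in the tooling, not in the model.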

My suggestion would be that, in the absence of conflicts
(inconsistencies), we assume each URI identifies one unambiguous thing.
But if conflicts arise when merging RDF graphs - for example, where two
objects are ostensibly the same object (they have the same URI) but are
used in statements that imply incompatible typing - then we resolve the
inconsistencies at that point, using the appropriate tools and
techniques (such as the "qua" technique suggested by Danny Ayers).

This seems to me to be a practical and workable solution that hits the 
appropriate 80/20 point.
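
A rough sketch of that "merge first, resolve on conflict" approach,
under some loud assumptions: triples are plain tuples, the example.org
classes are invented, and "incompatible typing" is modelled as a
hand-supplied list of class pairs declared disjoint. The code only
detects conflicts; how they are then resolved (the "qua" technique or
anything else) is deliberately left out.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical classes, for illustration only.
PERSON = "http://example.org/terms#Person"
DOCUMENT = "http://example.org/terms#Document"


def merge(*graphs):
    """Naive merge: the union of all triples (no blank-node handling)."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def find_type_conflicts(graph, disjoint_pairs):
    """Map each URI to the disjoint class pairs it is typed with.

    `disjoint_pairs` is an assumption of this sketch: an iterable of
    two-element frozensets naming classes that cannot share an instance.
    URIs with no clash are omitted, i.e. assumed unambiguous.
    """
    types = defaultdict(set)
    for s, p, o in graph:
        if p == RDF_TYPE:
            types[s].add(o)
    conflicts = {}
    for uri, classes in types.items():
        clashes = [pair for pair in disjoint_pairs if pair <= classes]
        if clashes:
            conflicts[uri] = clashes
    return conflicts


murray = "http://example.org/murray"
other = "http://example.org/other"

local = {(murray, RDF_TYPE, PERSON), (other, RDF_TYPE, PERSON)}
remote = {(murray, RDF_TYPE, DOCUMENT), (other, RDF_TYPE, PERSON)}

merged = merge(local, remote)
conflicts = find_type_conflicts(merged, [frozenset({PERSON, DOCUMENT})])

for uri in conflicts:
    print("needs disambiguation:", uri)
```

Only the URI typed as both a person and a document gets flagged; the
other one is simply taken at face value, which is the 80/20 behaviour
I'm arguing for.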

Cheers,

--
Murray Spork
School of Information Systems, Faculty of Information Technology
The Redcone Project
http://redcone.gbst.com/
Queensland University of Technology, Brisbane, Australia
Phone: +61-7-3864-4246
Email: m.spork@qut.edu.au

Received on Sunday, 28 April 2002 20:56:49 UTC