
RE: Explicit Disambiguation Via RDF bNodes, more Process

From: Bill de hÓra <dehora@eircom.net>
Date: Sat, 27 Apr 2002 00:30:07 +0100
To: "'Joshua Allen'" <joshuaa@microsoft.com>, "'Danny Ayers'" <danny666@virgilio.it>, "'Sandro Hawke'" <sandro@w3.org>, <www-rdf-interest@w3.org>
Message-ID: <002601c1ed7a$59dbbad0$887ba8c0@mitchum>

> -----Original Message-----
> From: www-rdf-interest-request@w3.org 
> [mailto:www-rdf-interest-request@w3.org] On Behalf Of Joshua
> Allen  
> This is a poor argument.  Words mean things.  Just because 
> there are rare cases where a word will be used with 
> connotation that is opposite its normal connotation does not 
> mean that words are meaningless.  Are words "too far gone"?
> I don't have to prove that ambiguity never existed to assert 
> that gratuitous ambiguity is a stupid strategy.

Speaking for myself, I'm not trying to be gratuitous. You're
arguing from an extreme point ("rare cases where a word will be
used with [a] connotation that is opposite its normal
connotation"), but really we're on a sliding scale. Away from the
extremes, people get their meanings mixed every day (e.g.,
"violent agreement"). I don't see how we expect things to be
different on a
web sprinkled with RDF.

But I simply cannot take RDF assertions off the web and merge them
with my local data unless I have some confidence that each URI,
http or otherwise, refers to only one thing. But in an open
system I can never be certain. We need to get over this.

What I can do is calculate a probability that a URI seen in two
different data sets refers to the same thing (that's my instinctive
reaction to this problem). Or I can possibly determine sameness by
comparing the properties hanging off the URIs, or doing some type
inference, which is what I understand Danny's suggesting. 
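To make the property-comparison idea concrete, here is a minimal
sketch: score how likely it is that one URI seen in two data sets
denotes the same thing by measuring the overlap of the property/value
pairs hanging off it. The properties, values, and any threshold you
would apply are illustrative assumptions, not anything from this
thread:

```python
# Hedged sketch: estimate "sameness" of a URI across two data sets
# by Jaccard similarity over the (property, value) pairs attached
# to it. All data below is made up for illustration.

def property_overlap(props_a, props_b):
    """Jaccard similarity of two sets of (property, value) pairs."""
    a, b = set(props_a), set(props_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Two descriptions of the same URI pulled from different data sets.
dataset_1 = [("dc:title", "Moby Dick"), ("dc:creator", "Melville")]
dataset_2 = [("dc:title", "Moby Dick"), ("dc:date", "1851")]

score = property_overlap(dataset_1, dataset_2)
print(round(score, 2))  # 1 shared pair out of 3 distinct -> 0.33
```

A real system would weight discriminating properties more heavily and
fold in type inference, but the shape of the calculation is the same:
sameness comes out as a probability, not a certainty.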

This isn't a new problem; information retrieval types have been
dealing with it for decades, and the best efforts there are
invariably statistical or probabilistic. They deal with natural
language, which is another open system (in the sense that anyone
can say anything about anything, and a natural language doesn't
have a centralized namespace, Newspeak notwithstanding). Typically,
as I understand it, logic-based approaches are allowed the
assumption that identifiers identify uniquely, but that's very much
a result of being in a "closed system" (read: I control the
namespace).

Humour some paranoia for a moment. One of the important features of
the semantic web and its predication on openness is the allowance
that anyone can say anything about anything. If this semantic web
stuff becomes important or mission critical, then I have no problem
imagining meta-script-kiddies or worse injecting descriptive junk
about a URL into the system, junk that would confuse which anything
a URL stood for. How would I tell the difference between stupid and
malicious?

> > look at the reality of the URI rather than trying to change it
> > or 
> invent a
> Yes.  The reality is that URIs which use http: refer to 
> something that uses HTTP -- a web page.

The reality is that people are using http: scheme URIs for non-HTTP
things such as namespaces and proper nouns. You and I can call them
stupid or gratuitous until we're blue in the face, but that won't
stop anyone. A number of people are gambling on using URLs this
way, perhaps because they foresee smart machines making GET
requests against them for further information about the thing;
they're keeping open the option that the semantic web will have
URLs as proper nouns, but those URL nouns will also be bound to a
web-accessible thesaurus based on logic and rules of inference.
Really, they're just adding to the confusion. I can look up a word
in a dictionary, but if I want to understand what someone means by
it, I should ask them directly.
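The gamble described above can be sketched in a few lines: a client
treats the URL-as-noun as something it can GET, asking for RDF in the
hope that the server describes the thing the URI names. The URI and
the media type here are illustrative assumptions (the request is only
constructed, not sent):

```python
# Hedged sketch of the "URL as proper noun" gamble: build a GET
# request against the URI itself, asking for an RDF description of
# whatever the URI names. The URI below is a made-up example.
import urllib.request

uri = "http://example.org/ns/Person"  # used as a name, not just a page
req = urllib.request.Request(
    uri, headers={"Accept": "application/rdf+xml"})

print(req.get_full_url())            # http://example.org/ns/Person
print(req.get_header("Accept"))      # application/rdf+xml
```

Whether the thing on the other end answers with a description of the
noun, a web page, or junk is exactly the ambiguity at issue.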

Bill de hÓra


Received on Friday, 26 April 2002 19:37:33 UTC
