Re: Distributed querying on the semantic web from Phil Dawes on 2004-04-20 (www-rdf-interest@w3.org from April 2004)

From: Phil Dawes <pdawes@users.sf.net>
Date: Tue, 20 Apr 2004 10:35:51 +0100
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: www-rdf-interest@w3.org
Message-ID: <16516.61175.442954.29175@gargle.gargle.HOWL>
Hi Peter,

[My Ramblings snipped - see rest of thread for info]

Peter F. Patel-Schneider writes:
 > 
[...]
 >
 > Well, yes, but I don't think that the scheme that you propose is workable
 > in general.  Why not, instead, use information from the document in which
 > the URI reference occurred?  I would claim that this information is going to
 > be at least as appropriate as the information found by using your scheme.
 > (It may, indeed, be that the document in which the URI reference occurs
 > does point to the document that you would get to, perhaps by using an
 > owl:imports construct.  This is, to me, the usual way things would occur,
 > but I view it as extremely important to allow for other states of affairs.)
 >

Unfortunately, most of the RDF I consume doesn't contain this
contextual linkage information (or even appear in well-formed
documents). Take RSS 1.0 feeds for example: if there's a term I don't
know about, the RSS feed doesn't contain enough context information
for my SW agent to get me a description of that term.

I'd like a facility for doing this that doesn't rely on a centralised
search-engine.

Ideally I'd also like a facility to search based on any URI: if
somebody is selling bicycles, I'd like my agent to be able to find
other bicycle sellers in order to compare prices etc.


  [... more snipping ...] 

 > > 
 > > I suspect that for trust to work on any implementation of an
 > > open-world semantic web (centralised or decentralised), the authority
 > > of a statement will have to be decoupled from the location it was
 > > discovered. If that is the case, then it won't matter if you serve the
 > > statement or it comes from somewhere else.
 > 
 > Aaah, but it really does matter who serves a statement.  Perhaps not to the
 > model-theoretic semantics of the Semantic Web, but certainly the source of
 > information matters in the external social (and legal) world.  Ignoring the
 > fact that the Semantic Web is part of our imperfect and messy world is not
 > going to be helpful for its widespread adoption.
 > 

I agree with you that the source of the statement is important, but
disagree with your implied definition of source.

I view this as a user-interface issue: The only reason that you
attribute the document contents to the 'owner' of the location that
served the document is that at present this is a reasonably good
guess.  Having said that, you wouldn't attempt to sue Google for a
page you read from its cache.

The pages that you view on the internet come via your internet service
provider, but you don't attribute the legal authority of these
pages to your ISP. (Although people did try to do this in the early
days of the web).

Similarly, if you have a knowledge agent that hunts the internet for
information, you become less aware of the location that individual
statements were loaded from. Better facilities will exist for
automatically discovering the source of a statement than applying an
'it came from this location so it must be asserted by them' heuristic.

 > 
 > > Without such a facility on the semantic web, I struggle to see how it
 > > will be bootstrapped to deal with open queries. At present, there is
 > > no real 'web' of information to search.
 > 
 > Well, I think that there already exists the mechanism to have a truly
 > decentralised web of information, with no central authorities or
 > information sources, namely the owl:imports construct.  It is not perfect,
 > but I think that it is better than trusting in the absence of a mechanism
 > for supporting a network of trust.
 > 

True. But it doesn't allow me to do open queries. 


 > All this said, I have nothing particularly against imposing a notion of
 > authority or mandating trust relationships in particular situations.  If it
 > is indeed the case that contact information for employees is (normally, or
 > even, often) stored at particular locations, then it can be useful to
 > impose a trust relationship from outside the Semantic Web to get
 > applications to use this information.  I view this as a (partial)
 > failure of the goals of the Semantic Web, but it may be the case that this
 > is the best that can be done.
 > 
 > However, I would consider it to be much more in line with the goals of the
 > Semantic Web to instead have a document that explicitly points to these
 > employee documents to establish the trust relationship.  

My experience has been that once you start writing SW applications,
the notion of 'document' becomes clumsy and doesn't provide much
value. For example, we have lots of RDF published in documents at
work, but typically applications don't go to these documents to get
this information - they query an RDF knowledge base (e.g. Sesame)
which sucks data in from these documents.
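
To make that pattern concrete, here is a minimal sketch in plain
Python. It is not Sesame's API: the KnowledgeBase class, its load and
query methods, the ex: terms and the example.org URLs are all
hypothetical names made up for illustration. Triples from several
documents are pulled into one store, each tagged with the location it
was loaded from, and the application then queries the store by
pattern rather than visiting any document.

```python
from collections import namedtuple

# Hypothetical names throughout; this is an illustration, not Sesame's API.
Statement = namedtuple("Statement", "subject predicate obj source")


class KnowledgeBase:
    def __init__(self):
        self.statements = []

    def load(self, source, triples):
        """Pull triples from one document, recording where each was loaded from."""
        for s, p, o in triples:
            self.statements.append(Statement(s, p, o, source))

    def query(self, subject=None, predicate=None, obj=None):
        """Return statements matching the pattern; None acts as a wildcard."""
        return [st for st in self.statements
                if subject in (None, st.subject)
                and predicate in (None, st.predicate)
                and obj in (None, st.obj)]


kb = KnowledgeBase()
# Assumed example documents and terms, invented for this sketch.
kb.load("http://example.org/shop-a.rdf", [
    ("ex:shopA", "rdf:type", "ex:BicycleSeller"),
    ("ex:shopA", "ex:price", "120"),
])
kb.load("http://example.org/shop-b.rdf", [
    ("ex:shopB", "rdf:type", "ex:BicycleSeller"),
])

# The application asks the store, not the individual documents.
sellers = kb.query(predicate="rdf:type", obj="ex:BicycleSeller")
print(sorted(st.subject for st in sellers))
```

The source field here only records where a statement was loaded from;
it says nothing about who asserted it.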

 > Even better would be
 > to have some mechanism that implicitly points to these employee documents,
 > but I do not believe that there are currently any mechanisms in the
 > Semantic Web for so doing.

That is the mechanism I am attempting to envisage. I suspect that
trust and authority are going to be much simpler problems to solve
(simply by signing statements, and treating any unsigned statements
with suspicion) than that of decentralised information discovery to
support open queries.
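
As a rough sketch of the "sign statements, distrust unsigned ones"
idea: the code below uses a shared-secret HMAC from the Python
standard library purely as a stand-in for a real public-key
signature, which is what you would actually want so that anyone can
verify without holding the secret. The function names, the key and
the example statement are all hypothetical.

```python
import hashlib
import hmac


def sign(statement, key):
    """Sign a (subject, predicate, object) triple.

    Assumption: HMAC-SHA256 stands in for a proper public-key signature.
    """
    message = "\x1f".join(statement).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_trusted(statement, signature, key):
    """Treat unsigned or wrongly signed statements with suspicion."""
    if signature is None:
        return False
    return hmac.compare_digest(sign(statement, key), signature)


key = b"hypothetical-publisher-key"
stmt = ("ex:alice", "ex:phone", "555-0100")
sig = sign(stmt, key)

print(is_trusted(stmt, sig, key))   # signed by the publisher
print(is_trusted(stmt, None, key))  # unsigned statement
```

Note that the check depends only on the statement and its signature,
not on the location the statement was fetched from.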

The problem is that if we don't do this soon, a number of centralised
spike solutions will appear based on harvesting all the RDF in the
world and putting it in one place (e.g. 'Google marketplace').

Cheers,

Phil
Received on Tuesday, 20 April 2004 06:37:16 UTC