Re: SOAP, RDF and Semantic Web screenscraping... from Alex Barnell on 2000-09-08 (www-rdf-interest@w3.org from September 2000)

From: Alex Barnell <aeb99@doc.ic.ac.uk>
Date: Fri, 08 Sep 2000 23:54:45 +0100
To: www-rdf-interest@w3.org
Message-ID: <39B96E35.EE1AC3E5@doc.ic.ac.uk>

Didier PH Martin wrote:

> You have probably recognized here the CC/PP stuff. A peer to peer network
> could be useful to resolve a resource capability request. Now that you have
> the context, what are your ideas on the info exchange API between stores. we
> may try to create a chain of stores containing capability record with the
> following algorithm
> 
> a) the client request for a capability record,
> b) the server does not have it and ask to a linked store for it
> c) and so forth until one gets it, return the RDF record to the previous
> calling store... and so forth until the first requested store gets it and
> reply to the client with an RDF element.
> 
> Your ideas?
> 

The Freenet project hopes to develop an RDF search engine using the look-up
mechanism described above. Freenet already uses this distributed routing
algorithm to retrieve documents with unique URIs. For example,
freenet:CHK@jF41PhEa7kmrNLJQmsYbQadi5-cDAQ,if86xIl3SeKDjVoSnhJoMA
is the URI for DeCSS (CHK stands for Content Hash Key; the data after CHK is
the Base-64 encoding of the DeCSS MD5). Freenet servers recursively request
the data until it is found, and it is then cached on every node back along
the search path. Whenever a document passes through a node, the node records
a reference to the address of the node where the data was found. At each
node, the routing path for document inserts and requests is chosen by
finding the address of the reference whose key is lexicographically closest
to the search key of the current request (a more detailed explanation can be
found at freenet.sourceforge.net).
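
As a rough illustration of the routing described above, here is a minimal
Python sketch. It is not Freenet's actual code: the Node class, the
hops_to_live limit, the node names, and the particular distance() used to
approximate "lexicographically closest" are all assumptions made for the
example.

```python
def distance(a, b):
    """Hypothetical closeness measure: a longer shared prefix is closer,
    then a smaller gap at the first differing character."""
    prefix = 0
    for x, y in zip(a, b):
        if x != y:
            break
        prefix += 1
    if prefix < min(len(a), len(b)):
        gap = abs(ord(a[prefix]) - ord(b[prefix]))
    else:
        gap = abs(len(a) - len(b))
    return (-prefix, gap)


class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> data held or cached on this node
        self.refs = {}   # key -> node where data for that key was found

    def request(self, key, hops_to_live=10, visited=None):
        """Return (data, source_node), or (None, None) if the search fails."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if key in self.store:
            return self.store[key], self
        if hops_to_live == 0:
            return None, None
        # Try references in order of key closeness to the requested key.
        for ref_key in sorted(self.refs, key=lambda k: distance(k, key)):
            node = self.refs[ref_key]
            if node.name in visited:
                continue
            data, source = node.request(key, hops_to_live - 1, visited)
            if data is not None:
                self.store[key] = data   # cache on the way back
                self.refs[key] = source  # remember where it was found
                return data, source
        return None, None


# A -> B -> C chain; D is a poorly matching reference that is never tried.
a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
c.store["key-abc"] = "capability record (RDF/XML)"
a.refs = {"zzz": d, "key-abd": b}
b.refs = {"key-abc": c}
visited = set()
data, source = a.request("key-abc", visited=visited)
```

After the request, both A and B hold a cached copy of the record and a
reference to C, which is how popular data ends up mirrored along the paths
that request it.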

How is this relevant to RDF discovery?

Freenet works well because popular data is mirrored widely across the
network. The same should apply to an RDF look-up mechanism: popular RDF
graphs will get mirrored over many locations. A data-democracy would be
created, where the RDF that users consider valuable is also the most
available.

There is almost a consensus amongst the Freenet developers that the RDF
model should be used for these distributed metadata queries. The main
problem to overcome is how to route metadata search requests efficiently.
Most ideas to date have been based on fuzzy searching, where you choose the
address of the reference to the RDF graph that has the highest fuzzy-match
score for the search criteria (this can be calculated by using approximate
matching on string literals). There is a Freenet simulator written in Java,
named Serapis, which can be used to simulate the efficiency of search
algorithms.
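
To make the fuzzy-routing idea concrete, here is a small Python sketch. It
is only one possible reading, not a design agreed by the Freenet developers:
it assumes each reference carries an RDF graph given as (subject, predicate,
object) string tuples, treats every object as a string literal, and uses
difflib's similarity ratio as the approximate match. The function names,
node addresses, and sample triples are hypothetical.

```python
from difflib import SequenceMatcher


def literal_score(term, literal):
    """Approximate string match in [0, 1], ignoring case."""
    return SequenceMatcher(None, term.lower(), literal.lower()).ratio()


def graph_score(query_terms, triples):
    """Average, over the query terms, of the best match against any literal."""
    literals = [obj for _, _, obj in triples]
    if not literals or not query_terms:
        return 0.0
    best = [max(literal_score(t, lit) for lit in literals) for t in query_terms]
    return sum(best) / len(best)


def choose_reference(query_terms, references):
    """Pick the address whose graph has the highest fuzzy-match score.
    `references` maps a node address to the RDF triples seen for it."""
    return max(references, key=lambda addr: graph_score(query_terms, references[addr]))


references = {
    "node-a.example": [("doc1", "dc:title", "Introduction to Freenet routing")],
    "node-b.example": [("doc2", "dc:title", "Cooking with garlic")],
}
query = ["freenet routing"]
chosen = choose_reference(query, references)
```

Here the request would be forwarded to node-a.example, since its title
literal is much closer to the query than the other graph's.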

The goal of a future Freenet search engine will be to make RDF uncensorable,
and the authoring and searching of RDF anonymous. The data caching in these
distributed look-ups goes a long way to achieve these goals. It would be
great if Freenet could be compatible with other RDF search networks, to
allow RDF to migrate across networks.

Received on Friday, 8 September 2000 19:00:47 UTC