
Re: Can hash URI description lookups be made to scale?

From: Alexander Löser <aloeser@cs.tu-berlin.de>
Date: Fri, 30 Jan 2004 11:59:01 +0100
Message-ID: <401A38F5.927F172@cs.tu-berlin.de>
To: pdawes@users.sourceforge.net
Cc: www-rdf-interest@w3.org

Phil,
An interesting way to store URIs scalably is to hash them into a
distributed hash table (DHT) such as Chord, CAN, Pastry, Tapestry or
P-Grid. However, in 'traditional' DHT networks keys are hashed
consistently and are therefore distributed evenly among the nodes, so a
node may end up holding data that belongs to someone else. To avoid the
problem of storing other people's person descriptions, you can use
so-called "hash key to query references" in the DHT. Consider the
following example profile:

<URI="ABC">
Some RDF Description describing the person
</>


STORING THIS PROFILE:
First, the profile is stored locally in your RDF store, e.g. Jena or
Joseki. Second, the primary key of the profile, the URI, is also stored
in the DHT:

$AABB1234 = HASH(<URI="ABC">)

As the object for this key you store a reference to your local Jena
repository, such as:

$AABB1234 -> Http://123.34.123.23/query?=qSELECT_persons_WHERE_uri="ABC"
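
The storing step could be sketched roughly as follows. This is only an
illustration under assumptions of mine: SHA-1 as the hash function, a
plain Python dict standing in for the DHT, and the helper names hash_key
and publish are all hypothetical. The point is that only the key and a
query reference are published, never the RDF description itself.

```python
import hashlib


def hash_key(uri: str) -> str:
    """Hash a URI into a fixed-length hex key (SHA-1 is an assumption)."""
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


def publish(dht: dict, uri: str, query_reference: str) -> str:
    """Store key -> query reference in the DHT stand-in and return the key.

    The profile itself stays in the local RDF store; only the reference
    to a query against that store is published.
    """
    key = hash_key(uri)
    dht[key] = query_reference
    return key


# A plain dict stands in for the distributed hash table (assumption).
dht = {}
reference = 'Http://123.34.123.23/query?=qSELECT_persons_WHERE_uri="ABC"'
key = publish(dht, "ABC", reference)
```

Removing a profile from the network would then simply mean deleting the
key from the DHT; the local repository is not touched by other peers.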


LOOKING UP THIS PROFILE:
When another person initiates a lookup for URI="ABC", the hash key for
the lookup is computed first: HASH(<URI="ABC">) = $AABB1234. This key is
then looked up in the DHT (by the way, each lookup in a DHT costs
O(log N) messages, where N is the number of nodes). The object value for
this key is
Http://123.34.123.23/query?=qSELECT_persons_WHERE_uri="ABC", so the
query is routed to your local repository.
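
To make the lookup concrete, here is a toy sketch of a consistent-hashing
ring in the spirit of Chord. Everything in it is an assumption for
illustration: the class ToyRing, the node names, and SHA-1 as the hash.
It only shows which node is responsible for a key and what it returns;
it does not model the O(log N) routing between nodes (finger tables and
so on).

```python
import bisect
import hashlib


def ring_id(text: str) -> int:
    """Map a string to a position on the ring (SHA-1 is an assumption)."""
    return int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Toy DHT: each key lives on the first node whose id is >= the key."""

    def __init__(self, node_names):
        self.nodes = sorted((ring_id(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]
        # Each node only holds key -> query reference, not the profiles.
        self.tables = {name: {} for name in node_names}

    def responsible_node(self, key: int) -> str:
        # Successor of the key on the ring, wrapping around at the end.
        index = bisect.bisect_left(self.ids, key) % len(self.ids)
        return self.nodes[index][1]

    def put(self, uri: str, reference: str) -> None:
        key = ring_id(uri)
        self.tables[self.responsible_node(key)][key] = reference

    def get(self, uri: str):
        key = ring_id(uri)
        return self.tables[self.responsible_node(key)].get(key)


ring = ToyRing(["node-a", "node-b", "node-c"])
reference = 'Http://123.34.123.23/query?=qSELECT_persons_WHERE_uri="ABC"'
ring.put("ABC", reference)

found = ring.get("ABC")    # the stored query reference
missing = ring.get("XYZ")  # None: nothing was published for this URI
```

The client would then send the actual query to the address in the
returned reference, i.e. to the repository that owns the profile.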

This approach has two advantages. First, each repository of people is
autonomous from the other repositories: only the administrator of the
local repository decides which profiles are inserted, and changes to
person profiles affect only the local repository. Second, only the keys
of the repository are published in the 'global' DHT.

Alex


Phil Dawes wrote:

> Hi All,
>
> Apologies if this is a FAQ.
>
> At work we have an ldap directory containing a few 10000s of employee
> entries. I've recently installed a servlet to provide rdf foaf
> descriptions of these people on demand.  The URI of each person is
> currently of the form e.g. http://example.com/2004/01/people#dawesp
>
> I prefer hash URIs on the basis that they look nicer to me. However,
> I'd like to provide a description lookup service at the end of the
> URI.
>
> I can see how to do this with URIQA MGET. I can also see how to do
> this easily with HTTP GET on slash uris - a simple "303 see other" to
> the servlet to generate the appropriate RDF description.
>
> But with HTTP GET on hash URIs (if I understand correctly) the hash
> doesn't necessarily get to the server, so what's the best way to do
> this?  I don't really want to serve a page with everyone on it
> (~gigabyte of data)
> Does this prohibit hash uris for this type of service?
>
> Any help would be much appreciated,
>
> Many thanks,
>
> Phil



--
___________________________________________________________

  M.Sc., Dipl. Wi.-Inf. Alexander Löser
  Technische Universitaet Berlin Fakultaet IV - CIS
  bmb+f-Projekt: "New Economy, Neue Medien in der Bildung"
  hp: http://cis.cs.tu-berlin.de/~aloeser/
  office: +49- 30-314-25551
  fax   : +49- 30-314-21601
___________________________________________________________
Received on Friday, 30 January 2004 06:56:56 GMT
