- From: Leo Sauermann <leo@gnowsis.com>
- Date: Wed, 22 Oct 2003 18:40:44 +0200
- To: "'Sandro Hawke'" <sandro@w3.org>, <www-rdf-interest@w3.org>
Actually, I am working on distributed queries: distributed on a single host,
so to speak, but the idea is the same for N hosts.

The principle I chose for my approach is that the host that has some metadata
about a URI can be identified by parsing the URI itself. For example, if my
URI is http://leo.gnowsis.com/~user/leo, I assume that the host is
leo.gnowsis.com. The host leo.gnowsis.com is then contacted via a new
protocol (e.g. URIQA or Joseki) and PARSES the URL; HTTP itself is not used.
After parsing, the host knows which application/database to ask for triples
about the resource.

This approach is like an Apache server with modules that handle .php or .aspx
includes, or a Jetty servlet server with a web.xml file where servlets are
registered to handle certain URL patterns. This is exactly what I transferred
to the gnowsis system (www.gnowsis.com). For example, there is an adapter
that can handle MP3 file metadata: when I want to know something about
file://leo.gnowsis.com/media/songs/u2-one.mp3, the server finds that the URL
is on localhost, has the "file" scheme, and ends with ".mp3", and therefore
passes the metadata request on to the MP3 metadata adapter.

Voilà: with this approach I have shown how to build a distributed query
system very easily. And yes, you are free to program a crawler robot that
follows all links in the Semantic Web and indexes them, Google-like. This
will be published in January 2004.
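In code, the dispatch step looks roughly like this. This is only a sketch:
MetadataAdapter, AdapterRegistry, and the pattern matching are illustrative
names, not the actual gnowsis API.

    import java.net.URI;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.function.Predicate;

    /** Hypothetical adapter interface: answers a metadata request for one resource. */
    interface MetadataAdapter {
        String describe(URI resource); // a real system would return RDF triples
    }

    /** Routes metadata requests to adapters by URI pattern, like servlet mappings. */
    class AdapterRegistry {
        private final Map<Predicate<URI>, MetadataAdapter> routes = new LinkedHashMap<>();

        void register(Predicate<URI> pattern, MetadataAdapter adapter) {
            routes.put(pattern, adapter);
        }

        /** Parse the URI (no HTTP dereference) and hand it to the first matching adapter. */
        String query(URI resource) {
            for (Map.Entry<Predicate<URI>, MetadataAdapter> route : routes.entrySet()) {
                if (route.getKey().test(resource)) {
                    return route.getValue().describe(resource);
                }
            }
            throw new IllegalArgumentException("no adapter registered for " + resource);
        }

        public static void main(String[] args) {
            AdapterRegistry registry = new AdapterRegistry();
            // file:// URIs on this host that end in .mp3 go to the MP3 adapter
            registry.register(
                u -> "file".equals(u.getScheme())
                     && "leo.gnowsis.com".equals(u.getHost())
                     && u.getPath().endsWith(".mp3"),
                r -> "<rdf:RDF>... ID3 metadata for " + r + " ...</rdf:RDF>");
            System.out.println(registry.query(
                URI.create("file://leo.gnowsis.com/media/songs/u2-one.mp3")));
        }
    }

The registry is checked in registration order and the first matching pattern
wins, just like servlet URL mappings in web.xml.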
greetings
Leo Sauermann
www.gnowsis.com

> -----Original Message-----
> From: www-rdf-interest-request@w3.org
> [mailto:www-rdf-interest-request@w3.org] On Behalf Of Sandro Hawke
> Sent: Wednesday, October 22, 2003 3:48 PM
> To: www-rdf-interest@w3.org
> Subject: Distributed Query (was RDFStyles)
>
> > One more aspect of RDF that I notice is often forgotten is that it's
> > supposed to be distributed (see more below).
>
> I'm not sure it's "supposed to" be distributed. That's how you and I
> want to use it, since we want to use it for the Semantic Web. Is that
> all RDF is good for? Maybe.... Should there be separate RDF and
> Semantic Web lists? Is the fact that the W3C "Metadata Activity" is
> gone and RDF is now part of the W3C "Semantic Web Activity" sufficient
> to prove RDF is really just for the Semantic Web?
>
> Anyway, I don't think it's accidentally forgotten, just postponed
> while people try to figure out the presumably simpler local issues.
>
> > My favorite screw in need of a screwdriver is RDF query (as opposed
> > to RDF transformation): since RDF is really distributed, you are not
> > supposed to be able to process the whole problem domain in-memory
> > and on a single host; you're rather supposed to _query_ different
> > remote knowledge bases and process _results_ of these queries.
> >
> > Fetching the whole of WordNet, Wikipedia, and DMoz and running an
> > XSLT transform on the combined result doesn't fit into the original
> > vision of the Semantic Web as I understand it.
>
> I think this is a hard but wonderful problem. Each RDF document has
> lots of URIs you can use as links to find more information. Most
> queries you use will also contain URIs you could use. There are two
> problems: (1) if you follow them all, recursively, you might soon end
> up with a billion pages [this is the "performance" question], and (2)
> not all of the information will be true [the "trust" question].
> There's some talk of this on the esw wiki under "Follow Links For More
> Information" [1]; I encourage you to contribute.
>
> I was recently exploring this in the context of my OWL Test Results
> page [2], trying to express in RDF which links the report generator
> should follow [3]. The idea is that one CAN follow any link, but
> metadata about what you'll find if you do will save you a lot of work.
> The metadata that struck me as useful was: what are the classes of the
> things named there, and what are the properties used in the statements
> there. (Use the most specific subclass and subproperty you know to be
> true. Assume folks will follow links to the ontology, so they'll know
> this.) I constructed that file (start.rdf) by hand, but I'd expect it
> to be constructed by one agent to save all the other agents a lot of
> work, kind of like how Google saves each of us from having to read
> 3,307,998,701 web pages ourselves.
>
> Meanwhile, I consider the trust issue completely orthogonal. I hope
> to present all fetched results to users along with justification
> information, which shows both which sources were used and what kind
> of reasoning was used (à la Inference Web [4]). If an actual
> contradiction is detected, I expect to make some sort of
> truth-maintenance decision and discard one source, with a warning to
> the user that the truth-maintenance decision was just a guess.
>
> -- sandro
>
> [1] http://esw.w3.org/topic/FollowLinksForMoreInformation
> [2] http://www.w3.org/2003/08/owl-systems/test-results-out
> [3] http://www.w3.org/2003/08/owl-systems/start.rdf
> [4] http://www.ksl.stanford.edu/software/IW/
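Sandro's bounded link-following idea can also be sketched in a few lines.
Again only a sketch, under the assumption that a start.rdf-style summary is
available for each target; LinkSource and worthFollowing are hypothetical
stand-ins, not a real API.

    import java.net.URI;
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import java.util.function.Predicate;

    /** Hypothetical: fetch one RDF document and list the URIs it links to. */
    interface LinkSource {
        List<URI> linksIn(URI document);
    }

    class BoundedFollower {
        /**
         * Breadth-first link following, bounded two ways: a hard page budget
         * (the "performance" question) and a hint predicate that says whether
         * a target is worth fetching at all, e.g. because a start.rdf-style
         * summary advertises classes/properties relevant to the query.
         */
        static Set<URI> crawl(URI start, LinkSource fetch,
                              Predicate<URI> worthFollowing, int budget) {
            Set<URI> visited = new HashSet<>();
            Deque<URI> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty() && visited.size() < budget) {
                URI doc = queue.poll();
                if (!visited.add(doc)) {
                    continue; // already fetched this document
                }
                for (URI link : fetch.linksIn(doc)) {
                    if (!visited.contains(link) && worthFollowing.test(link)) {
                        queue.add(link);
                    }
                }
            }
            return visited;
        }
    }

A caller would plug in a real fetcher for linksIn and a predicate that
consults the advertised classes and properties, so the page budget is spent
only on documents that can actually help answer the query.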
Received on Wednesday, 22 October 2003 12:35:45 UTC