- From: Dave Beckett <dave@dajobe.org>
- Date: Sun, 05 Feb 2006 15:14:52 -0800
- To: Garrett Wollman <wollman+semantic-web@bimajority.org>
- CC: semantic-web@w3.org
Garrett Wollman wrote:
> In my continuing project to develop a search facility for my photo
> galleries using semweb technology, I've been having great difficulty
> finding a query mechanism that can answer simple queries about a small
> database in a reasonable length of time (i.e., seconds, not
> dekaseconds). I have a small store of some 27,800 triples, containing
> depiction information about my photo galleries. I'm trying to compute
> something similar to the following SPARQL query (but with more detail
> about each photo):
>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX photo: <http://www.holygoat.co.uk/owl/2005/05/photo/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?photo, ?name
> WHERE
> {
>   ?photo rdf:type photo:ImageFile ;
>          foaf:depicts ?b .
>   ?b rdf:type foaf:Person ;
>      foaf:name ?name
> }
>
> Executing this query on a Redland "hashes" triple store takes at least
> five CPU-minutes (that's the point at which I interrupted it) using
> rdfproc(1). Strangely, executing it against pre-serialized RDF takes
> only 51 CPU-seconds using roqet(1). Doing a similar query on a 50%
> faster machine using cwm's "--strings" option takes about the same
> time.
>
> I see these demo pages on the Web and they don't take that long to
> compute a very similar query on much larger databases. What are they
> doing that I'm not?

The web demo uses rasqal 0.9.11 with redland on a memory-based store
(with no indexing), so it's unlikely to be that.

Five-minute queries usually mean something went wrong, and as you don't
give the full query, I'm not clear what it could be. One possibility
(which redland/rasqal doesn't test for yet) is that the triple patterns
of the query don't connect up (they form two separate graphs), so it
scans the entire store multiple times in an attempt to get the answer.
[If it was SQL, it would be a join where none of the variables are
shared between tables.]

Also, DISTINCT had bug fixes and improvements in rasqal 0.9.11, so I
assume you are using that.

If you want to give more info, please send the full query & data and/or
use the issue tracker at http://bugs.librdf.org/

Dave
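
To illustrate the "triple patterns don't connect up" possibility, here is a minimal Python sketch of how such a check could look. It is not part of redland or rasqal; the function name `connected_components` and the tuple representation of triple patterns are assumptions made for this example. It groups patterns that share a variable, so more than one group means the query would behave like a join with no shared columns.

```python
def connected_components(patterns):
    """Group triple patterns that share at least one variable.

    Each pattern is a (subject, predicate, object) tuple of strings;
    terms starting with '?' are treated as variables. A pattern with
    no variables ends up in a group of its own.
    """
    parent = list(range(len(patterns)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # variable name -> index of first pattern using it
    for i, pattern in enumerate(patterns):
        for term in pattern:
            if term.startswith("?"):
                if term in first_seen:
                    # Merge this pattern with the one that used the variable first.
                    parent[find(i)] = find(first_seen[term])
                else:
                    first_seen[term] = i

    groups = {}
    for i, pattern in enumerate(patterns):
        groups.setdefault(find(i), []).append(pattern)
    return list(groups.values())


# The query from the message: every pattern is linked through ?photo or ?b.
connected = [
    ("?photo", "rdf:type", "photo:ImageFile"),
    ("?photo", "foaf:depicts", "?b"),
    ("?b", "rdf:type", "foaf:Person"),
    ("?b", "foaf:name", "?name"),
]

# Hypothetical variant: ?b mistyped as ?c, so the two halves no longer join.
disconnected = [
    ("?photo", "rdf:type", "photo:ImageFile"),
    ("?photo", "foaf:depicts", "?b"),
    ("?c", "rdf:type", "foaf:Person"),
    ("?c", "foaf:name", "?name"),
]

print(len(connected_components(connected)))     # 1
print(len(connected_components(disconnected)))  # 2
```

A result greater than 1 would suggest the query is computing a cross product between unrelated parts of the store, which is one way a small store could still produce a very long run time.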
Received on Sunday, 5 February 2006 23:15:03 UTC