- From: Sandro Hawke <sandro@w3.org>
- Date: Sun, 10 May 2009 10:08:03 -0400
- To: Bijan Parsia <bparsia@cs.manchester.ac.uk>
- cc: RIF WG <public-rif-wg@w3.org>
> On 9 May 2009, at 13:40, Sandro Hawke wrote: > [snip] > > Right. The interesting/challening part, I believe, is that Gary wants > > to translate frames to Java objects. Doing so will require some > > cleverness, since there are significant semantics/functionality > > differences, but hopefully will give significant performance gains. > > Ternary predicates are typically not super fast at matching (a,b,?). > > Really? I don't understand. You seem to be disagreeing with my claim, but then, below, you seem to be support it. I'm trying to claim that a naive system which just uses an ordinary rdf/3 predicate is not likely to be able to list all the object for a given subject+predicate very quickly (compared to a less-naive implementation.) > > (For example, SWI-Prolog uses a specially indexed structure for RDF > > triples/quads, because normal predicate indexing is too slow.) > > I wouldn't think that's the issue. Predicate indexing is typically one > of the more optimized part of a Prolog engine (for obvious reasons). A > uniform ternary predicate *defeats* predicate indexing because there's > only one predicate. By default, IIRC and it hasn't changed, most > Prolog engines do predicate plus first argument indexing, though you > can change that, e.g.,: > http://www.lix.polytechnique.fr/~catuscia/teaching/prolog/Manual/sec-3. > 11.html#sec > :3.11.1 > http://www.lix.polytechnique.fr/~catuscia/teaching/prolog/Manual/sec-3. > 12.html#index/ I can't find a web page describing it, but from what I recall, the SWI Prolog RDF store (which is really a quad store) has about 8 hash indexes, convering different combinations of subject, predicate, object, and graph. So you get hash-table performance asking for all the objects for given values of subject, predicate, graph; and you also get hash-table performance asking for all the predicates and graphs for a given subject and object combination, etc. (The first example there seems like the most common case; the second example among the least common.) Obviously there are penalties in space and in insertion time. (As I vaguely recall, there's a reasonable amortized cost on deletion, done in a garbage collection style.) I tend to think having a hash table per object/frame gives good enough performance, but it is cool to be able to do various other queries quickly as well. -- Sandro
Received on Sunday, 10 May 2009 14:08:12 UTC