- From: Sandro Hawke <sandro@w3.org>
- Date: Sun, 10 May 2009 10:08:03 -0400
- To: Bijan Parsia <bparsia@cs.manchester.ac.uk>
- cc: RIF WG <public-rif-wg@w3.org>
> On 9 May 2009, at 13:40, Sandro Hawke wrote:
> [snip]
> > Right. The interesting/challening part, I believe, is that Gary wants
> > to translate frames to Java objects. Doing so will require some
> > cleverness, since there are significant semantics/functionality
> > differences, but hopefully will give significant performance gains.
> > Ternary predicates are typically not super fast at matching (a,b,?).
>
> Really?
I don't understand. You seem to be disagreeing with my claim, but then,
below, you seem to be support it. I'm trying to claim that a naive
system which just uses an ordinary rdf/3 predicate is not likely to be
able to list all the object for a given subject+predicate very quickly
(compared to a less-naive implementation.)
> > (For example, SWI-Prolog uses a specially indexed structure for RDF
> > triples/quads, because normal predicate indexing is too slow.)
>
> I wouldn't think that's the issue. Predicate indexing is typically one
> of the more optimized part of a Prolog engine (for obvious reasons). A
> uniform ternary predicate *defeats* predicate indexing because there's
> only one predicate. By default, IIRC and it hasn't changed, most
> Prolog engines do predicate plus first argument indexing, though you
> can change that, e.g.,:
> http://www.lix.polytechnique.fr/~catuscia/teaching/prolog/Manual/sec-3.
> 11.html#sec
> :3.11.1
> http://www.lix.polytechnique.fr/~catuscia/teaching/prolog/Manual/sec-3.
> 12.html#index/
I can't find a web page describing it, but from what I recall, the SWI
Prolog RDF store (which is really a quad store) has about 8 hash
indexes, convering different combinations of subject, predicate, object,
and graph. So you get hash-table performance asking for all the objects
for given values of subject, predicate, graph; and you also get
hash-table performance asking for all the predicates and graphs for a
given subject and object combination, etc. (The first example there
seems like the most common case; the second example among the least
common.) Obviously there are penalties in space and in insertion time.
(As I vaguely recall, there's a reasonable amortized cost on deletion,
done in a garbage collection style.)
I tend to think having a hash table per object/frame gives good enough
performance, but it is cool to be able to do various other queries
quickly as well.
-- Sandro
Received on Sunday, 10 May 2009 14:08:12 UTC