- From: Paul Gearon <gearon@ieee.org>
- Date: Thu, 29 Apr 2010 12:47:44 -0400
- To: Parsa Ghaffari <parsa.ghaffari@gmail.com>
- Cc: public-sparql-dev@w3.org
On Thu, Apr 29, 2010 at 12:22 PM, Parsa Ghaffari <parsa.ghaffari@gmail.com> wrote: > Hi there, > How different are these queries in terms of performance: > 1) SELECT * WHERE { ?x foaf:name "John Smith"@en } > > 2) SELECT * WHERE { ?x foaf:name ?y. FILTER regex(?y,"^John Smith") } This will depend on the exact implementation, but 1 will be faster. Most databases will index by object. This means that (1) will usually be a simple index lookup. For a standard system with tree-based indexes, this will have logarithmic complexity (O(log(N)). For in-memory systems with hashed indexes, this will be constant (O(1)). (2) will require a lookup (as in (1)), followed by a filter operation. This will require N operations (linear or O(N)) multiplied by the complexity of the regex (which can be ignored for complexity since it is simple, but will have some real cost). The overall complexity is linear: O(N). This is much worse than for (1). It *is* possible for a clever implementation to see that the filter is a very simple operation that can be duplicated with a lookup in an index that sorts lexically. So if your store is the right kind, and you have a clever optimizer, it can be just as efficient as for (1). In reality, I don't believe any optimizers are quite this clever (someone, please tell me if I'm wrong here!) so you won't that level of performance. There are databases that offer alternatives to (2), which let you specify that you're searching lexically through an index, thereby doing an equivalent operation to (2), while offering the performance of (1). However, that is highly implementation dependent, and the mechanism for saying what you want in the query is not standardized. Regards, Paul Gearon
Received on Thursday, 29 April 2010 16:48:18 UTC