- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Mon, 11 Oct 2004 09:44:50 +0100
- To: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
- CC: DAWG public list <public-rdf-dawg@w3.org>
One of the use cases we have been discussing is the property disjunction case. When used multiple times in the query, the rewritten form gets long and the transformation by a human will be error prone. I advocate making it clear for the application writer. I strongly prefer such burden is not put on application writers unnecesarily. Unusual ways to express what the application writer intended increases support costs and can make some implementations harder. Consider: SELECT * WHERE (?x dc10:title ?title) OR (?x dc11:title ?title) (?x dc10:creator ?author) OR (?x dc11:creator ?author) Suppose this is a library with its own accession system and the query is doing the simple thing of accession number to title and author SELECT ?title ?author WHERE (?x lib:accession ?n) (?n lib:number "123.456.789") { (?x dc10:title ?title) OR (?x dc11:title ?title) } { (?x dc10:creator ?author) OR (?x dc11:creator ?author) } This is not a long query - the extraction parts of queries can be many fields and I have seen queries a page long. This is going to turn into, maybe: SELECT ?title ?author WHERE (?x lib:accession ?n0) (?n0 lib:number "123.456.789") (?x dc10:title ?title) (?x dc10:creator ?author) UNION (?x lib:accession ?n1) (?n1 lib:number "123.456.789") (?x dc11:title ?title) (?x dc10:creator ?author) UNION (?x lib:accession ?n2) (?n2 lib:number "123.456.789") (?x dc11:title ?title) (?x dc10:creator ?author) UNION (?x lib:accession ?n3) (?n3 lib:number "123.456.789") (?x dc11:title ?title) (?x dc11:creator ?author) This rewrite is a simple algorithm for a machine but error prone for application writers. There is a repeated part each time. It is a simple matter for an implementation to the rewrites, but not for the human application writer. And there are alternatives: if an implmentation spots that ?x is expected to be defined once or just a few times (e.g. lib:number is an IFP) an execution form of (?x ?p ?title) AND ?p == dc10:title | ?p == dc11:title might be preferrable. Recovering the underlying structure can harder to implement. Finding a repeated part is harder - compiler optimization algorithms should find this but it is less trivial than the initial rewrite. Early implementations or ones aiming for a small footprint may be quite naive in executing such a query - later or ones targeted at high performance may wish to adopt different approaches based on query and data. > The query evaluator can simply execute each match in turn, or it may > optimise it if it wishes to get more performance. It is a balance here - the nested form can be turned into a sequence of SQL expressions if the implementation so desires. That is not a hard implementation. The converse is not true - turning the exapanded form into alternative executiuon forms is not easy. Andy Steve Harris wrote: > Another systatic alternative to the inline disjunctive exressions I > dislike so much :) is UNION queries, which have been discussed a little, > as "multiple queries per request", "multiple WHEREs" or something similar. > > So > > SELECT ?name > WHERE { (?x foo:name ?name) } > OR { (?x bar:hasName ?name) } > > could be writen as: > > SELECT ?name > WHERE (?x foo:name ?name) > UNION > WHERE (?x bar:hasName ?name) > > or some other equivalent (you wany want to repeat the SELECT). > > This has the advantage over OR that it doesnt explode the query > complexity, and doesnt have such complicated interactions with OPTIONAL, > as every allowed combination of disjunctive expressions has to be spellt > out. The query evaluator can simply execute each match in turn, or it may > optimise it if it wishes to get more performance. > > The downside is that it will make the queries more verbose if you have a > complicated disjunctive expression. > > - Steve >
Received on Monday, 11 October 2004 08:45:24 UTC