- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 20 Oct 2006 14:18:02 +0100
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
This message is an outline for an operational definition of the SPARQL algebra. There is advocacy for this change in the working group; this message is to check that with everyone; to get early feedback, and to check we are agreed on what it means. The style of approach isn't new - Fred's work embodies this style as does the Chilean paper. This is not about syntax at this stage. == Preliminaries BGP matching: this is taken as an input and we assume that any BGP matches the target graph and generates solutions with every mentioned named query variable having a binding. We continue to allows for different entailment regimes and specifically use with OWL-DL. There is no assumptions about blank nodes and mappings (which only affects duplicates). The output of a evaluating a BGP is a collection of solutions. == Outline The SPARQL algebra consists of the following operators for manipulating collections of solutions (type R): Join(R,R) -> R LeftJoin(R,R) -> R Union(R,R) -> R filter(R) -> R I have no particular attachment to the choice of operator names. In particular, calling it "optional", not "left join" might be better, Join and left join need to be defined for the SPARQL case because they are not exactly as defined for SQL or in a relational algebra. Evaluation of an expression is functional (AKA bottom-up; constructional (Fred's work [1] and as worked out by Jorge et al [2]). UNION is union Filters go at the end of groups conceptually (i.e. after compositional operations). OPTIONAL is left-join. A group is a sequence of the join/left-join the filters of its elements. Elements are already evaluated. Evaluation in a group is ordered left to right (left-nested; left-associative). Using {} will change the order of evaluation if required. == Some interesting queries: Some interesting queries to consider: not problems or show stoppers, just interesting. {} By pure construction, this is could be considered to be empty (no construction). So inserting it into a pattern would be an operation on an empty solution collection resulting in an empty solution. Currently, we define it to have one solution with no bindings - we should continue to do so. { OPTIONAL {...} } evaluates as { {} OPTIONAL {...} } The principle is that every group has at least the unit table in it or that every group has the unit table on the left hand side. { FILTER(expr) } evaluates as { {} FILTER(expr) } which means that { FILTER(expr) } does not propagate the filter effect. This is a change from the declarative approach (top-down) where any pattern applies the whole solution being considered, not the part being constructed. { pattern[?x] OPTIONAL { pattern[!?x] OPTIONAL { pattern[?x] } } [?x] means mentions ?x, [!?x] means does not mention ?x. Like the filter case above, this is a change from the declarative approach. Discussed in Jorge's paper and (for example) [3]. { { P1 FILTER(expr) } P2 } evaluates the filter before P2. Might matter if filter has a !bound() in it or relies on a bound variable, and P2 mentions that variable and p1 does not. == Notes The domain of solutions naturally emerges (it happens as a result of the definitions) because the constructional nature. No need for Chilean proposal 4 for additional restrictions - which I feel is too strong anyway (example below): SELECT * { ?x rdf:type skos:Concept . OPTIONAL { ?x skos:prefLabel ?label } OPTIONAL { ?x skos:altLabel ?label } } which gets labels in preferred order. Andy [1] http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0033 [2] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2006Jun/0008 [3] http://tech.groups.yahoo.com/group/jena-dev/message/25717?var=0&l=1
Received on Friday, 20 October 2006 13:18:48 UTC