FYI: SPARQL Algebra, Reductions, Forms and Mappings for Implementations from Chimezie Ogbuji on 2007-04-01 (public-sparql-dev@w3.org from April to June 2007)

From: Chimezie Ogbuji <chimezie@gmail.com>
Date: Sun, 1 Apr 2007 16:45:35 -0400
To: jperez@ing.puc.cl
Cc: dev@rdflib.net, "Ivan Herman" <ivan@ivan-herman.net>, public-sparql-dev@w3.org
Message-ID: <f6ec8dcb0704011345m219b8848k5dbf3d8e8ecd117@mail.gmail.com>

Jorge.

I've been gearing up to an attempt at implementing the Compositional
SPARQL semantics expressed in both the 'Semantics of SPARQL' and
'Semantics and Complexity of SPARQL' papers with the goal of reusing
existing sparql-p which already implements much of the evaluation
semantics.  Some intermediate goals are were neccessary for the first
attempt at such a design [1]:

- Incorporate rewrite rules outlined in the current DAWG SPARQL WD
- Incorporate reduction to Disjunctive Normal Form outlined in
'Semantics and Complexity of SPARQL'
- Formalize a mapping *from* the DAWG algebra notation to that
outlined in 'Semantics of SPARQL'
- Formalize a mapping *from* the compositional semantics to sparql-p methods

In attempting to formalize the above mappings I noticed some
interesting parallels that I thought you and Ivan might be interested
in (given the amount independent, effort that was put into both the
formal semantics and the implementations).  In particular

The proposed disjunctive normal form of SPARQL patterns coincides
directly with the 'query' API of sparql-p [2] which essentially
implements evaluation of SPARQL patterns of the form:

(P1 UNION P2 UNION .... UNION PN) OPT A) OPT B) ... OPT C)

I.e., DNF extended with OPTIONAL patterns.

In addition, I had suggested [3] to the DAWG that they consider
formalizing a function symbol which relates a set of triples to the
IRIs of the graphs in which they are contained.  As Richard Newman
points out, this is implemented [4] by most RDF stores and in RDFLib
in particular by the ConjunctiveGraph.contexts method:

contexts((s,p,o)) -> {uri1,uri2,...}

I had asked their thoughts on performance impact on evaluating GRAPH
patterns declaratively instead of imperatively (the way they are
defined in both the DAWG semantics and the Jorge P. et. al papers) and
I'm curious on your thoughts on this as well.

Finally, an attempt at a formal mapping from DAWG algebra evaluation
operators to the operators outlined in the Jorge P.et. al papers is
below:

merge(μ1,μ2)             = μ1 ∪ μ2
Join(Omega1,Omega2)      = Filter(R,Omega1 ⋉ Omega2)
Filter(R,Omega)          = [[(P FILTER R)]](D,G)
Diff(Omega1,Omega2,R)    = (Omega1 \ Omega2) ∪ {μ | μ in Omega1 ⋉
Omega2 and *not* μ |= R}
Union(Omega1,Omega2)     = Omega1 ∪ Omega2

Your comments would be greatly appreciated as valuable feedback in
trying to reconcile the two (somewhat) separate formal semantics with
*actual* implementations sparql-p and RDFLib.

[1] https://svn.rdflib.net/trunk/rdflib/sparql/bison/CompositionalEvaluation.py
[2] http://dev.w3.org/cvsweb/~checkout~/2004/PythonLib-IH/Doc/Attic/pythondoc-sparql.html?rev=1.5&content-type=text/html;%20charset=iso-8859-1#sparql.SPARQL.query-method
[3] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2007Mar/0000.html
[4] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2007Mar/0001.html

- Chimezie Ogbuji

Received on Sunday, 1 April 2007 20:45:39 UTC