Three different semantics

Hi all,

The current specification is contradictory. It states that we have to use substitution and also
that sub-selects are logically evaluated first. Moreover, substitution cannot be applied where it
would break the semantics.
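
To make the conflict concrete, here is a small toy sketch in Python. It is only my illustration, not the definitions from the report and not how any engine is implemented: the graph model, the function names, and the example data are all assumptions. It evaluates an EXISTS whose pattern is a sub-select that projects away ?x, once by evaluating the sub-select first and checking compatibility, and once by substituting the outer binding into the pattern.

```python
# Toy model (all names hypothetical): a graph is a set of triples,
# a solution mapping is a dict, variables are strings starting with "?".
#
# Query being modelled:
#   FILTER EXISTS { SELECT ?y WHERE { ?x :p ?y } }
# evaluated against the outer solution mapping {?x -> a}.

def match_triple(pattern, graph):
    """Return every mapping that matches one triple pattern in the graph."""
    results = []
    for triple in graph:
        mu = {}
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if mu.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            results.append(mu)
    return results

def subselect(pattern, projected, graph):
    """Evaluate the inner pattern, then keep only the projected variables."""
    return [{v: mu[v] for v in projected if v in mu}
            for mu in match_triple(pattern, graph)]

def compatible(mu1, mu2):
    """Two mappings are compatible if they agree on their shared variables."""
    return all(mu2[v] == val for v, val in mu1.items() if v in mu2)

def substitute(pattern, mu):
    """Replace each variable bound in mu by its value."""
    return tuple(mu.get(term, term) for term in pattern)

def exists_no_substitution(mu, pattern, projected, graph):
    # Sub-select is evaluated first; the outer mapping is only compared afterwards.
    return any(compatible(mu, inner)
               for inner in subselect(pattern, projected, graph))

def exists_with_substitution(mu, pattern, projected, graph):
    # The outer binding is pushed into the pattern, even below the projection.
    return len(subselect(substitute(pattern, mu), projected, graph)) > 0

graph = {("b", ":p", "c")}
mu = {"?x": "a"}
pattern = ("?x", ":p", "?y")
projected = ["?y"]

print(exists_no_substitution(mu, pattern, projected, graph))    # True
print(exists_with_substitution(mu, pattern, projected, graph))  # False
```

Because ?x is hidden by the projection, the first reading finds a compatible inner solution while the second finds none, so the two requirements cannot both hold for this query.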

To solve these issues we need a clear and well-defined semantics for the evaluation of a graph
pattern or expression against a solution mapping. Currently, we are focusing on the semantics of
EXISTS, which in my opinion is a paradigmatic case of evaluation against a solution mapping.

I have submitted a second version of "Correlation and Substitution in SPARQL" [1], a technical
report in which we address this problem. This new version fixes several errors in the previous one
and proposes three alternative semantics for correlation and substitution.

At the end of this report we show ten queries to exemplify how each alternative semantics works.
We then ran these queries against four implementations: Blazegraph, Fuseki, Virtuoso and rdf4j
(formerly Sesame). We summarize these comparisons in a table at the end of the section.

This experiment shows us that:

1. Blazegraph and Fuseki are aligned with the semantics that does not apply substitution.
Blazegraph matches the semantics that we called S1 in every experiment. Fuseki differs in
one experiment, but I guess that the mismatch was produced by a bug.

2. rdf4j substitutes every variable that is not in the domain of the graph pattern. It matches
the semantics that we called S3 in all the experiments.

3. Virtuoso applies substitution in several cases, but it does not match any of the semantics
that we proposed. This does not mean that we believe that Virtuoso is wrong, only that
we have not yet understood which semantics explains Virtuoso's behavior.

I hope that our report will contribute to the understanding of the alternative semantics and
of the assumptions made by implementers.

Daniel

[1]: http://arxiv.org/abs/1606.01441v2

Received on Wednesday, 13 July 2016 01:44:22 UTC