Re: Example PHP to check the results between 2 trees from Andy Seaborne on 2014-08-27 (public-sparql-dev@w3.org from July to September 2014)

From: Andy Seaborne <andy@apache.org>
Date: Wed, 27 Aug 2014 20:51:52 +0100
To: public-sparql-dev@w3.org
Message-ID: <53FE36D8.40409@apache.org>
On 26/08/14 15:27, Karima Rafes wrote:
> Hello
>
> I am trying to benchmark the interoperability between triplestores.
> I have a problem with the tests without "order by" or with
> "select *", etc. My code is in PHP. I am looking for example code to
> compare 2 trees. For example, the result is correct but the test fails:
> http://sparqlscore.com/ajax_testResultNode.php?graph=https%3A%2F%2Fci.inria.fr%2Fgo3%2Fjob%2FFuseki%2F33%2F&node=http%3A%2F%2Fwww.w3.org%2F2009%2Fsparql%2Fdocs%2Ftests%2Fdata-sparql11%2Faggregates%2Fmanifest%23agg-max-02%2FResponse
>
> NB: for the moment, the site is a beta version... http://sparqlscore.com/
> There are a lot of fixes to do. If you want to help the project, you can
> check/fork it on github:
> https://github.com/BorderCloud/TFT
> And you can also add new tests in the project :
> https://github.com/BorderCloud/TFT-tests
>
> Thanks
> Karima


Hi there,

The official implementation report for SPARQL 1.1 that was prepared at 
the end of the standards process is at:
http://www.w3.org/2009/sparql/implementations/

Any of these systems may have useful code for you.

== Result set comparison

To test whether 2 unordered result sets are equivalent there are two 
things needed:

1/ Cope with the rows out of order
2/ Map blank nodes consistently.

There does not seem to be a need to be very efficient here because all 
the result sets to compare are small.

The first can be done by:

A/ Check the variable lists in the headers are the same.
B/ Check the result sets have the same number of rows.
C/ Pick a row from the test results, scan the expected results
    to find a row with the same variable/value bindings, and remove
    it from the expected results. If all test rows match, then
    (because of B) the result sets have the same rows.
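
Steps A-C can be sketched roughly as follows. This is in Python rather 
than PHP, and it is only an illustration: the function name same_rows 
and the representation of a row as a dict from variable name to term 
string are assumptions made for the example. It compares the header 
variables as sets, which is also an assumption, and it does not handle 
blank nodes.

```python
from typing import Dict, List, Sequence

# Assumed representation: one row = {variable name: term as a string}.
Row = Dict[str, str]


def same_rows(actual_vars: Sequence[str], actual_rows: List[Row],
              expected_vars: Sequence[str], expected_rows: List[Row]) -> bool:
    """Compare two unordered result sets that contain no blank nodes."""
    # A/ The header variables must be the same (order ignored here).
    if set(actual_vars) != set(expected_vars):
        return False
    # B/ Both result sets must have the same number of rows.
    if len(actual_rows) != len(expected_rows):
        return False
    # C/ For each test row, find an equal expected row and remove it,
    #    so that duplicate rows are counted correctly.
    remaining = list(expected_rows)
    for row in actual_rows:
        try:
            remaining.remove(row)  # dict equality compares all bindings
        except ValueError:
            return False           # no matching expected row is left
    return True
```

The scan is quadratic in the number of rows, which should be acceptable 
given how small the test result sets are.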

The second part, blank nodes, requires finding a mapping from the labels 
in the test result to the labels in the expected results.

If the rows were in order, this would be easy but if they are not, then 
the test is more complicated.

Comparing things with blank nodes in them needs some kind of 
backtracking in the general case: the code has a number of possible 
ways to match a blank node and needs to try each one to see if it will 
work for the rest of the testing.
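
A minimal backtracking sketch of this idea, again in Python and under 
assumptions of my own: rows are dicts from variable name to term string, 
blank nodes are the terms starting with "_:", and the names is_bnode, 
match_row and equivalent are made up for the example. It is not taken 
from any existing implementation.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]        # assumed: variable name -> term string
Mapping = Dict[str, str]    # test bnode label -> expected bnode label


def is_bnode(term: str) -> bool:
    # Assumption: blank node labels are written "_:label".
    return term.startswith("_:")


def match_row(row: Row, cand: Row, mapping: Mapping) -> Optional[Mapping]:
    """Return the mapping extended so that row matches cand, or None."""
    if row.keys() != cand.keys():
        return None
    new = dict(mapping)     # copy, so a failed attempt leaves no trace
    for var, term in row.items():
        other = cand[var]
        if is_bnode(term) and is_bnode(other):
            if term in new:
                if new[term] != other:
                    return None      # label already mapped elsewhere
            elif other in new.values():
                return None          # keep the mapping one-to-one
            else:
                new[term] = other
        elif term != other:
            return None
    return new


def equivalent(actual: List[Row], expected: List[Row],
               mapping: Optional[Mapping] = None) -> bool:
    """Try every expected row for the first test row, then recurse."""
    mapping = {} if mapping is None else mapping
    if len(actual) != len(expected):
        return False
    if not actual:
        return True
    first, rest = actual[0], actual[1:]
    for i, cand in enumerate(expected):
        new = match_row(first, cand, mapping)
        if new is not None and equivalent(
                rest, expected[:i] + expected[i + 1:], new):
            return True
        # Otherwise backtrack and try the next candidate row.
    return False
```

The worst case is exponential, but for result sets the size of those in 
the test suites that is unlikely to matter.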

This full backtracking seems to be needed for the SPARQL 1.1 test 
suite.  I found possibly one test in the SPARQL 1.1 test suite (test 
"BNODE(Str)" in the "functions" set), and one in the SPARQL 1.0/DAWG 
test suite (dawg-bnode-coreference).  There may be others that my code 
investigation did not show up.

Some of the oldest SPARQL 1.0 tests are based on result sets encoded in 
RDF graphs (this was before there was an XML result format).  They need 
to be tested with graph isomorphism for bNodes.  There are various blogs 
explaining that e.g.
   http://blog.datagraph.org/2010/03/rdf-isomorphism
and a paper:
   http://www.hpl.hp.com/techreports/2001/HPL-2001-293.pdf

== SPARQL 1.1 test suite issues

Test aggregates/agg-empty-group is known to be wrong - the result does 
not have one of the required columns.  If you ignore the header variable 
declaration, it's OK.

The right answers are also there in the next test: agg-empty-group2.srx. 
  (Looks to me like the wrong test got marked up as approved.)

== DAWG/SPARQL 1.0

The DAWG tests will have 4 errors for a SPARQL 1.1 engine:

1+2/ In two places, due to aligning SPARQL and Turtle, the syntax of 
decimal numbers changed.  A trailing dot with no digit after it is a 
decimal in SPARQL 1.0, e.g. 456. , but it is not in Turtle or SPARQL 1.1.

basic/term-6 and basic/term-7
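
To illustrate the difference, here is a small Python sketch. The two 
regular expressions are my transcription of the DECIMAL token rule as I 
read it in each grammar, so check them against the specs before relying 
on them; the names DECIMAL_10, DECIMAL_11 and is_decimal are made up 
for the example.

```python
import re

# SPARQL 1.0 (as transcribed): digits '.' optional-digits | '.' digits
DECIMAL_10 = re.compile(r"[0-9]+\.[0-9]*|\.[0-9]+")
# SPARQL 1.1 (as transcribed): optional-digits '.' digits
DECIMAL_11 = re.compile(r"[0-9]*\.[0-9]+")


def is_decimal(token: str, pattern: "re.Pattern[str]") -> bool:
    """True if the whole token matches the given DECIMAL pattern."""
    return pattern.fullmatch(token) is not None


for token in ["456.", "456.0", ".5"]:
    print(token, is_decimal(token, DECIMAL_10), is_decimal(token, DECIMAL_11))
```

Under these patterns only "456." differs: it is a decimal for the 1.0 
pattern and not for the 1.1 one.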

3/ There is one case where SPARQL 1.0 generates an error for the 
datatype of a language-tagged literal and this becomes "rdf:langString" 
in SPARQL 1.1.  This is a legitimate extension point of SPARQL 1.0.

expr-builtin/dawg-datatype-2

4/ There is a pair of tests to show a known ambiguity in the SPARQL 1.0 
spec.  Either dawg-optional-filter-005-not-simplified or 
dawg-optional-filter-005-simplified must fail: same query, same data, 
different answers.  The first is the preferred reading (as noted in the 
SPARQL 1.1 spec).

     Hope that helps
     Andy
Received on Wednesday, 27 August 2014 19:52:22 UTC