Possible Bug in SPARQL 1.1 Protocol Validator

Hi All

First off, thanks for the excellent job you've all done in putting together
the SPARQL 1.1 specifications and for the comprehensive and substantial
test suite you've provided.  I look forward to SPARQL 1.1 becoming an
official W3C recommendation in the very near future.

However, I have encountered what may prove to be a bug in the SPARQL 1.1
Protocol Validator.  When I previously reported my results for this to
Gregory Williams for inclusion in the implementation report, I had 4 tests
failing.  As my implementation has a very Windows-centric environment,
it was difficult for me to debug with the test runner as is, so I ported
the problem tests to Java:
https://bitbucket.org/dotnetrdf/sparql11-protocol-validator/overview

Once I did this I found that there were some bugs in my SPARQL engine but
that my protocol implementation appeared to be fine.  With the bugs in my
engine fixed, the four failing tests now pass in my Java port.  Yet when I
run using the official validator the tests still fail, specifically:

update_dataset_default_graph
update_dataset_default_graphs
update_dataset_named_graphs
update_dataset_full

After some digging in the Perl code I have identified what might be the
root cause of the problem.  In the relevant tests the URIs are created like
so:

POST("${uurl}?using-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata1.rdf&using-named-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata2.rdf", [
						'update' => $sparql,
					]);

This appears to result in double encoding of the using-graph-uri and
using-named-graph-uri parameters.  Since .NET only decodes the parameters
once for me (and I am clearly not going to decode them multiple times), the
SPARQL Updates end up not creating the expected data because the graph
URIs are incorrect.
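
To make the suspected symptom concrete, here is a minimal Python sketch
using only the standard library.  It does not reproduce what LWP or .NET
actually do; the second encoding pass is an assumption standing in for
whatever the client might be doing, and the single decode stands in for the
server.

```python
from urllib.parse import quote, unquote

# Graph URI taken from the tests discussed above.
GRAPH_URI = "http://kasei.us/2009/09/sparql/data/data1.rdf"

# Percent-encode once; this matches the form hard-coded in the test URL.
encoded_once = quote(GRAPH_URI, safe="")

# Hypothetical second pass: encoding the already-encoded value again
# turns every "%" into "%25".
encoded_twice = quote(encoded_once, safe="")

# A server that decodes each parameter exactly once.
print(unquote(encoded_once))   # the original graph URI
print(unquote(encoded_twice))  # still percent-encoded, so not the graph URI
```

If the value really does arrive encoded twice, a single decode leaves a
string that is not the intended graph URI, which would explain updates
landing in the wrong graph.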

My Java harness passes the unencoded form through Apache HTTP Client,
which encodes the URI only once, so I get the correct URI on the server
side and the tests pass.

I verified that double encoding does indeed appear to be the root cause of
the problem by replacing the unencoded form with the encoded form in my
Java test harness and then the tests start failing.
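
The experiment can be mimicked in a few lines of Python, as a rough sketch
rather than a copy of my Java harness.  Here `urlencode` plays the role of
a client library that encodes parameter values once, and `parse_qs` plays
the role of a server that decodes once.  The endpoint URL and the two
helper functions are hypothetical names made up for the illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/update"  # hypothetical endpoint
DATA1 = "http://kasei.us/2009/09/sparql/data/data1.rdf"
DATA2 = "http://kasei.us/2009/09/sparql/data/data2.rdf"
# The pre-encoded form of DATA1, as it appears in the validator's test URL.
DATA1_ENCODED = "http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2Fdata%2Fdata1.rdf"


def build_update_url(using_graph, using_named_graph):
    """Build the request URL, letting the library encode each value once."""
    query = urlencode([
        ("using-graph-uri", using_graph),
        ("using-named-graph-uri", using_named_graph),
    ])
    return f"{ENDPOINT}?{query}"


def params_seen_by_server(url):
    """Decode the query string once, as a typical server framework would."""
    parsed = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in parsed.items()}


# Unencoded input: the server sees the intended graph URI.
print(params_seen_by_server(build_update_url(DATA1, DATA2)))

# Pre-encoded input: the library encodes it again, and the server
# sees a value that is still percent-encoded.
print(params_seen_by_server(build_update_url(DATA1_ENCODED, DATA2)))
```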

So it looks like the Perl code should be as follows:

POST("${uurl}?using-graph-uri=http://kasei.us/2009/09/sparql/data/data1.rdf&using-named-graph-uri=http://kasei.us/2009/09/sparql/data/data2.rdf", [
						'update' => $sparql,
					]);

I.e. the URLs should not be pre-encoded, as LWP should take care of this
automatically AFAICT.

However, I am not 100% certain that double encoding is the issue, because
other implementations like Fuseki seem to be totally fine.

I have spent some time trying to get the protocol validator running in an
Apache instance on my OS X laptop but have had little luck.  There is an
apparent undeclared dependency on TryCatch, which won't install properly
under OS X for reasons unknown to me, and after forcing the install the
script just fails to run with a vague and unhelpful compilation error in
the Apache logs.  Knowing next to nothing about Apache and Perl, I'd rather
that someone who has a good environment to start with tried making my
suggested changes and running against my implementation to see if
everything then passes.

Also, if someone could look at my Java ports of the tests in question and
check I haven't made an error in porting them, that would be
appreciated.

For reference the live endpoints for my installation are as follows:

Query - http://www.dotnetrdf.org/demos/server/query
Update - http://www.dotnetrdf.org/demos/server/update

Best Regards,

Rob Vesse

Received on Monday, 17 December 2012 23:05:02 UTC