
Re: Possible Bug in SPARQL 1.1 Protocol Validator

From: Rob Vesse <rvesse@dotnetrdf.org>
Date: Wed, 19 Dec 2012 11:28:57 -0800
To: Gregory Williams <greg@evilfunhouse.com>
CC: <public-rdf-dawg-comments@w3.org>
Message-ID: <CCF75579.1A9D2%rvesse@dotnetrdf.org>

Hi Gregory,

Comments inline:

On 12/18/12 1:56 PM, "Gregory Williams" <greg@evilfunhouse.com> wrote:

>On Dec 17, 2012, at 6:03 PM, Rob Vesse wrote:
>
>> After some digging in the Perl code I have identified what might be the
>> root cause of the problem, in the relevant tests the URIs are created
>>like
>> so:
>> 
>> 
>>POST("${uurl}?using-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%
>>2F
>> 
>>data%2Fdata1.rdf&using-named-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%
>>2F
>> sparql%2Fdata%2Fdata2.rdf", [
>> 						'update' => $sparql,
>> 					]);
>> 
>> This appears to result in double encoding of the using-graph-uri and
>> using-named-graph parameters and since .Net only decodes the parameters
>> once for me (and I am clearly not going to decode them multiple times)
>>the
>> SPARQL Updates end up not creating the expected data because the graph
>> URIs are incorrect.
>
>
>Hi Rob,
>
>Where are you seeing the double encoding? I'm able to take that POST
>line, run it, and see this on the server side:
>
>------------
>POST 
>/?using-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%2Fsparql%2F%20data%2Fd
>ata1.rdf&using-named-graph-uri=http%3A%2F%2Fkasei.us%2F2009%2F09%2F%20spar
>ql%2Fdata%2Fdata2.rdf HTTP/1.1
>TE: deflate,gzip;q=0.3
>Connection: TE, close
>Host: localhost:8881
>User-Agent: libwww-perl/5.834
>Content-Length: 27
>Content-Type: application/x-www-form-urlencoded
>
>update=sparql+update+string
>------------
>
>Do you believe this is wrongly encoded? Given that there are several
>implementations passing the protocol tests using this validator (I know
>of ones in perl, java, and c++), I believe the problem may lie elsewhere.

It's my best guess at what the problem might be given that I have
eliminated all other obvious explanations to the best of my ability.  To
clarify I have done the following:

1 - Running the command sequences manually through my web UI - All Pass
2 - Running the command sequences in those tests using curl - All Pass
(See https://bitbucket.org/dotnetrdf/sparql11-protocol-validator/src/tip/protocol.sh?at=default)
3 - Running my Java ports of those tests - All Pass
4 - Running unit test versions of the command sequences, i.e. eliminating
any protocol interaction and adjusting the commands to add the USING/USING
NAMED statements that the protocol should be adding - All Pass

While I could have ported the tests incorrectly once, four times starts to
seem a little unlikely. I could do something dumb once, but believe me, I've
spent a lot of time staring at these tests already.  So either the test
harness is bad or my implementation is bad (or I really suck at copy and
paste). Given that I can get the tests to run successfully in four other
ways, I tend to lean towards some oddity in the test harness.

It may be double encoding, or perhaps the tests that the harness runs
aren't exactly the same as the tests documented in the ReadMe (although as
far as I can see they do match)?
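
To make the double-encoding hypothesis concrete, here is a minimal Python
sketch. It is not the validator's code and says nothing about what LWP
actually does; the helper names are made up for illustration. It only shows
what a server that decodes once would see if a graph URI were
percent-encoded twice on the way out:

```python
from urllib.parse import quote, unquote

# Graph URI taken from the test in question.
GRAPH_URI = "http://kasei.us/2009/09/sparql/data/data1.rdf"


def encode_once(value: str) -> str:
    """Percent-encode a query parameter value, escaping all reserved characters."""
    return quote(value, safe="")


def server_side_value(wire_value: str) -> str:
    """Decode exactly once (assumption: the server framework decodes a single time)."""
    return unquote(wire_value)


once = encode_once(GRAPH_URI)   # what a correctly encoded request carries
twice = encode_once(once)       # what a double-encoded request would carry

print(once)
print(twice)
print(server_side_value(once) == GRAPH_URI)    # single encoding round-trips
print(server_side_value(twice) == GRAPH_URI)   # double encoding does not
```

If double encoding were happening, the request line would contain sequences
such as %253A rather than %3A, so inspecting the raw request is enough to
confirm or rule it out.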

Debugging this with the official harness is a PITA for me: I can't debug
my live implementation using the public instance of the test harness, and
since I can't yet get the test harness to install and run locally, I am
rather stuck.
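
One possible way to see exactly what a client sends, without a debugger
attached to the real endpoint, is a throwaway logging endpoint. The sketch
below is only an illustration under assumptions: the port, the handler, and
the summarise_target helper are all hypothetical, and the "%25" check is a
rough heuristic, since a legitimately encoded literal percent sign would
trigger it too.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def summarise_target(raw_target: str) -> dict:
    """Report the raw query string and its value after a single decode."""
    query = urlsplit(raw_target).query
    return {
        "raw_query": query,
        "decoded_once": parse_qs(query),
        # Heuristic only: "%25" is an encoded "%", which hints at double encoding.
        "looks_double_encoded": "%25" in query,
    }


class LoggingHandler(BaseHTTPRequestHandler):
    """Logs each POST instead of executing it (no SPARQL is run)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        print(summarise_target(self.path))
        print(body)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8881) -> None:
    """Start the logging endpoint; the port number is arbitrary."""
    HTTPServer(("localhost", port), LoggingHandler).serve_forever()
```

Calling serve() and pointing a client at http://localhost:8881/ would print
the raw and once-decoded using-graph-uri and using-named-graph-uri values
for each request, which can then be compared across clients.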

I am not ruling out a bug in my implementation, but it's hard to know where
to look given that all my ported versions of the tests pass and that it is
difficult to run the tests quickly in a usable debugging environment for my
implementation.

Rob

>
>> In my Java harness it is passing the unencoded form through Apache HTTP
>> Client and this encodes the URI only once so I get the
>> correct URI on the server side and the tests pass.
>> 
>> I verified that double encoding does indeed appear to be the root cause
>>of
>> the problem by replacing the unencoded form with the encoded form in my
>> Java test harness and then the tests start failing.
>
>This sounds like it might be a difference between the perl and java http
>library APIs...?
>
>> So it looks like the Perl code should be as follows:
>> 
>> POST("${uurl}?using-graph-uri=http://kasei.us/2009/09/sparql/
>> 
>>data/data1.rdf&using-named-graph-uri=http://kasei.us/2009/09/sparql/data/
>>da
>> ta2.rdf", [
>> 						'update' => $sparql,
>> 					]);
>> 
>> I.e. the URLs should not be encoded as LWP should take care of this
>> automatically AFAICT
>
>That does not seem to be the case with the version of LWP I am using.
>
>
>> However I am not 100% certain that double encoding is the issue because
>> other implementations like Fuseki seem to be totally fine.
>> 
>> I have spent some time trying to get the protocol validator running in
>>an
>> Apache instance on my OS X laptop but have had little luck.  There is an
>> apparent undeclared dependency on TryCatch which won't install properly
>> under OS X for reasons unbeknownst to me and after forcing install the
>> script just fails to run with a vague and unhelpful compilation error in
>> the Apache logs.
>
>Yes, TryCatch is required, as is Plack. I can update the documentation,
>but not sure how else to debug the problem as it works locally for me,
>and is working on my server where the validator is being hosted.
>
>.greg
>
Received on Wednesday, 19 December 2012 19:30:19 GMT
