- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Thu, 18 Jan 2007 14:21:02 -0500
- To: Jeen Broekstra <j.broekstra@tue.nl>, dawg mailing list <public-rdf-dawg@w3.org>
Thanks, Jeen! My results and comments inline below.
Jeen Broekstra wrote on 01/18/2007 10:35:29 AM:
> I have replaced the old SyntaxFull test set with the tests from
> SyntaxDev.
>
> super-manifest:
>
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/manifest-syntax.ttl
I don't have code yet that can read these.
> Syntax test sets (181 tests in total):
>
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql2/
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql3/
>
> None of these tests are currently DAWG-approved, of course.
>
> Only syntax-sparql3 contains negative syntax tests. They are marked as
> such in the manifest, and can also be recognized by the filename, which
> is prefixed 'syn-bad-'.
>
> I would ask everyone with a SPARQL parser to try out these tests and
> report possible problems.
As I said previously, Glitter throws exceptions during parsing when it
encounters a function that it does not recognize. That causes a handful of
tests to fail, which I've tried to highlight here.
syntax-sparql1 - I fail 4 tests, all because of unknown functions
syntax-sparql2 - I fail 6 tests; 4 are because of unknown functions. The
other two are:
syntax-esc-04
syntax-esc-05
...both of which have liberal use of \u escapes. It won't surprise me at
all if these tests are fine and this is a parser bug that I have.
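
For anyone else chasing the same kind of failure, here is a minimal sketch of expanding \u escapes before tokenizing. It assumes the escapes (\uXXXX and \UXXXXXXXX) may appear anywhere in the query string and are replaced ahead of the grammar proper; the function name is hypothetical and this is not Glitter's actual code.

```python
import re

# \u followed by 4 hex digits, or \U followed by 8 hex digits.
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def expand_codepoint_escapes(query: str) -> str:
    """Replace every \\uXXXX / \\UXXXXXXXX escape with the character it names.

    Assumption: this runs over the raw query text before tokenizing, so an
    escape can stand in for any character (including part of a variable name).
    """
    def _replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        # chr() raises ValueError for values above U+10FFFF.
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(_replace, query)


if __name__ == "__main__":
    print(expand_codepoint_escapes(r"SELECT ?\u0078 WHERE { ?x ?p ?o }"))
```

If a parser only handles these escapes inside string literals or IRIs, tests that use them elsewhere in the query would fail in much the way described above.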
syntax-sparql3 - I fail these 6 tests:
syn-bad-{8,9,10,11,12,13} - these (negative) tests assert that multiple
periods ('.') in a row should fail; my parser allows them.
Was there a change at some point in the grammar that affected the validity
of extra periods in a row?
> I have also run the set through Sesame's SPARQL parser of course. I get
> a number of errors (24) and failures (18), most of which have to do with
> our implementation (we currently do not yet support functions and
> ordering, and the parser throws exceptions on queries containing those
> features).
>
> I also came across this interesting failure. The following parser test
> (syntax-sparql1/syntax-forms-02) fails:
>
> PREFIX : <http://example.org/ns#>
> SELECT * WHERE { ( [] [] ) }
>
> To be honest I have no idea how to read this query, I would appreciate
> insights.
Looks like an RDF collection with two blank node elements to me. Let's see
how it works in the grammar...
It matches [39] Collection first; then two [40] GraphNode productions.
Each of those matches a [41] VarOrTerm which matches [44] GraphTerm which
matches [65] BlankNode which matches [84] ANON which consumes '[' WS* ']'.
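
The production chain above can be sketched as a tiny recognizer. This is a deliberate simplification that only accepts ANON as a GraphNode (the real GraphNode production allows much more), and the function names are hypothetical.

```python
import re

WS = re.compile(r"\s*")
ANON = re.compile(r"\[\s*\]")  # [84] ANON: '[' WS* ']'


def _skip_ws(text: str, pos: int) -> int:
    """Return the position of the first non-whitespace character at or after pos."""
    return WS.match(text, pos).end()


def parse_collection(text: str, pos: int = 0) -> tuple[int, int]:
    """Recognize '(' GraphNode+ ')' where every GraphNode is an ANON blank node.

    Returns (number of elements, position just past the closing ')').
    """
    pos = _skip_ws(text, pos)
    if pos >= len(text) or text[pos] != "(":
        raise SyntaxError("expected '(' to open a Collection")
    pos = _skip_ws(text, pos + 1)

    count = 0
    while True:
        match = ANON.match(text, pos)
        if match is None:
            break
        count += 1
        pos = _skip_ws(text, match.end())

    if count == 0:
        raise SyntaxError("a Collection needs at least one GraphNode")
    if pos >= len(text) or text[pos] != ")":
        raise SyntaxError("expected ')' to close the Collection")
    return count, pos + 1


if __name__ == "__main__":
    print(parse_collection("( [] [] )"))
```

Run on the group pattern body from syntax-forms-02, this reports a collection with two elements, which matches the reading given above.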
> Also: a fair number of the errors Sesame's parser throws have to do with
> the queries in the syntax-sparql2 set, which use relative URIs in
> queries (e.g. <a>, <b>, <p1>, etc.). A relative URI has to be resolved
> against a base URI - which is normally provided using a BASE clause.
> However, the queries in this test set do not have such a clause. Andy
> has pointed out to me that according to RFC3986 (URI) in such cases the
> base URI should be provided by the 'embedding entity', i.e. the location
> of the file that contains the query. Sesame's query parser has no
> feature for this however: it only accepts a query string as an argument,
> a base URI for resolving any relative references inside that query
> cannot be provided separately. I guess that this is a shortcoming in our
> current parser that we should deal with in Sesame.
>
> However, correct resolution in this fashion is a feature of file
> processing, not query parsing, IMHO, and the test set is designed to
> test query parsing, not file processing. So I would suggest that we
> modify these test cases to have a base URI inside the query. This avoids
> having implementations fail tests on this problem. Thoughts?
I think this is similar to the undefined functions case, though I'm not
sure what we should do about it. They're not parse errors per se; they're
evaluation errors that our engines are catching at parse time. I need to
think a bit about what I think is the best way to handle these in
implementation reporting...
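
To make the base URI question concrete, here is a rough sketch of a resolver that takes the base as a separate, optional argument, falling back to an error when a relative reference has nothing to resolve against. The function name and the example.org file location are hypothetical; it relies on Python's urllib.parse.urljoin for the resolution itself and is not how Sesame or Glitter do it.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_iri(reference: str, base: Optional[str] = None) -> str:
    """Resolve an IRI reference from a query against an optional base URI.

    `base` would come from a BASE clause if present, otherwise from the
    'embedding entity' (e.g. the location of the file holding the query).
    """
    if urlsplit(reference).scheme:
        # Already absolute; no base is needed.
        return reference
    if base is None:
        raise ValueError(f"relative reference <{reference}> has no base URI")
    return urljoin(base, reference)


if __name__ == "__main__":
    # Hypothetical location of a query file, used as the base.
    query_location = "http://example.org/tests/query.rq"
    print(resolve_iri("a", query_location))
    print(resolve_iri("http://example.org/ns#"))
```

Whether the missing-base case is reported at parse time or deferred is exactly the reporting question raised above; the sketch simply raises as soon as it is asked to resolve.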
> Next up: evaluation tests (which are, after all, the more interesting
> test cases ;)).
yeah! :-)
Lee
>
> Jeen
> --
> Dr. Jeen Broekstra Den Dolech 2
> Information Systems Group HG 7.76
> Department of Mathematics and Computer Science P.O. Box 513
> Technische Universiteit Eindhoven 5600 MB Eindhoven
> tel. +31 (0)40 247 36 86 The Netherlands
>
Received on Thursday, 18 January 2007 19:21:24 UTC