- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Thu, 18 Jan 2007 14:21:02 -0500
- To: Jeen Broekstra <j.broekstra@tue.nl>, dawg mailing list <public-rdf-dawg@w3.org>
Thanks, Jeen! My results and comments inline below.

Jeen Broekstra wrote on 01/18/2007 10:35:29 AM:

> I have replaced the old SyntaxFull test set with the tests from SyntaxDev.
>
> super-manifest:
>
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/manifest-syntax.ttl

I don't have code yet that can read these.

> Syntax test sets (181 tests in total):
>
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql2/
> http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql3/
>
> None of these tests are currently DAWG-approved, of course.
>
> Only syntax-sparql3 contains negative syntax tests. They are marked as
> such in the manifest, and can also be recognized by the filename, which
> is prefixed 'syn-bad-'.
>
> I would ask everyone with a SPARQL parser to try out these tests and
> report possible problems.

As I said previously, Glitter throws exceptions during parsing when it
encounters a function that it does not recognize. That causes a handful
of tests to fail, which I've tried to highlight here.

syntax-sparql1 - I fail 4 tests, all because of unknown functions

syntax-sparql2 - I fail 6 tests; 4 are because of unknown functions.
The other two are:

  syntax-esc-04
  syntax-esc-05

...both of which make liberal use of \u escapes. It won't surprise me at
all if these tests are fine and this is a bug in my parser.

syntax-sparql3 - I fail these 6 tests:

  syn-bad-{8,9,10,11,12,13} - these (negative) tests assert that multiple
  periods ('.') in a row should fail; my parser allows them. Was there a
  change at some point in the grammar that affected the validity of extra
  periods in a row?

> I have also ran the set through Sesame's SPARQL parser of course. I get
> a number of errors (24) and failures (18), most of which have to do with
> our implementation (we currently do not yet support functions and
> ordering, and the parser throws exceptions on queries containing those
> features).
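
As an aside on the \u escapes in syntax-esc-04 and syntax-esc-05 mentioned above, here is a minimal sketch of one way a parser front end might expand codepoint escapes before tokenizing. It assumes the escapes take the \uXXXX (four hex digits) and \UXXXXXXXX (eight hex digits) forms; the function name decode_codepoint_escapes is hypothetical and is not taken from Glitter or Sesame.

```python
import re

# Assumption: escapes are \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_codepoint_escapes(query: str) -> str:
    """Replace each codepoint escape in the query text with its character.

    Hypothetical pre-processing step, run before the query is tokenized.
    It does not special-case an escaped backslash in front of 'u'.
    """
    def expand(match: re.Match) -> str:
        # Exactly one of the two groups matched; take whichever is set.
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(expand, query)


if __name__ == "__main__":
    print(decode_codepoint_escapes(r"SELECT * WHERE { <x:\u0061> ?p ?o }"))
```

If a parser only expands escapes inside string literals, or not at all, queries that use them elsewhere (for example inside an IRI) would be rejected, which is one possible explanation for failures of this kind.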
>
> I also came across this interesting failure. The following parser test
> (syntax-sparql1/syntax-forms-02) fails:
>
> PREFIX : <http://example.org/ns#>
> SELECT * WHERE { ( [] [] ) }
>
> To be honest I have no idea how to read this query, I would appreciate
> insights.

Looks like an RDF collection with two blank node elements to me. Let's
see how it works in the grammar...

It matches [39] Collection first; then two [40] GraphNode productions.
Each of those matches a [41] VarOrTerm which matches [44] GraphTerm
which matches [65] BlankNode which matches [84] ANON which consumes
'[' WS* ']' .

> Also: a fair number of the errors Sesame's parser throws have to do with
> the queries in the syntax-sparql2 set, which use relative URIs in
> queries (e.g. <a>, <b>, <p1>, etc.). A relative URI has to be resolved
> against a base URI - which is normally provided using a BASE clause.
> However, the queries in this test set do not have such a clause. Andy
> has pointed out to me that according to RFC3986 (URI) in such cases the
> base URI should be provided by the 'embedding entity', i.e. the location
> of the file that contains the query. Sesame's query parser has no
> feature for this however: it only accepts a query string as an argument,
> a base URI for resolving any relative referencing inside that query can
> not be provided seperately. I guess that this is a shortcoming in our
> current parser that we should deal with in Sesame.
>
> However, correct resolution in this fashion is a feature of file
> processing, not query parsing, IMHO, and the test set is designed to
> test query parsing, not file processing. So I would suggest that we
> modify these test cases to have a base URI inside the query. This avoids
> having implementations fail tests on this problem. Thoughts?

I think this is similar to the undefined functions case, though I'm not
sure what we should do about it.
They're not parse errors per se; they're evaluation errors that our
engines are catching at parse time. I need to think a bit about what I
think is the best way to handle these in implementation reporting...

> Next up: evaluation tests (which are, after all, the more interesting
> test cases ;)).

yeah! :-)

Lee

>
> Jeen
> --
> Dr. Jeen Broekstra                                Den Dolech 2
> Information Systems Group                         HG 7.76
> Department of Mathematics and Computer Science    P.O. Box 513
> Technische Universiteit Eindhoven                 5600 MB Eindhoven
> tel. +31 (0)40 247 36 86                          The Netherlands
>
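
For reference, the base-URI question discussed in this message can be sketched as follows. This is a minimal illustration, assuming Python's standard urllib.parse.urljoin as a stand-in for RFC 3986 reference resolution. The helper name resolve_relative_uri, its parameters, and the example.org locations are all hypothetical; nothing here reflects how Sesame or Glitter actually resolve URIs.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_relative_uri(reference: str,
                         base_clause: Optional[str] = None,
                         document_location: Optional[str] = None) -> str:
    """Resolve a URI reference taken from a query (hypothetical helper).

    base_clause:       URI from a BASE clause inside the query, if any.
    document_location: URI of the file holding the query (the 'embedding
                       entity'), used only when there is no BASE clause.
    """
    # A reference that already has a scheme is absolute; leave it alone.
    if urlsplit(reference).scheme:
        return reference

    # Prefer the in-query BASE clause, then fall back to the file location.
    base = base_clause or document_location
    if base is None:
        # Neither source is available: the reference cannot be resolved.
        raise ValueError(f"no base URI available to resolve <{reference}>")

    return urljoin(base, reference)


if __name__ == "__main__":
    # With a BASE clause in the query (hypothetical base URI).
    print(resolve_relative_uri("a", base_clause="http://example.org/base/"))
    # Without one, falling back to the (hypothetical) query file location.
    print(resolve_relative_uri(
        "p1", document_location="http://example.org/tests/query.rq"))
```

A parser whose entry point accepts only the query string corresponds to calling this helper with neither optional argument, which is where the ValueError branch is reached for references such as <a>, <b>, or <p1>.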
Received on Thursday, 18 January 2007 19:21:24 UTC