- From: Dave Beckett <dave@dajobe.org>
- Date: Sun, 05 Dec 2010 16:15:27 -0800
- To: public-rdf-dawg-comments@w3.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

These are my personal comments (not speaking for any past or current
employer) on:

  SPARQL 1.1 Query Language
  W3C Working Draft 14 October 2010
  http://www.w3.org/TR/2010/WD-sparql11-query-20101014/

My comments are based on the work I did to add some SPARQL 1.1 query
and update support to my Rasqal RDF query library (engine and API)
  http://librdf.org/rasqal/
in version 0.9.21, just released 2010-12-04, as announced at:
  http://lists.w3.org/Archives/Public/semantic-web/2010Dec/0055.html

Some background to my work is given in a blog post at
  http://journal.dajobe.org/journal/posts/2010/10/24/writing-an-rdf-query-engine-twice/


I. General comments

I felt the specification introduced more optional features bundled
together, where it was not entirely clear what the combination of
those features would do. For example, a query with no aggregate
expression but with a GROUP BY and HAVING is allowed by the syntax,
and the main document doesn't say whether it's allowed or what it
means.

I found it hard to assemble all the pieces from the mathematical
explanations into something I could code.

The spec has several terms in the grammar that are not in the query
document. After asking, these turned out to be federated query
(BINDINGS) or update (LOAD, ...), but these are not pointed out or
linked to clearly, although there is mention of the documents in the
status section. Please make these clearer.

I decided to concentrate on the new Aggregates feature since I had
already implemented SELECT expressions, leaving Subqueries and
Negation to later.

Property paths should be in the list of new features in the status
section at the top of the document.

"SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs" is rather a
long title; what does 'Uniform' or 'HTTP' add? SOAP is dead.
I suggest "SPARQL 1.1 RDF Graph Management Protocol", or the same with
"RDF Dataset".

With all the additions, especially property paths (a new query
language), update (a data management language) and federated query
(remote query support), and with what I understand are ~30 additional
keywords being added beyond this draft for functions and operators, I
see this as a major change to SPARQL 1.0, more of a SPARQL 2. You
should consider renaming it.


II. Aggregates

I found the math in the aggregation and grouping sections rather hard
to understand, so I also looked at what MySQL and SQLite did, and wrote
my own diagram based on the data flow
  http://www.dajobe.org/2009/11/sparql11/
so that it was easier for me to see the individual components/stages
(which roughly correspond to SPARQL algebra terms).

I had to make several of my own tests with my guess at what the
answers should be. With all the pieces for aggregate expressions
(grouping, aggregate expression, distinct, having, counting with
count * vs count(expr)) there need to be several tests with good
coverage.

I felt aggregate functions can be broken down into these parts:

 1. selecting of aggregate function value
 2. grouping of results - optional; explicit, implicit when agg func
    present
 3. execution of aggregate functions - optional; with some special
    cases
 4. filtering of group results with having - optional

(following my diagram above)

As it is clear they are all optional, it is probably worth explaining
what it means when they are absent, such as GROUP BY + HAVING with no
aggregate expression, as mentioned above.


III. Bindings is a new syntax

BINDINGS essentially gives a new way to write down a variable bindings
result set. Even though the federated query spec discusses using it
for SERVICE, it's not restricted to that by the grammar or
specifications.
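
To make that reading concrete, here is a minimal sketch in Python
(not SPARQL, and not how Rasqal does it) that treats a BINDINGS block
as nothing more than a table: a list of variable names plus rows of
values, turned into a list of solutions. The function name
bindings_to_solutions and the dictionary representation are
assumptions made purely for illustration; neither appears in the
drafts.

```python
from typing import Dict, List, Sequence

# One solution maps variable names (without the leading '?') to values.
Solution = Dict[str, str]


def bindings_to_solutions(
    variables: Sequence[str], rows: Sequence[Sequence[str]]
) -> List[Solution]:
    """Pair each row of values with the declared variables, in order.

    Hypothetical helper: it only illustrates that a BINDINGS block can
    be read as a literal result set, with no pattern matching involved.
    """
    solutions: List[Solution] = []
    for row in rows:
        # Every row must supply exactly one value per declared variable.
        if len(row) != len(variables):
            raise ValueError("row length does not match variable count")
        solutions.append(dict(zip(variables, row)))
    return solutions


# Two variables and two rows give a two-row result set.
solutions = bindings_to_solutions(
    ["var1", "var2"],
    [
        ("var1-value1", "var2-value1"),
        ("var1-value2", "var2-value2"),
    ],
)
```

Under this reading nothing is matched against a graph; the rows are
simply handed back as the result, which is why the intended use
outside SERVICE needs to be spelled out.
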

BINDINGS in the query grammar:
  http://www.w3.org/TR/2010/WD-sparql11-query-20101014/#rBindingsClause

I previously asked about this on 2010-10-15 at:
  http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2010Oct/0044.html
and this comment is an extension of that comment.

So, as I read it, this is a valid 'query' which does no real execution
but just returns a set:

  SELECT *
  WHERE
  BINDINGS ?var1 ?var2 {
    ( "var1-value1" "var2-value1" )
    ( "var1-value2" "var2-value2" )
  }

or, if you really must, you can leave out the WHERE:

  SELECT *
  BINDINGS {
    ( "var1-value1" "var2-value1" )
    ( "var1-value2" "var2-value2" )
  }

My question is whether this is correct. Please also clarify in the
spec the intended use, and whether or not it is intended for use with
SERVICE only.


IV. Section-by-section comments

Section: Status of this Document

Should mention property paths as new, since that is a major addition
after SPARQL 1.0.

Please link to the documents in the status; these are just text.

Sections 1-8

Skipped; they are the same as SPARQL 1.0, I hope.

9 Property Paths

I am unlikely to ever implement any of this; it's a second query
language inside SPARQL. How many systems implemented this before the
SPARQL 1.1 work was started?

10 Aggregates

I took all the examples in this section and turned them into test
cases where possible.

10.2

The explanation of errors and ListEvalE is rather opaque. It is still
not clear to me what is done with errors in GROUP BY, HAVING and
arguments to aggregate expressions. Some are skipped, some are ignored
and return NULL. Examples and tests will enable checking this, but the
spec needs to be clearer.

Definition: Group and Aggregation were hard for me to understand. The
input to Aggregation being a 'scalar', actually meaning a set of
key:value pairs, was confusing. It is also not clear if those are a
set or an ordered set of parameters. This is only used today for the
'separator' with GROUP_CONCAT.

10.2.1 HAVING

What happens when there is an expression error?
What variables and expressions can be used here and what is their
scope?

10.2.2 Set Functions

Another confusing section. I mostly ignored this and did what SQL
did. None of the functions, as far as I can tell, ever use 'err'.

10.2.3 Mapping from Abstract Syntax to Algebra

The scalarvals argument is used here - I think this is called 'scalar'
earlier.

Un-numbered Section after 10.2.3: Joining Aggregate Values

I never figured out what this was trying to define, but my code
executes the example.

11. Subqueries

(Ignored in my current work)

12 RDF Dataset

(Same as SPARQL 1.0 I assume, so no comments)

13 Basic Federated Query

Yes, please merge in the text here.

14 Solution Sequences and Modifiers

(Aside: This is one of those SPARQL parts where everything mentioned
is optional. Otherwise this section has no change from SPARQL 1.0; I
am just mentioning it as a pointer to a trend.)

15. Query Forms

No comments.

16. Testing Values

16.3 Operator Mapping

Is it worth noting the new operators in SPARQL 1.1?

Operators: implemented isNUMERIC()

16.4 Operators Definitions

My current state of implementation of the expressions new to
SPARQL 1.1:

  16.4.16 IF      - implemented
  16.4.17 IN      - implemented
  16.4.18 NOT IN  - implemented
  16.4.19 IRI     - implemented
  16.4.20 URI     - implemented
  16.4.21 BNODE   - implemented
  16.4.22 STRDT   - implemented
  16.4.23 STRLANG - implemented

No comments on the above.

16.4.24 NOT EXISTS and EXISTS

I am lumping these together with sub-SELECT to implement. My concern
here is that the syntax gets super-complex, since all the graph pattern
syntax can now appear inside any expression syntax.

  [[There is a filter operator "exists" that ...]]

Does this imply these can only appear in FILTER expressions? Please
clarify.

17 Definition of SPARQL

I looked at 17.2.3 for aggregate queries and it was more helpful than
the math earlier. The pseudo code in Step 4 is rather unclear. Is that
an example implementation or the required one?

17.6 Extending SPARQL Basic Graph Matching

Ignored.

18 SPARQL Grammar

Clearly this is not complete; there are lots of notes to update it.

19 Conformance

If property paths are not removed, please add a conformance level that
includes SPARQL 1.1 without property paths.

Does SPARQL 1.1 Query require implementation of the dependent specs -
federated query and update? Looks to me that protocol may also be
dependent?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)

iD8DBQFM/CsdQ+ySUE9xlVoRAuwIAJ9M2eMHwpyQWR+/9PdCGGKXx3elmQCfWP/g
g5FTvQ9knkA04+7PkzvtIQI=
=HqTf
-----END PGP SIGNATURE-----
Received on Monday, 6 December 2010 00:16:00 UTC