- From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
- Date: Fri, 11 Feb 2011 12:36:03 +0000
- To: SPARQL Working Group <public-rdf-dawg@w3.org>, Andy Seaborne <andy.seaborne@epimorphics.com>, Steve Harris <steve.harris@garlik.com>
Hi Andy, Steve, others,

here is my review for SPARQL Query 1.1 (apart from Sections 11 and 18), although I mention some typos that I noticed while skimming over Section 18 and I have a general comment for one subsection too.

First the things that are more substantial in my opinion:

In general, I was a bit confused about what simple literals are. Is that the same as plain literals? The spec uses "RDF literals", which as I understand it can be any type of literal, and then "plain literal" and "simple literal". Is plain the same as simple? Can the notation either be unified or, if these are different, can there be a definition of what is what?

Another big concern is whether the implementation of property paths is optional or not. So far only the evaluation of BGPs (Bgp(...) in the algebra) is used to actually compute bindings. All other operations then work on solution sequences, which is really nice. Some property path features require, however, an algebra extension that introduces other forms of computing solutions, e.g., the evaluation of :s :p+ ?o yields a solution sequence by an extended form of BGP matching, but this is not defined for entailment regimes and cannot be defined by means of entailment. Thus, queries with certain property paths cannot make use of the BGP matching extension point, which introduces in my opinion an unfortunate incompatibility, which is not mentioned or discussed in the document. I am not against property paths, but I think this incompatibility has to be mentioned and I would also like to see property paths being an optional feature, i.e., a SPARQL 1.1 conformant system can but does not have to support property paths.

Similarly, I am concerned about FILTER EXISTS and FILTER NOT EXISTS.
MINUS is a proper algebra operator, which combines solution sequences, but having EXISTS and NOT EXISTS defined as filters is not in line with how filters previously worked, i.e., by working on RDF terms that result from applying given solutions to the variables followed by evaluating the filter expression (see also description in the itemise of 17.2). FILTER EXISTS and FILTER NOT EXISTS require the evaluation of BGPs, so that is quite a different thing and would require an algebra translation that somewhere has a Bgp(...) element in it, which does not seem to be the case at the moment. This might just be the case because Section 18 is not yet ready, but even though FILTER [NOT] EXISTS uses the FILTER keyword, it might be better to not treat them in the same way as other filters or not even use the FILTER keyword at all. I would rather like to see them as first class operators like MINUS or OPTIONAL.

Here are the mostly minor things:

Status, The new features are: ...
The link for "Expressions in the SELECT clause" is not working.

1.1 Document Outline
... Sections 11 incorproated <- incorp*or*ated

1.2.1 Namespaces
Entry for sfn: http://www.w3.org/ns/sparql# has
@@(process) Ensure page populated
The page is actually there and has an entry for #bound. Is more needed? If not, remove @@...

2.5 Creating Values with Expressions
Why is the SELECT clause in the query example indented?

4.1.1.1 Prefixed *N*ames (for consistency)

4.1.4 Syntax for Blank Nodes
"Blank nodes in graph patterns act as non-distinguished variables, not as references to specific blank nodes in the data being queried."
I suggest just removing "non-distinguished" as it is confusing and not really what SPARQL does.

5.1.2 Extending Basic Graph Pattern Matching
"SPARQL is defined for matching RDF graphs with simple entailment. SPARQL can be extended to other forms of entailment given certain conditions as described below."
This is the only place where simple entailment is mentioned and it might come out of context here. I also think it would be good to reference the entailment regimes document here. How about:
"SPARQL evaluates basic graph patterns using subgraph matching, which can be defined using simple entailment. SPARQL can be extended to other forms of entailment given certain conditions as described below and <a href="http://www.w3.org/TR/sparql11-entailment/">SPARQL 1.1 Entailment Regimes</a> do this for several entailment relations."

9.1 Property Path Expressions
In this section URI is used, but shouldn't it be IRI?
In the fourth row (negated property set) I find !^uri not explained by the Matches column. Should it just be !uri? I can figure out what it is supposed to match, but the Matches text only explains the second pattern.
Text below the table: in a negated property sets <- in a negated property *set*
Binary operators / and ^: There is no example of ^ as a binary operator in the table as far as I can see. Is the binary usage meant to be as in Example: Inverse Path Sequence below?
First example (Example: Alternatives): I first couldn't figure out what is subject, what is predicate and what is object. I suggest no spaces around | and/or brackets.

9.3 Cycles and Duplicates
First example data should be:
:x :p :z1 .
:x :p :z2 .
:z1 *:q* :y .
:z2 *:q* :y .
Second example data should be:
:x :p *:y* .
*:y* :p :x .

10 Assignment
Second sentence: "The new variable must not have been used in the query up to that point. "
Is that true? Later in 19.8 Grammar, Notes:
11. The variable assigned in a BIND clause must not be already in-scope.

10.1 BIND: Assigning to *V*ariables (for consistency)
The BIND form allows *a* value (not a*n* value)
"Use of BIND is a separate element of a group graph pattern and it ends any basic graph pattern, including ending the scope of any filters."
I find "including ending the scope of any filters." confusing.
In the example below, the filter applies to the whole group (as usual), but does this note mean that if the filter were positioned before the BIND, then it would just apply to the elements before? For example, would:
{ ?x ns:price ?p .
  ?x ns:discount ?discount
  FILTER(?p < 20)
  BIND (?p*(1-?discount) AS ?price)
  ?x dc:title ?title .
}
be translated somehow such that
Filter((?p < 20), Bgp(?x ns:price ?p . ?x ns:discount ?discount))
is then extended according to the BIND part and then joined with the last triple pattern?

10.2 BINDINGS
...o send a more *constrained* query to a remote query service. (not constrainded)

12 Subqueries, after the data for the first example:
"Return a name (the one with the lowest sort order) from all the people that know Alice and have a name."
Isn't the query rather asking for people that Alice knows? How about:
"Return a name (the one with the lowest sort order) for all the people that Alice knows and who have a name."
at the end of the example:
Subqueries require one additional algebra operator, ToMultiset, which takes *l*ists and returns *m*ultisets.
I don't see a reason to put list and multiset in upper case since these terms do not refer to any function in this context.
at the end of the section:
Only variables projected by the Project function are visible to operations outside the ToMultiset call.
<code>ToMultiset</code> (for consistency)

16.1.2 SELECT *E*xpressions

16.2.4 CONSTRUCT WHERE
(no FILTERs and *no* complex graph patterns are allowed in the short form)

16.4.2 Identifying Resources
The property foaf:mbox is defined as being an inverse function*al* property in the FOAF vocabulary.

17 Expressions and Testing Values
still has: @@(editorial) Expressions, not just testing values
"SPARQL FILTERs restrict the solutions of a graph pattern match according to a given expression. "
expression is linked to the expression grammar element, but in fact filters are followed by the grammar element condition.
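
To make the reading asked about under 10.1 concrete, here is a minimal Python sketch of that evaluation order: Filter over the first Bgp, then the BIND extension, then a join with the last triple pattern. It only illustrates the translation the question describes and is not what the spec defines. The toy triples, the helper names (bgp, filter_solutions, extend, join) and the use of plain strings for prefixed names are assumptions made up for illustration.

```python
# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("book1", "ns:price", 10), ("book1", "ns:discount", 0.5), ("book1", "dc:title", "A"),
    ("book2", "ns:price", 40), ("book2", "ns:discount", 0.5), ("book2", "dc:title", "B"),
]


def is_var(term):
    """Variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, solution):
    """Return all extensions of `solution` that match one triple pattern."""
    results = []
    for triple in TRIPLES:
        candidate = dict(solution)
        for term, value in zip(pattern, triple):
            if is_var(term):
                # Bind the variable, or check it against an existing binding.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(candidate)
    return results


def bgp(patterns):
    """Evaluate a basic graph pattern by matching its triple patterns in turn."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [ext for s in solutions for ext in match_triple(pattern, s)]
    return solutions


def filter_solutions(condition, solutions):
    """Keep only the solutions for which the condition holds."""
    return [s for s in solutions if condition(s)]


def extend(solutions, var, expr):
    """Add a new binding for `var` to every solution (the BIND part)."""
    return [{**s, var: expr(s)} for s in solutions]


def join(left, right):
    """Combine solutions that agree on all shared variables."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[k] == r[k] for k in l.keys() & r.keys())
    ]


# Filter((?p < 20), Bgp(?x ns:price ?p . ?x ns:discount ?discount))
filtered = filter_solutions(
    lambda s: s["?p"] < 20,
    bgp([("?x", "ns:price", "?p"), ("?x", "ns:discount", "?discount")]),
)
# ... extended according to the BIND part ...
extended = extend(filtered, "?price", lambda s: s["?p"] * (1 - s["?discount"]))
# ... and then joined with the last triple pattern.
result = join(extended, bgp([("?x", "dc:title", "?title")]))
```

With this particular filter the outcome would be the same if the filter applied to the whole group, since ?p is already bound before the BIND; the two readings could only differ for a filter that mentions ?price or ?title.
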

17.2 Filter Evaluation
First itemise: "Apart from BOUND, all functions and operators operate on RDF Terms and will produce a type error if any arguments are unbound."
This does not seem to be true if FILTER EXISTS and FILTER NOT EXISTS are indeed realised as filters, because they require BGP matching and so don't just operate on RDF terms.

17.2.2 Effective Boolean Value (EBV)
"... The following rules reflect the rules for fn:boolean applied to the argument types present in SPARQL Queries:"
I don't see why Queries in SPARQL Queries is upper case.

17.3 Operator Mapping
"This table is not up to date. IN, NOT IN, BNODE, IF, COLAESCE, IRI, URI, STRDT, STRLANG, NOT EXISTS, EXISTS"
This should be updated for LC (also COALESCE, not COLAESCE).

17.4 Operator and Function Definitions
Still has:
@@URIs: sfn:bound etc.
@@Clean prototypes.
This should be updated for LC.

17.4.2.7 datatype
"Returns the datatype IRI of typedLit; returns xsd:string if the parameter is a simple literal."
Here in particular I am confused about what simple literals are. In OWL 2, "abc" has datatype rdf:PlainLiteral and not xsd:string as I understand it. Would "abc" in SPARQL be a simple literal and have datatype xsd:string? What is then a plain literal in SPARQL? Is there any syntactic difference?

17.4.2.8 IRI
still has: @@ Do we also need IRI(relStr, baseStrOrIRI)?
Should be fixed for LC.

17.4.2.11 STRDT
What happens if a given lexical form is invalid, e.g., STRDT("abc", xsd:integer)? Does that result in an error or in "abc"^^xsd:integer? For other functions there is an explicit note about errors and it would be good to have such a note also for STRDT.

I know Section 18 is not ready for review yet, but here are just some typos:

18.2.4 Converting Groups, Aggregates and SELECT *E*xpressions

18.2.2.3 Translate Basic Graph Patterns and Filters
After translating property paths, any adjacent triple patterns are *collected* (not colelctied) together to form a basic graph pattern *BGP(triples)* (not BGP(triples>).

And one other comment:

18.6 Extending SPARQL Basic Graph Matching
I am not quite happy with the text (in particular the formulation of the conditions) since it is not at all well-aligned with the notation used in the rest of the document, e.g., "answer set" is everywhere else "solution sequence" and in this case answer set is even a set of pattern instance mappings, which is not the case anywhere else, where a BGP evaluates into a multiset of solution mappings and the RDF instance mappings just determine the multiplicity. We (Markus Krötzsch and I) discuss what is wrong with the conditions in an ISWC paper and I am happy to suggest a more aligned version of the conditions, if the WG is interested in this.

--
Dr. Birte Glimm, Room 309
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283520
Received on Friday, 11 February 2011 12:36:37 UTC