- From: Lee Feigenbaum <feigenbl@us.ibm.com>
- Date: Tue, 15 Aug 2006 00:38:53 -0400
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
This is an early review of the reorganization of the SPARQL Query Language for RDF specification known as rq24. I've divided the review into comments on the overall structure and presentation of the document, specific editorial comments on content in the document, and layout/rendering nits. (Admittedly, some of the distinctions are a bit arbitrary.) I have not attempted to review rq24 with respect to substantive issues currently facing the working group, or as to the correctness of the formal definitions. I have also not yet reviewed section 11 Testing Values or the appendices.

In this note I present the comments on the overall structure and presentation of the document. The other comments will follow in separate notes.

Structural and Presentation:

+ Grammar rules. I'm wondering whether the grammar rules excerpted throughout the document should be every rule having to do with the topic or just a few select rules that illustrate the relevant constructs. For example, in section 3.1.1 Syntax for IRIs, the grammar rules included define a <...> IRI ref and a QName. They don't, however, define the SPARQL PREFIX clause or BASE clause, both of which are discussed in that section. (Another example is 3.1.4, in which the rules for "[]" are included but not the rules for "[:p :o]".)

+ "1.2 Document Outline" is currently before "1.1 Document Conventions". I think this is the proper order of the two topics and that only the numbering need be fixed.

+ 1.1.3 Result Descriptions. I think it would be good to tie the tabular representation directly to some formal part of the spec. (Perhaps by noting that a row in the table represents one solution from a solution sequence, or perhaps indirectly by noting that the table is a visual representation of the XML results form.)

+ 2.2 Multiple Matches. I don't think we've seen blank node syntax by this point in the specification.

+ 2.2 Multiple Matches. "This is a basic graph pattern match, and all the variables used in the query pattern must be bound in every solution." At the least, this should link to a formal definition of basic graph pattern. At the most, this sentence should be removed as being overly technical for the primer section.

+ 2 Making Simple Queries. If this section is intended to be a small primer, I think it needs to be more comprehensive. It should include introductory queries that use UNION, OPTIONAL, BOUND, and perhaps GRAPH. It may also be the only place in the SPARQL document in which it would be reasonable to include the OPTIONAL/!BOUND trick for querying maximum/minimum values. (An example of this trick might also be appropriate in section 7.3 or 7.5.)

+ 2.7 Blank Nodes in Query Results. With its talk about the scoping set and co-occurrences of blank nodes, this section does not belong in Section 2 of the larger document. A stripped-down section might be appropriate, but I think it would be better off in Section 10, Query Result Forms.

+ 3.2 Syntax for Triple Patterns. This section links to http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#syntaxMisc for abbreviations, but that internal anchor doesn't seem to exist.

+ 3.2 Syntax for Triple Patterns. The entire introduction to this section seems to be superfluous in light of the information and examples regarding PREFIX, BASE, and IRI references in 3.1.1 Syntax for IRIs. Also, I don't see any reason to use the "$" variant of variable tokens in these examples. I'd strike the entire introduction (everything before 3.2.1).

+ 4 Initial Definitions. I like the positioning of this section, but some of the terms defined here (in particular RDF Term and maybe Query Variable) are used in previous sections. Perhaps a forward reference from somewhere near the beginning of 3.1 RDF Term would be appropriate.

+ 4.1 RDF Terms. I think that each definition here should be its own subsection. That is:

    4.1 RDF Terms
    4.2 Query Variable (needs one introductory sentence, as in "SPARQL semantics bind query variables to RDF Terms.")
    4.3 Graph Pattern (needs one introductory sentence, as in "SPARQL queries are made of one or more graph patterns.")
    4.4 SPARQL Query (needs one introductory sentence, as in "Formally, a SPARQL query contains four components:")

  Then 4.2 Triple Patterns becomes 4.5 Triple Patterns. However, I think Triple Patterns makes more sense after Query Variable and before Graph Pattern.

+ 4.3 Pattern Solutions. This section ends with:

  """
  @@ Consider whether to have a "RDF dataset" section in "Initial Definitions"

  Graph patterns match against the default graph of an RDF dataset, except for the RDF Dataset Graph Pattern. In this section, all matching is described for a single graph, being the default graph of the RDF dataset being queried.
  """

  I think that an RDF dataset definition here would be appropriate. I do not understand what the rest of the text there is doing at this point in the document.

+ 4.5 Matching Values and RDF D-entailment. This does not belong in the Initial Definitions section. I'd prefer to see it as a subsection of 5.1 General Framework or of 5.2 SPARQL Basic Graph Pattern Matching.

+ 5 Basic Graph Patterns. It's unclear to me why the first three definitions here are not part of 5.1 General Framework.

+ 5 Basic Graph Patterns. I'd like the definition of a BGP to be tied in some way to the grammar, which parses as large a BGP as possible when it encounters the first triple pattern in a BGP. (That is, some text which clarifies that there is only a single BGP in { :x :p :q . :y :r :s . })

+ 5.3 Examples of Basic Graph Pattern Matching. As it stands currently, this section is barely more than the example queries from section 2. I think that this section is important, but I think that it should take this example and work through the E-entailment (simple-entailment, actually) based definitions in detail to show how one arrives at the expected solutions for the query. I'd be glad to try writing this text up, if it would be helpful.

+ 6.3 Unbound variables. Should be removed as per the @@. It no longer belongs here, and it is sufficiently covered by the definitions of variable substitutions and pattern solutions in section 4.

+ 7.4 Optional Matching - Formal Definition. I think this section should be the first subsection of section 7. Also, the text about left-associativity seems to belong more in this section than in the expository text which currently makes up 7.1 Optional Pattern Matching.

+ 8.2 Union Matching - Formal Definition. As above, I think this section should be the first subsection of section 8.

+ 9.2.2 Specifying Named Graphs and 9.2.3 Combining FROM and FROM NAMED. The example given here makes its point by using the GRAPH keyword, which has not yet been introduced. Two possible fixes:

    1) A non-query-based example here which simply shows a set of FROM NAMED clauses and then shows a representation of the RDF Dataset created from those clauses. (9.3.* shows plenty of examples of queries with the GRAPH keyword.)
    2) Move the formal definition of GRAPH to early in this section (right after the RDF dataset formal definition), which would make this example more reasonable.

+ 10.2, 10.3, 10.4, 10.5. As above, I'd put the formal definitions first in these sections, and follow that with the expository text.

Lee
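
For reference, the OPTIONAL/!BOUND trick mentioned in the comment on section 2 can be sketched roughly as follows. This is only an illustration of the general idiom, not text from rq24: the ex: prefix, its namespace, and the ex:price property are hypothetical.

```sparql
# Hypothetical vocabulary: ex:price is assumed for illustration only.
PREFIX ex: <http://example.org/ns#>

SELECT ?item ?price
WHERE {
  ?item ex:price ?price .
  # Try to find some other item whose price is strictly greater.
  OPTIONAL {
    ?other ex:price ?otherPrice .
    FILTER (?otherPrice > ?price)
  }
  # Keep only solutions for which no greater price was found.
  FILTER (!bound(?other))
}
```

If no greater price exists, ?other stays unbound and the outer FILTER keeps the solution, so only the item (or items, in case of a tie) with the maximum price remains. Reversing the comparison to "<" would select the minimum instead.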
Received on Tuesday, 15 August 2006 04:39:12 UTC