- From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
- Date: Tue, 22 May 2007 14:08:13 -0400 (EDT)
- To: public-rdf-dawg-comments@w3.org
- Cc: eric@w3.org
Comments on SPARQL Query Language for RDF
W3C Working Draft 26 March 2007

Well, the document has certainly changed since the last time I reviewed
it, so there is little point in going over my comments from before.
This is true EXCEPT for one thing. Many of my comments from before
complained about lack of rigour in the document. Unfortunately, I have
noticed a continued lack of rigour in many of the basic notions and
definitions underlying SPARQL.

Because of problems described in 8/ below, I do not believe that the
document is adequate to progress to the next stage of the W3C process,
even without my fundamental disagreement with the treatment of the
meaning of RDF graphs in SPARQL (3/ and 9/ below).

[Note that this is not a complete review of the document. I have only
looked at some of the informative material and enough of the formal
definitions to see that I cannot progress further without some more
information.]

1/ A question on the basic notion of RDF.

From the document:
  Abstract: RDF is a directed, labeled graph data format for
  representing information in the Web.

From RDF Concepts:
  Abstract: The Resource Description Framework (RDF) is a framework for
  representing information in the Web.

How can these two different, to me, views of RDF be reconciled? If
SPARQL treats RDF as simply a "graph data format" then what is the
status of the RDF semantics, which goes much further?

Suppose I write a system for handling RDF that respects the RDF
recommendations. Is this system going to be useful for SPARQL? For
example, if I store RDF graphs in some internal canonical form (for
example, changing "042"^^xsd:integer to "42"^^xsd:integer) then I have
changed the SPARQL answers.

2/ What is a sequence?

What exactly are the results of a SPARQL query? I have always thought
that a sequence was inherently ordered. How then can the "result of a
query [be] a solution sequence" as well as a "result set" (Section
2.2)? This is particularly glaring at the beginning of Section 9.
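
The canonicalisation concern in 1/ can be illustrated with a minimal
Python sketch. It assumes that matching compares RDF terms exactly
(lexical form plus datatype), which is the behaviour the comment
attributes to the draft. The names Literal, canonicalize, and
match_by_term are hypothetical and are not taken from the SPARQL
document or from any implementation.

```python
from typing import NamedTuple

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Literal(NamedTuple):
    """A typed literal modelled as (lexical form, datatype IRI)."""
    lexical: str
    datatype: str


def canonicalize(lit: Literal) -> Literal:
    """Rewrite an xsd:integer literal to a canonical lexical form (assumed store behaviour)."""
    if lit.datatype == XSD_INTEGER:
        return Literal(str(int(lit.lexical)), lit.datatype)
    return lit


def match_by_term(graph, pattern_object):
    """Return triples whose object is the identical term, not merely the same value."""
    return [t for t in graph if t[2] == pattern_object]


# One triple, stored as written and stored after canonicalisation.
graph = [("ex:a", "ex:p", Literal("042", XSD_INTEGER))]
canonical_graph = [(s, p, canonicalize(o)) for s, p, o in graph]

# A query pattern that uses the original lexical form.
query_object = Literal("042", XSD_INTEGER)

original_hits = match_by_term(graph, query_object)
canonical_hits = match_by_term(canonical_graph, query_object)
```

Under these assumptions the same query finds the triple in the graph as
written but not in the canonicalised copy, even though both literals
denote the same integer.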

3/ Matching literals

I was very surprised to see that the exact literal form of an RDF
literal is significant (Section 2.3.3). Imagine what would happen if an
SQL query depended on the exact literal form in which numbers were
entered into a database!

4/ Labeled blank nodes

What is a labeled blank node (Section 2.4)? Is it just a blank node, or
is it something else?

5/ Syntactic shorthand and other short forms

Are all the syntactic short forms simply sugar for their long forms, or
are there different relationships between 1 and "1"^^xsd:integer
(Section 4.1.2) and the short forms in Section 4.2? There are multiple
wordings for expressing the short forms, including "the same as"
(Section 4.1.2), "is equivalent to" (Section 4.1.4), and "syntactic
sugar" (Section 4.2.3).

6/ Status of Section 4

Section 4 (SPARQL Syntax) is not labeled as informative. However, it
does not exhaustively cover the grammar of SPARQL. For example,
NumericLiteral is defined but not used in Section 4.

7/ union

Why is union often written out (e.g., Section 1.2.14)?

8/ Basic Definition of SPARQL

The definition of Solution Sequence is inadequately grounded. A
Solution Sequence is defined as "a list of solutions, possibly
unordered" (Section 12.1.6). The common formal definitions of lists
depend on an ordering. If SPARQL is using some other definition, then
this other definition must be at least referenced.

The terminology used to refer to solutions is much too varied. It
includes at least sequences, lists, unordered collections, multisets,
and sets.

ToList "turns a multiset into a sequence, with the same elements and
cardinality" (Section 12.2.3). Aside from the question about
cardinality of what, this is not a functional mapping, as there are
many sequences that could correspond to a multiset (or set) if the
order of the sequence is ignored. The formal definition of ToList
implicitly mentions this non-functionality.
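
The non-functionality of ToList can be made concrete with a small
Python sketch. It models a multiset of solutions as a
collections.Counter and enumerates every sequence that has "the same
elements and cardinality". The function name to_list_candidates and the
placeholder solutions "mu1" and "mu2" are hypothetical; this is only an
illustration of the point above, not the definition in Section 12.2.3.

```python
from collections import Counter
from itertools import permutations


def to_list_candidates(multiset: Counter) -> set:
    """Return every distinct sequence with the same elements and multiplicities."""
    elements = list(multiset.elements())
    # permutations() repeats orderings when elements repeat; the set removes duplicates.
    return set(permutations(elements))


# A multiset in which solution "mu1" occurs twice and "mu2" once.
solutions = Counter({"mu1": 2, "mu2": 1})
candidates = to_list_candidates(solutions)

for seq in sorted(candidates):
    print(seq)
```

Each of the three printed sequences is an equally valid image of the
same multiset, so a definition of ToList has to say which one is meant
or show that the choice does not matter.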

Given that ToList is a fundamental part of the definition of SPARQL, it
requires a better definition. Further, there need to be proofs that the
choice in ToList does not make a difference anywhere in SPARQL, for
example, in further processing.

The definition of SPARQL BGP mapping importantly depends on the order
in which the RDF instance mapping and solution mapping are performed.
This should be documented.

The definition of BGP Matching is not specified in the document. The
definition in Section 12.3.1 defines a "solution" reasonably, although
presumably mu is *the* "restriction of P to the query variables in
BGP". However, the last bit of the definition doesn't make sense. What
is omega there? What is mu there? What is theta? What is mu(theta)?
Where then is the definition of the match of a BGP against an RDF
graph?

Section 12.5 does not provide the missing glue, as it just defers to
Section 12.3.1. Section 12.5 doesn't even get to a BGP and an RDF
graph.

What do the [ ] and { } notations mean in Section 12.4?

9/ A Fundamental Disagreement on SPARQL

I still object to the fact that SPARQL can produce different results
for equivalent RDF graphs, as described in Section 12.3.2.

Peter F. Patel-Schneider
Bell Labs Research

From: "Eric Prud'hommeaux" <eric@w3.org>
Subject: Re: comments on Section 1 and Section 2 of SPARQL Query Language for RDF
Date: Thu, 17 May 2007 17:43:34 -0700

> The Data Access Working Group is ready to bring SPARQL Query to
> Candidate Recommendation. The objections posted by Peter F.
> Patel-Schneider pertain to parts of the language that have changed
> since the last CR transition. We hope PFPS will agree to the language
> changes, withdraw his objection, and help us with editorial updates
> during the Candidate Recommendation phase.
>
> Dear Peter,
>
> It has been 15 months since your comments, and we have reorganized the
> document substantially, hopefully in ways that address your comments.
> (Please see section 12 to see the aggregated definitions and note that
> section 2 is now informative.) I have responded to many of your
> comments with "[gone]". Others are marked with "[definitions
> replaced]". These annotations are sprinkled throughout this reply with
> the goal of responding to each comment.
>
> I have drafted text to address your editorial comments and will
> propose it to the working group after the transition to CR. None of
> these changes affect the semantics of the query language as understood
> by the working group.
>
> There have been some changes to the entailment regime in the past
> year. Your technical comments (both numbered C2.39) should be
> addressed by the new semantics. If you wish to pursue either the
> editorial or technical comments, we should split out the thread as
> the distinction is important to the W3C publication process.

Received on Tuesday, 22 May 2007 18:08:06 UTC