- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Fri, 11 Nov 2005 15:06:24 +0000
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Response to:
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Nov/0012.html
which is
http://lists.w3.org/Archives/Public/public-swbp-wg/2005Oct/0107.html

Outstanding comments 2 and 3:
2 is about told bNodes
3 is about transitive properties

    Andy

-------- Original Message --------
Subject: SPARQL Comments (Personal)
Resent-Date: Sat, 05 Nov 2005 12:27:28 +0000
Resent-From: public-rdf-dawg-comments@w3.org
Date: Sat, 5 Nov 2005 07:27:19 -0500
From: David Wood <dwood@softwarememetics.com>
To: public-rdf-dawg-comments@w3.org
CC: public-swbp-wg@w3.org

Hi all,

I have made some comments on the SPARQL language at [1]. A brief
discussion thread ensued.

Please note that DanC has already drawn my attention ([2]) to the
isLiteral filter and the state of UNSAID.

These comments should be taken as PERSONAL and do NOT represent any
W3C working group.

[1] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Oct/0107.html
[2] http://lists.w3.org/Archives/Public/public-rdf-dawg/2005OctDec/0162.html

Regards,
Dave

-------- Original Message --------
> From:
> http://lists.w3.org/Archives/Public/public-swbp-wg/2005Oct/0107.html
>
> Hi all,
>
> I have an action item [1] to review and comment on the specification
> for the SPARQL Query Language for RDF [2].
>
> I reviewed the 21 July 2005 Working Draft.
>
> Regards,
> Dave
>
> [1] http://www.w3.org/2005/10/03-swbp-minutes.html#action17
> [2] http://www.w3.org/TR/rdf-sparql-query/
> -----------------------------%
> <--------------------------------------------
>
> Review of SPARQL Query Language for RDF Working Draft
>
> OVERVIEW:
>
>
> LANGUAGE ISSUES:
>
> 1) SPARQL would appear not to have a complete model theoretic base.
> Although sections of the specification are described using sets,
> nothing is presented which hangs all of the sections together. This is
> unfortunate and is, I think, the underlying cause of some of the language
> features critiqued below.
> (NB: I tried to get Simon Raboczi to
> complete and submit his model theory unifying iTQL and SPARQL, but was
> unsuccessful in doing so.)
>
> 2) The language is not built with extensibility in mind. That is, it is
> not, in my opinion, sufficiently functional. There are several areas
> of functionality which we already know are of interest to users of RDF
> query languages (e.g. iTQL's 'walk' and 'trans' functions which perform
> generic graph walking and transitive closure,
> respectively) and it is difficult to see how one might add these
> commands to a later SPARQL version without making wholesale changes to
> the language.

I understand this comment may be submitted formally by SWBPD.

>
> 3) The handling of blank nodes ("bnodes") is, again in my opinion, the
> single greatest failure of the specification. We have to admit that
> RDF graphs contain bnodes and queries will run across them. We also
> have to admit that a querier will (not 'may') often want to
> subsequently find information connected to those bnodes. SPARQL's
> insistence that bnodes' true internal identities not be returned to a
> querier (correct in and of itself) combined with the lack of subquery
> capability ensures that many useful RDF queries routinely performed in
> other languages simply cannot be written in SPARQL. OPTIONAL addresses
> only part of that functionality.

I understand this comment may be submitted formally by SWBPD.

See also part of
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Jun/0039.html
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Jul/0006.html
and:
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Jul/0010.html

> 4) SPARQL contains a large number of top-level commands. This could
> be a result again of the lack of subqueries and an underlying model
> theory. It is an unfortunate design choice.

The working group documented the use cases and requirements in
http://www.w3.org/TR/rdf-dawg-uc/

Assuming "top-level command" refers to the 4 query forms SELECT,
CONSTRUCT, DESCRIBE, ASK:

CONSTRUCT and DESCRIBE return RDF graphs (requirement 3.4)
http://www.w3.org/TR/rdf-dawg-uc/#r3.4

CONSTRUCT takes a template (c.f. projection variables).

DESCRIBE enables a client to inspect the graph returned, so the client
can be given information without a priori knowledge of its structure.

SELECT returns a result set (streamed), with limits
http://www.w3.org/TR/rdf-dawg-uc/#r3.2

ASK returns a single boolean value.
http://www.w3.org/TR/rdf-dawg-uc/#d4.9

> 5) Good language design would dictate that logical opposites (e.g.
> conjunction and disjunction) be represented in syntactically similar
> ways, even if one is generally implicit. Therefore, since UNION
> (disjunction) is present (as well it should be) along with conjunction,
> then conjunction should have an optional equivalent keyword even though
> it is implicit in most uses.

The working group decided that the conjunctive triple patterns should be
modelled on Turtle/N3 syntax to reuse a potentially familiar concept.
This does have "." as a syntactic element.

> 6) I was initially concerned about the definitions of subjects as
> including literals in triple patterns, until Andy Seaborne pointed out
> the differences between a triple and a triple matching pattern.
>
> 7) The nulls generated by UNION and the nulls generated by OPTIONAL
> may be distinct. They correspond to logical true and false,
> respectively. That makes life a bit difficult for implementors. It
> may be that another form of 'null' should be considered. Thanks to
> Simon Raboczi for this analysis.
>
> 8) There does not seem to be any way to force a literal into the
> variable position in a binding. That is very useful when attempting to
> create a result which must take a certain form (e.g.
> be a set of
> triples) and occasionally mandatory if the result set must be forced
> into triple form.

There is a FILTER builtin "isLiteral" that tests whether an RDF term is
a literal or not.

> INTEROPERABILITY ISSUES:
>
> 1) There is, to the best of my knowledge, no way for a SPARQL user to
> command the creation of an RDF container within a data store. The lack
> of an explicit command will encourage implementors to create their own,
> thereby hurting interoperability. Similarly for deletion requests.

The working group was chartered for data access, and graph update is
out of scope.
http://www.w3.org/2003/12/swa/dawg-charter#update

This is interpreted as covering graph creation as well. I hope that
another working group will be chartered to address this when
implementation and deployment experience on the web indicates ways of
doing that.

> 2) The form and content of DESCRIBE results are left to the data
> publisher. It would seem that such an open-ended conversation would
> require a human consumer in the general case. I am left wondering why
> the DESCRIBE functionality is not left to a more general SELECT query
> against a describing RDF container.

The result of DESCRIBE is an RDF graph, not a result set. The client
can inspect that graph (just as if it had read a whole graph with HTTP
GET) to determine the information provided.

> SCALABILITY CONCERNS:
>
> 1) The evaluation of regular expressions after the binding of graph
> patterns rules out a lot of potential join optimizations.

The document covers what a client can assume of a query processor
implementing SPARQL. It does not prescribe how that is implemented.
In particular, systems already reorder queries to insert filter
expressions (including regular expressions) at the best point and also
to push them into the graph matching process.

> 2) OPTIONAL, in its entirety. The concern is that querying very large
> data sets using OPTIONAL would result in very large intermediate
> results requiring joining.
> Subqueries effectively sidestep that
> problem by allowing further restrictions against a smaller result set.

I note that "subquery" might be referring to Kowari's use of
Relation-Valued Attributes [DateCJ ed 8, Introduction to Database
Systems p152-153,590], also described as nested tables.

The report [1] discusses some implementation possibilities, in
particular using left outer join to implement OPTIONAL.

[1] http://www.hpl.hp.com/techreports/2005/HPL-2005-170.html

As we proceed through CR, implementation experience will grow, but
existing prototypes are already showing efficient query (e.g. 3Store).

[If it was referring to SQL subqueries:
The equivalent of SQL subqueries in the form of intermediate tables is
present in SPARQL through the arbitrary composition of graph patterns
into larger graph patterns. SPARQL does not provide aggregate
operators akin to SQL's IN/ANY/SOME/ALL.]

> 3) UNSAID would have been of great concern with regards to
> scalability, just in case it comes back :)

OPTIONAL and BOUND can be used to achieve the effect in many cases.
The use of bound is not actually needed, as its functionality can also
be achieved in roundabout ways (e.g.
FILTER ( ( ?x = 3 ) || ! (?x = 3 ) ) ).

http://lists.w3.org/Archives/Public/public-rdf-dawg/2005OctDec/0162

> SPECIFICATION ISSUES:
>
> 1) It would be really nice if the grammar was ordered for easier
> reading, perhaps alphabetically.

The grammar is ordered roughly top-down, from the parser entry level
"query" production to tokens. All terms are hyperlinked to their
definition if they are not inline tokens.

    Andy
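
The left-outer-join reading of OPTIONAL, and the OPTIONAL plus BOUND
idiom for getting the effect of UNSAID, can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
function name optional_join, the variable names and the sample data
are hypothetical, solutions are modelled as plain Python dicts, and it
does not describe how any particular SPARQL engine is implemented.

```python
from typing import Dict, List, Optional, Sequence

# One query solution: variable name -> bound value, or None if unbound.
Row = Dict[str, Optional[str]]


def optional_join(left: List[Row], right: List[Row],
                  shared: Sequence[str],
                  right_only: Sequence[str]) -> List[Row]:
    """Left outer join (hypothetical helper): every left row survives."""
    result: List[Row] = []
    for lrow in left:
        matches = [r for r in right if all(r[v] == lrow[v] for v in shared)]
        if matches:
            result.extend({**lrow, **r} for r in matches)
        else:
            # No match: keep the row and leave the optional variables unbound.
            result.append({**lrow, **{v: None for v in right_only}})
    return result


# Hypothetical solutions for { ?x name ?name } and { ?x mbox ?mbox }.
names = [{"x": "a", "name": "Alice"}, {"x": "b", "name": "Bob"}]
mboxes = [{"x": "a", "mbox": "mailto:alice@example.org"}]

# { ?x name ?name } OPTIONAL { ?x mbox ?mbox }
joined = optional_join(names, mboxes, shared=["x"], right_only=["mbox"])

# FILTER ( !bound(?mbox) ): keep only rows where the optional part failed,
# which approximates "people for whom no mbox is stated".
no_mbox = [row["name"] for row in joined if row["mbox"] is None]
print(no_mbox)
```

The unmatched row is kept with ?mbox unbound rather than dropped, and
that unbound marker is what the BOUND test relies on.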
Received on Friday, 11 November 2005 15:06:46 UTC