- From: Enrico Franconi <franconi@inf.unibz.it>
- Date: Thu, 20 Oct 2005 10:04:17 +0200
- To: David Wood <dwood@softwarememetics.com>
- Cc: public-swbp-wg@w3.org
On 14 Oct 2005, at 20:31, David Wood wrote:

> 1) SPARQL would appear not to have a complete model theoretic
> base. Although sections of the specification are described using
> sets, nothing is presented which hangs all of the section
> together. This is unfortunate and is, I think, the underlying
> cause of some of language features critiqued below. (NB: I tried
> to get Simon Raboczi to complete and submit his model theory
> unifying iTQL and SPARQL, but was unsuccessful in doing so.)

The WG is working on that. We have a real MT for SPARQL.

> 2) The language is not built with extensibility in mind. That is,
> is not, in my opinion, sufficiently functional. There are several
> areas of functionality which we already know are of interest to
> users of RDF query languages (e.g. iTQL's 'walk' and 'trans'
> functions which perform generic graph walking and transitive
> closure, respectfully) and it is difficult to see how one might add
> these commands to a later SPARQL version without making wholesale
> changes to the language.

I believe that once SPARQL has a serious compositional MT, extensibility will come for free.

> 3) The handling of blank nodes ("bnodes") is, again in my opinion,
> the single greatest failure of the specification. We have to admit
> that RDF graphs contain bnodes and queries will run across them.
> We also have to admit that a querier will (not 'may') often want to
> subsequently find information connected to those bnodes. SPARQL's
> insistence that bnodes' true internal identities not be returned to
> a querier (correct in and of itself) combined with the lack of
> subquery capability ensures that many useful RDF queries routinely
> performed in other languages simply cannot be written in SPARQL.
> OPTIONAL addresses only part of that functionality.

Again, I believe that once SPARQL has a serious compositional MT, these issues will be solved. Btw, bnodes are the only new feature of SPARQL wrt standard query languages.

> 4) SPARQL contains a large number of top-level commands. This
> could be a result again of the lack of subqueries and an underlying
> model theory. It is an unfortunate design choice.

See above.

> 5) Good language design would dictate that logical opposites (e.g.
> conjunction and disjunction) be represented in syntactically
> similar ways, even if one is generally implicit. Therefore, since
> UNION (disjunction) is present (as well it should be) along with
> conjunction, then conjunction should have an optional equivalent
> keyword even though it is implicit in most uses.

The 'join' (i.e., the 'dot') is the conjunction, exactly like in SQL.

> 7) The nulls generated by UNION and the nulls generated by
> OPTIONAL may be distinct. They correspond to logical true and
> false, respectively. That makes life a bit difficult for
> implementors. It may be that another form of 'null' should be
> considered. Thanks to Simon Raboczi for this analysis.

This was pointed out in the WG as well some time ago. The new MT fixes that.

> 8) There does not seem to be any way to force a literal into the
> variable position in a binding. That is very useful when
> attempting to create a result which must take a certain form (e.g.
> be a set of triples) and occasionally mandatory if the result set
> must be forced into triple form.

I'm not sure I understand you correctly, but in the CONSTRUCT you can of course have literals.

> 2) The form and content of DESCRIBE results are left to the data
> publisher. It would seem that such an open-ended conversation
> would require a human consumer in the general case. I am left
> wondering why the DESCRIBE functionality is not left to a more
> general SELECT query against a describing RDF container.

I agree on that.

> SCALABILITY CONCERNS:
>
> 1) The evaluation of regular expressions after the binding of
> graph patterns rules out a lot of potential join optimizations.

The semantics does not say that you have to implement it *after*; it is just a definition. As long as they comply with the overall semantics, any optimisations are fine.

> 2) OPTIONAL, in its entirety. The concern is that querying very
> large data sets using OPTIONAL would result in very large
> intermediate results requiring joining. Subqueries effectively
> sidestep that problem by allowing further restrictions against a
> smaller result set.

Like any algebraic operator with a well-defined semantics, OPTIONAL can be implemented in different ways, leaving all the room you need for optimisations.

cheers
--e.

Enrico Franconi - franconi@inf.unibz.it
Free University of Bozen-Bolzano - http://www.inf.unibz.it/~franconi/
Faculty of Computer Science - Phone: (+39) 0471-016-120
I-39100 Bozen-Bolzano BZ, Italy - Fax: (+39) 0471-016-129
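
The two scalability replies above can be illustrated with a small sketch. It is not the WG's model theory: it assumes solutions are plain Python dictionaries from variable names to values, and the helpers `join`, `left_join` and `regex_filter`, as well as the sample data, are hypothetical names introduced only for this illustration. It shows that a regular-expression filter defined as applying "after" a join can be pushed below the join without changing the answers, and that OPTIONAL behaves like a left outer join whose implementation strategy is left open.

```python
import re

def compatible(a, b):
    # Two solutions are compatible if they agree on every shared variable.
    return all(a[k] == b[k] for k in a.keys() & b.keys())

def join(left, right):
    # Conjunction (the 'dot'): merge every compatible pair of solutions.
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def left_join(left, right):
    # OPTIONAL, simplified: keep each left solution, extended when possible.
    out = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(matches if matches else [a])
    return out

def regex_filter(solutions, var, pattern):
    # Keep solutions whose binding for `var` matches the pattern.
    return [s for s in solutions if var in s and re.search(pattern, s[var])]

# Hypothetical sample data: bindings for two triple patterns.
names = [{"p": "p1", "name": "Alice"}, {"p": "p2", "name": "Bob"}]
mails = [{"p": "p1", "mbox": "alice@example.org"}]

# Filter applied after the join, as in the declarative definition ...
filter_last = regex_filter(join(names, mails), "name", "^A")
# ... and pushed below the join, as an optimiser might do.
filter_first = join(regex_filter(names, "name", "^A"), mails)

# OPTIONAL keeps Bob even though he has no mailbox.
optional = left_join(names, mails)
```

Both evaluation orders give the same solutions here because the filter only mentions a variable bound by the left-hand pattern; a filter that refers to variables from both sides, or one placed inside an OPTIONAL, would need more care than this sketch takes.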
Received on Thursday, 20 October 2005 08:04:23 UTC