- From: Enrico Franconi <franconi@inf.unibz.it>
- Date: Thu, 20 Oct 2005 10:04:17 +0200
- To: David Wood <dwood@softwarememetics.com>
- Cc: public-swbp-wg@w3.org
On 14 Oct 2005, at 20:31, David Wood wrote:
> 1) SPARQL would appear not to have a complete model theoretic
> base. Although sections of the specification are described using
> sets, nothing is presented which hangs all of the sections
> together. This is unfortunate and is, I think, the underlying
> cause of some of the language features critiqued below. (NB: I tried
> to get Simon Raboczi to complete and submit his model theory
> unifying iTQL and SPARQL, but was unsuccessful in doing so.)
The WG is working on that. We have a real MT for SPARQL.
> 2) The language is not built with extensibility in mind. That is,
> it is not, in my opinion, sufficiently functional. There are several
> areas of functionality which we already know are of interest to
> users of RDF query languages (e.g. iTQL's 'walk' and 'trans'
> functions which perform generic graph walking and transitive
> closure, respectively) and it is difficult to see how one might add
> these commands to a later SPARQL version without making wholesale
> changes to the language.
I believe that once SPARQL has a serious compositional MT,
extensibility will come for free.
> 3) The handling of blank nodes ("bnodes") is, again in my opinion,
> the single greatest failure of the specification. We have to admit
> that RDF graphs contain bnodes and queries will run across them.
> We also have to admit that a querier will (not 'may') often want to
> subsequently find information connected to those bnodes. SPARQL's
> insistence that bnodes' true internal identities not be returned to
> a querier (correct in and of itself) combined with the lack of
> subquery capability ensures that many useful RDF queries routinely
> performed in other languages simply cannot be written in SPARQL.
> OPTIONAL addresses only part of that functionality.
Again, I believe that once SPARQL has a serious compositional
MT, these issues will be solved. Btw, bnodes are the only new feature
of SPARQL wrt standard query languages.
> 4) SPARQL contains a large number of top-level commands. This
> could be a result again of the lack of subqueries and an underlying
> model theory. It is an unfortunate design choice.
See above.
> 5) Good language design would dictate that logical opposites (e.g.
> conjunction and disjunction) be represented in syntactically
> similar ways, even if one is generally implicit. Therefore, since
> UNION (disjunction) is present (as well it should be) along with
> conjunction, then conjunction should have an optional equivalent
> keyword even though it is implicit in most uses.
The 'join' (i.e., the 'dot') is the conjunction, exactly like in SQL.
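
To make the point concrete, here is a minimal sketch of conjunction as a join over solution mappings, next to UNION as plain concatenation. It assumes a solution is a Python dict from variable name to value; the function names and the sample data are hypothetical and are not taken from the SPARQL drafts.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def join(left, right):
    """Conjunction (the 'dot'): merge every compatible pair of mappings."""
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


def union(left, right):
    """Disjunction (UNION): keep the solutions of either side."""
    return left + right


# Hypothetical solutions for two triple patterns sharing the variable x.
names = [{"x": "a", "name": "Alice"}, {"x": "b", "name": "Bob"}]
mboxes = [{"x": "a", "mbox": "mailto:alice@example.org"}]

joined = join(names, mboxes)
either = union(names, mboxes)
```

In this sketch the join keeps only the mapping for "a", which has both a name and a mailbox, while the union keeps all three input mappings.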
> 7) The nulls generated by UNION and the nulls generated by
> OPTIONAL may be distinct. They correspond to logical true and
> false, respectively. That makes life a bit difficult for
> implementors. It may be that another form of 'null' should be
> considered. Thanks to Simon Raboczi for this analysis.
This was pointed out in the WG as well some time ago. The new MT
fixes that.
> 8) There does not seem to be any way to force a literal into the
> variable position in a binding. That is very useful when
> attempting to create a result which must take a certain form (e.g.
> be a set of triples) and occasionally mandatory if the result set
> must be forced into triple form.
I'm not sure I understand you correctly, but you can of course have
literals in the CONSTRUCT.
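
As an illustration, the sketch below instantiates a CONSTRUCT-style template that carries a literal in the object position. It is a simplification under stated assumptions: terms are plain strings, variables start with "?", literals are written with their quotes, and the names construct, ex:status and ex:d1 are all hypothetical.

```python
def construct(template, solutions):
    """Instantiate each template triple once per solution.

    Variables (terms starting with '?') are replaced by their bindings;
    constants, including literals, are copied through unchanged.
    Triples with an unbound variable are skipped.
    """
    triples = set()
    for solution in solutions:
        for pattern in template:
            triple = tuple(
                solution.get(term[1:]) if term.startswith("?") else term
                for term in pattern
            )
            if None not in triple:
                triples.add(triple)
    return triples


# Roughly: CONSTRUCT { ?doc ex:status "reviewed" } WHERE { ... }
template = [("?doc", "ex:status", '"reviewed"')]
solutions = [{"doc": "ex:d1"}, {"doc": "ex:d2"}]

graph = construct(template, solutions)
```

Every resulting triple has the same literal as its object, so the result is forced into triple form regardless of what the WHERE part binds.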
> 2) The form and content of DESCRIBE results are left to the data
> publisher. It would seem that such an open-ended conversation
> would require a human consumer in the general case. I am left
> wondering why the DESCRIBE functionality is not left to a more
> general SELECT query against a describing RDF container.
I agree on that.
> SCALABILITY CONCERNS:
>
> 1) The evaluation of regular expressions after the binding of
> graph patterns rules out a lot of potential join optimizations.
The semantics does not say that you have to implement it *after*; it
is just a definition. As long as an implementation complies with the
overall semantics, any optimisation is fine.
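
A small sketch of why this holds: applying a regex filter after the join, as the definition reads, gives the same solutions as applying it to one operand before the join, provided the filtered variable is bound in that operand. The dict representation, the function names and the data are assumptions made for this example only.

```python
import re


def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def join(left, right):
    """Nested-loop join over lists of solution mappings."""
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


def regex_filter(solutions, var, pattern):
    """Keep the solutions whose binding for var matches the pattern."""
    return [m for m in solutions if re.search(pattern, m[var])]


# Hypothetical data: ?name is bound only by the left operand.
people = [{"x": "a", "name": "Alice"}, {"x": "b", "name": "Bob"},
          {"x": "c", "name": "Alan"}]
mboxes = [{"x": "a", "mbox": "mailto:alice@example.org"},
          {"x": "b", "mbox": "mailto:bob@example.org"}]

# As the definition reads: join first, then filter.
filter_late = regex_filter(join(people, mboxes), "name", r"^Al")
# An optimised plan: filter the left operand first, then join.
filter_early = join(regex_filter(people, "name", r"^Al"), mboxes)
```

Both plans return the same solutions here; the early filter simply hands a smaller input to the join.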
> 2) OPTIONAL, in its entirety. The concern is that querying very
> large data sets using OPTIONAL would result in very large
> intermediate results requiring joining. Subqueries effectively
> sidestep that problem by allowing further restrictions against a
> smaller result set.
Like any algebraic operator with a well-defined semantics, OPTIONAL
can be implemented in different ways, leaving all the room you need
for optimisations.
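
For instance, treating OPTIONAL as a left outer join, the sketch below evaluates it in two ways, a nested loop and a hash index, and both yield the same solutions. It assumes for simplicity that the two sides share exactly one variable that is always bound; the function names and the data are hypothetical.

```python
from collections import defaultdict


def optional_nested(left, right, var):
    """Left outer join by nested loop: scan all of right for each left row."""
    out = []
    for l in left:
        matches = [r for r in right if r[var] == l[var]]
        if matches:
            out.extend({**l, **r} for r in matches)
        else:
            out.append(dict(l))  # no match: keep the left row as it is
    return out


def optional_hashed(left, right, var):
    """Same operator, but right is indexed once on the shared variable."""
    index = defaultdict(list)
    for r in right:
        index[r[var]].append(r)
    out = []
    for l in left:
        matches = index.get(l[var], [])
        if matches:
            out.extend({**l, **r} for r in matches)
        else:
            out.append(dict(l))
    return out


people = [{"x": "a", "name": "Alice"}, {"x": "b", "name": "Bob"}]
mboxes = [{"x": "a", "mbox": "mailto:alice@example.org"}]

nested = optional_nested(people, mboxes, "x")
hashed = optional_hashed(people, mboxes, "x")
```

The two functions differ only in how matches are found, which is the kind of freedom an algebraic definition leaves to the implementor.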
cheers
--e.
Enrico Franconi - franconi@inf.unibz.it
Free University of Bozen-Bolzano - http://www.inf.unibz.it/~franconi/
Faculty of Computer Science - Phone: (+39) 0471-016-120
I-39100 Bozen-Bolzano BZ, Italy - Fax: (+39) 0471-016-129
Received on Thursday, 20 October 2005 08:04:23 UTC