- From: Ivan Herman <ivan@w3.org>
- Date: Sun, 24 Oct 2004 11:49:25 +0200
- To: public-rdf-dawg-comments@w3.org
- Cc: W3C Semantic Web Team List <sw-team@w3.org>, Daniel Krech <eikeon@eikeon.com>
- Message-ID: <417B7AA5.2090005@w3.org>
Dear all,
I had an RDQL implementation on top of RDFLib that included a number of features
similar to SPARQL[1], so I spent some time turning it into a SPARQL engine. It is not a
100% complete implementation of SPARQL: it does not include a parser for the query language,
only the Python engine, and it probably has bugs. Nevertheless, it is usable (meaning
that I use it, for example:-). The description, with further links to the code itself, is
in [2].
I ran into some problems during the implementation that are worth noting here.
(Although some of these problems are, undoubtedly, due to my misunderstanding of the
draft's intention...)
--------------
Constraining Values.
The draft refers to the possibility of using application methods/functions as constraints.
And that is good. However, the question arises: which bound variables does the function have
access to? In my implementation I separated a 'per-pattern' constraint and a 'global'
constraint. Per-pattern constraint functions are invoked for a single triple, and get the
three (bound) resource references of that triple as arguments. 'Global' constraint functions
are invoked at the end of the full pattern matching process, and they have access to all the
bindings (eg, in the case of Python, in the form of a dictionary of the form {"?x" :
TheBoundResourceForX,... }).
Clearly, the global constraint can, functionally, replace the per-pattern constraint, but
using per-pattern constraints may make the implementation much more efficient (essentially,
they may cut what I called the expansion tree early on in the process). It is also clear
that a query language parser can separate out per-pattern constraints for the kind of examples
that are in the draft (?a < 10, etc), so it may be enough if the underlying engine offers
this differentiation. But a parser cannot cover the general user method case. The issue is
whether, for this reason, we want to make that differentiation in SPARQL or not. In
any case, this question *must be specified* in the document, imho.
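
A minimal sketch of the distinction described above, assuming a toy engine in which a
graph is a list of (s, p, o) tuples and variables are strings starting with "?". The
names match, pattern_constraints and global_constraint are hypothetical and are not
taken from the implementation in [2]. The per-pattern function receives the three bound
terms of one triple and can prune the expansion early; the global function receives the
full binding dictionary only after all patterns have matched.

```python
def is_var(term):
    """A variable is a string such as "?x" (assumption of this sketch)."""
    return isinstance(term, str) and term.startswith("?")


def match(graph, patterns, pattern_constraints=None, global_constraint=None):
    """Return every binding (a dict) that satisfies all triple patterns.

    pattern_constraints: one callable (or None) per pattern, called with the
        three terms of a candidate triple.
    global_constraint: callable (or None) taking the complete binding dict.
    """
    pattern_constraints = pattern_constraints or [None] * len(patterns)
    results = []

    def expand(index, binding):
        if index == len(patterns):
            # Leaf of the expansion tree: only now is the global constraint run.
            if global_constraint is None or global_constraint(binding):
                results.append(dict(binding))
            return
        pattern, check = patterns[index], pattern_constraints[index]
        for triple in graph:
            new_binding = dict(binding)
            ok = True
            for term, value in zip(pattern, triple):
                if is_var(term):
                    # Bind the variable, or verify an earlier binding agrees.
                    if new_binding.setdefault(term, value) != value:
                        ok = False
                        break
                elif term != value:
                    ok = False
                    break
            if not ok:
                continue
            if check is not None and not check(*triple):
                continue  # per-pattern constraint prunes this branch early
            expand(index + 1, new_binding)

    expand(0, {})
    return results


graph = [
    ("a", "age", 5), ("b", "age", 20),
    ("a", "name", "Al"), ("b", "name", "Bo"),
]
patterns = [("?x", "age", "?a"), ("?x", "name", "?n")]

# Same condition (?a < 10) expressed both ways.
per_pattern = match(graph, patterns,
                    pattern_constraints=[lambda s, p, o: o < 10, None])
global_only = match(graph, patterns,
                    global_constraint=lambda b: b["?a"] < 10)
```

Both calls yield the same bindings; the per-pattern form simply avoids expanding the
second pattern for triples that already fail the test.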
---------------
Nested Patterns. It is not clear in the draft how 'deep' nesting can go. I implemented only
the simple case, ie, only a single level of nesting is handled:
(?a,?b,c), {(?q,?w,?r),(?s,t,?u)}
It is not clear whether a nesting of the kind:
(?a,?b,c), {(?q,?w,?r),{(?s,t,?u),(?q,k,?o)}}
is also allowed or not. Actually, if it is, it has to be defined what it really *means*.
(My initial thoughts are that it means an alternation of 'or'-s and 'and'-s, it means
(?a,?b,c) and (?q,?w,?r)
or
(?a,?b,c) and (?s,t,?u) and (?q,k,?o)
etc, recursively.) If this is adopted, it has to be described.
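
One possible reading of this alternating 'and'/'or' interpretation can be sketched as
follows. This is only an illustration under stated assumptions: triple patterns are
Python tuples, each brace group is a Python list, the top level is a conjunction, and
each level of nesting flips between 'or' and 'and'. The function name expand is
hypothetical.

```python
from itertools import product


def expand(node, mode="and"):
    """Flatten a nested pattern into a list of alternatives.

    Each alternative is a flat list of triple patterns that must all match.
    A tuple is a single triple pattern; a list is a group whose members are
    combined according to `mode`, and nested groups use the opposite mode.
    """
    if isinstance(node, tuple):
        return [[node]]
    child_mode = "or" if mode == "and" else "and"
    expanded = [expand(child, child_mode) for child in node]
    if mode == "or":
        # 'or': any alternative of any member is an alternative of the group.
        return [alt for child in expanded for alt in child]
    # 'and': pick one alternative from each member and concatenate them.
    return [sum(combo, []) for combo in product(*expanded)]


A = ("?a", "?b", "c")
Q = ("?q", "?w", "?r")
S = ("?s", "t", "?u")
K = ("?q", "k", "?o")

# (?a,?b,c), {(?q,?w,?r),{(?s,t,?u),(?q,k,?o)}}
alternatives = expand([A, [Q, [S, K]]])
```

Under these assumptions the nested example expands to two alternatives, [A, Q] and
[A, S, K], matching the two conjunctions listed above.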
-----------------
Optional Patterns. It was not clear to me *why* there might be more than one optional
pattern, whereas there is only *one* where pattern. Why the asymmetry?
I actually wonder whether it would not be better to define the combination of query results
in general (one could imagine the sum of two queries, being the concatenation of the result
lists) and let the individual queries have a simpler structure instead. Just a thought.
-----------------
Query patterns. Not surprisingly, this part is a bit vague (and that is *no* criticism of
the editors, it is just the natural status of things). My understanding of the CONSTRUCT *
is based on an old note of Guha & al[3] (thanks to DanBri who drew my attention to it). Is
this the right interpretation?
I found it a bit difficult to mentally connect the CONSTRUCT stuff with the rest of the
document. It stands a bit apart from the rest. My abstraction in Python was, instead,
that a query (select, where, optional, etc) in fact returns a query result object, and
then select, construct, etc, are just methods on that object. I wonder whether a similar
notion might not work better when describing the intentions.
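
The shape of such an abstraction might look like the sketch below. It is an assumption
for illustration only: the class QueryResult and its methods are hypothetical and are
not the API of the implementation in [2]. Bindings are dictionaries such as
{"?x": value}, and the __add__ method illustrates the "sum of two queries" idea
mentioned earlier as plain concatenation of result lists.

```python
class QueryResult:
    """Holds the bindings produced by pattern matching (hypothetical class)."""

    def __init__(self, bindings):
        self.bindings = list(bindings)

    def select(self, *variables):
        """Return one tuple per binding, with values in the requested order."""
        return [tuple(b.get(v) for v in variables) for b in self.bindings]

    def construct(self, template):
        """Instantiate each template triple once per binding.

        Template triples that still contain an unbound variable are skipped.
        """
        triples = []
        for b in self.bindings:
            for pattern in template:
                triple = tuple(b.get(term, term) for term in pattern)
                if not any(isinstance(t, str) and t.startswith("?") for t in triple):
                    triples.append(triple)
        return triples

    def __add__(self, other):
        """'Sum' of two query results: concatenation of the binding lists."""
        return QueryResult(self.bindings + other.bindings)


result = QueryResult([{"?x": "a", "?n": "Al"}, {"?x": "b", "?n": "Bo"}])
rows = result.select("?n", "?x")
graph_out = result.construct([("?x", "label", "?n")])
combined = result + QueryResult([{"?x": "c", "?n": "Cy"}])
```

With this shape, select and construct are simply two views of the same set of bindings.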
-----------------
Finally, the missing bits. Both in SPARQL[1] and in the requirements document[4] I was
desperately looking for a way to manage collections and containers. SPARQL does *not* give
me a way to ask whether '?x' is part of the collection 'C', or of the Seq 'S'. To
handle any of these cases one has either to introduce some form of non-finite query or
to define some special forms for them specifically. But none of this is documented. For
example, if I want to use SPARQL to query the RDF graph of a specific OWL ontology, and I
want to find out whether a specific class 'C' is part of the 'unionOf' describing the
class 'D', I hit this problem....
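
To illustrate why a fixed set of triple patterns is not enough here: an RDF collection
is a linked list of rdf:first/rdf:rest nodes of arbitrary length, so a membership test
has to walk it. The sketch below does this outside the query language, assuming a graph
given as a list of (s, p, o) tuples; the helper name collection_items and the sample
data are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"
UNION_OF = OWL + "unionOf"


def collection_items(graph, head):
    """Walk an rdf:first/rdf:rest list starting at `head` and return its items."""
    items, seen, node = [], set(), head
    while node != NIL and node not in seen:  # `seen` guards against cycles
        seen.add(node)
        items.extend(o for s, p, o in graph if s == node and p == FIRST)
        rest = [o for s, p, o in graph if s == node and p == REST]
        node = rest[0] if rest else NIL
    return items


# Class D described as the union of classes A and C.
graph = [
    ("D", UNION_OF, "_:l1"),
    ("_:l1", FIRST, "A"), ("_:l1", REST, "_:l2"),
    ("_:l2", FIRST, "C"), ("_:l2", REST, NIL),
]

heads = [o for s, p, o in graph if s == "D" and p == UNION_OF]
members = collection_items(graph, heads[0])
c_in_union = "C" in members
```

The loop has no fixed length, which is exactly what a finite list of triple patterns
cannot express.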
-----------------
I hope these remarks are helpful.
Ivan
P.S. Disclaimer: though I am part of the W3C Team, this SPARQL implementation has not been
done as part of a 'formal' W3C project, ie, it does not reflect some sort of a W3C
opinion! Rather, it is a spin-off of some other things I did using RDF.
[1]http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
[2]http://www.ivan-herman.net/Python/sparqlDesc.html
[3]http://www.w3.org/TandS/QL/QL98/pp/enabling.html
[4]http://www.w3.org/TR/2004/WD-rdf-dawg-uc-20041012/
--
Ivan Herman
W3C Communications Team, Head of Offices
C/o W3C Benelux Office at CWI, Kruislaan 413
1098SJ Amsterdam, The Netherlands
tel: +31-20-5924163; mobile: +31-641044153;
URL: http://www.w3.org/People/all?pictures=yes#ivan
Received on Monday, 25 October 2004 05:09:11 UTC