RE: Comments on SPARQL (based on an SPARQL Engine implementation in Python)

-------- Original Message --------
> From: Ivan Herman
> Date: 24 October 2004 10:49
> 
> Dear all,
> 
> I had an RDQL implementation on the top of RDFLib that included a
> number of features similar to SPARQL[1]. So I spent some time to turn
> it into a SPARQL engine. It is not a 100% complete implementation of
> SPARQL, it does not include a parser of the query language but only the
> Python engine, and probably has bugs. Nevertheless, it is usable
> (meaning that I use it, for example:-). The description, with further
> links to the code itself are in [2]. 
> 
> There are some problems I met during the implementation that are worth
> noting here. (Although some of these problems are, undoubtedly, due to
> my misunderstanding of the draft's intention...).
> 
> --------------
> 
> Constraining Values.
> 
> The draft refers to the possibility of using application
> methods/functions as constraints. And that is good. However, the
> question arises: which bound variables does the function have access to?
> In my implementation I separated a 'per-pattern' constraint and a 'global'
> constraint. Per-pattern constraint functions are invoked for one
> triplet, and get the three (bound) triplet resource references as
> arguments. 'Global' constraint functions are invoked at the end of the
> full pattern matching process, and they have access to all the bindings
> (e.g., in the case of Python, in the form of a dictionary of the form
> {"?x" : TheBoundResourceForX,... }).
> 
> Clearly, the global constraint can, functionally, replace the
> per-pattern constraint, but using per-pattern constraint may make the
> implementation way more efficient (essentially, it may cut what I
> called the expansion tree early on in the process). It is also clear
> that a query language parser can separate per-pattern constraints for
> the kind of examples that are in the draft (?a < 10, etc), so it may be
> enough if the underlying engine offers this differentiation. But a
> parser cannot cover the general user method case. The issue is whether
> we want to make that differentiation in SPARQL or not for this
> reason. In any case, this question *must be specified* in the document,
> imho.

This is an area where the working draft says very little and it is being
worked on at the moment.

Constraints, like triple patterns, are restrictions on the results that
match the query.  For any match, there is a set of bindings, and the
constraints evaluated on these bindings must be true.

A function will receive its variables as arguments (and the variables
may be unbound, due to an optional pattern, say - the function will have
to cope). The actual mechanism for doing that will be implementation
dependent - executing all the constraints after the triple pattern
matching would be correct but may be slower than executing a constraint
as soon as its variables can be bound, e.g., just after the point where
all variables used have been mentioned in patterns, as with your
per-pattern constraints.  As a query engine may choose to match the
triple patterns in any order, or in groups, there is lots of scope for
implementation optimization here.
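
To make the trade-off concrete, here is a minimal sketch in Python. It is
not taken from the draft or from any existing engine: the tuple-based
graph, the function names (is_var, match, solve) and the "expanded"
counter are all assumptions made for illustration. It runs the same
constraint either after all patterns have matched or as soon as its
variables are bound, and counts how many partial matches get expanded.

```python
def is_var(term):
    """Variables are written as strings starting with '?' (an assumption)."""
    return isinstance(term, str) and term.startswith("?")

def match(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def solve(graph, patterns, constraints, early=True):
    """constraints is a list of (set_of_variables, test_function) pairs.

    Returns (solutions, number_of_partial_matches_expanded).
    This sketch assumes every constraint variable occurs in some pattern.
    """
    counter = {"expanded": 0}

    def step(i, binding, pending):
        if early:
            # Run every constraint whose variables are all bound by now.
            ready = [c for c in pending if c[0].issubset(binding)]
            if not all(test(binding) for _, test in ready):
                return  # prune this branch before matching more patterns
            pending = [c for c in pending if c not in ready]
        if i == len(patterns):
            # Anything still pending is checked on the complete binding.
            if all(test(binding) for _, test in pending):
                yield binding
            return
        for triple in graph:
            extended = match(patterns[i], triple, binding)
            if extended is not None:
                counter["expanded"] += 1
                yield from step(i + 1, extended, pending)

    solutions = list(step(0, {}, list(constraints)))
    return solutions, counter["expanded"]

# Hypothetical data: find ?x younger than 10 together with whom ?x knows.
graph = [
    ("a", "age", 5), ("b", "age", 20), ("c", "age", 30),
    ("a", "knows", "b"), ("a", "knows", "c"), ("b", "knows", "a"),
]
patterns = [("?x", "age", "?n"), ("?x", "knows", "?y")]
constraints = [({"?n"}, lambda b: b["?n"] < 10)]

early_results, early_expanded = solve(graph, patterns, constraints, early=True)
late_results, late_expanded = solve(graph, patterns, constraints, early=False)
```

Both calls give the same solutions; the early variant simply abandons the
branches for "b" and "c" before trying the second pattern. A real engine
would also have to decide what a constraint sees when a variable is left
unbound by an optional pattern, which this sketch does not cover.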

> 
> ---------------
> 
> Nested Patterns. It is not clear in the draft how 'deep' nesting can
> go. I did only the simple one, ie, only a one level depth is managed:
> 
> (?a,?b,c), {(?q,?w?,?r),(?s,t,?u)}
> 
> It is not clear whether a nesting of the kind:
> 
> (?a,?b,c), {(?q,?w?,?r),{(?s,t,?u),(?q,k,?o)}}
> 
> is also allowed or not. Actually, if it is, it has to be defined what
> it really *means*.

In that example, it is all conjunction and the same as:

(?a,?b,c) (?q,?w?,?r) (?s,t,?u) (?q,k,?o)
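
As a rough illustration of that reading (nesting is pure conjunction, so
nested groups flatten away), here is a small Python sketch. Representing
a group as a Python list and a triple pattern as a tuple is an
assumption made only for this example, as is the function name flatten.

```python
def flatten(group):
    """Flatten arbitrarily nested pattern groups into one flat list."""
    flat = []
    for item in group:
        if isinstance(item, list):   # a nested group
            flat.extend(flatten(item))
        else:                        # a single triple pattern
            flat.append(item)
    return flat

nested = [
    ("?a", "?b", "c"),
    [("?q", "?w", "?r"), [("?s", "t", "?u"), ("?q", "k", "?o")]],
]
flat = flatten(nested)
```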


> (My initial thoughts are that it means an
> alternation of 'or'-s and 'and'-s, it means 
> 
> (?a,?b,c) and (?q,?w?,?r)
> or
> (?a,?b,c) and (?s,t,?u) and (?q,k,?o)
> 
> etc, recursively.) If this is adopted, it has to be described.

Noted.

> 
> -----------------
> 
> Optional Patterns. It was not clear to me *why* there might be more than
> one optional pattern, whereas there is only *one* where pattern. Why
> the asymmetry? 

There may be more than one optional because there may be independent
optional information:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox ?hpage
WHERE  ( ?x foaf:name  ?name )
       [ ( ?x foaf:mbox ?mbox ) ]
       [ ( ?x foaf:homepage ?hpage ) ]

Here, "mbox" and "hpage" get added to the results independently of each
other.
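
One way to picture this is the following Python sketch, in which each
optional pattern extends a solution when it matches and leaves it alone
when it does not. The tuple-based graph, the sample data and the function
names (matches, apply_optional) are assumptions for illustration, not
anything the draft prescribes.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def matches(graph, pattern, binding):
    """Yield every extension of binding that makes pattern match a triple."""
    for triple in graph:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if is_var(term):
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new

def apply_optional(graph, solutions, pattern):
    """Extend each solution if the pattern matches, otherwise keep it as is."""
    for binding in solutions:
        extended = list(matches(graph, pattern, binding))
        if extended:
            yield from extended
        else:
            yield binding

# Hypothetical data: everybody has a name, not everybody an mbox or homepage.
graph = [
    ("alice", "foaf:name", "Alice"),
    ("alice", "foaf:mbox", "mailto:alice@example.org"),
    ("bob", "foaf:name", "Bob"),
    ("bob", "foaf:homepage", "http://example.org/bob"),
    ("carol", "foaf:name", "Carol"),
]

solutions = matches(graph, ("?x", "foaf:name", "?name"), {})
solutions = apply_optional(graph, solutions, ("?x", "foaf:mbox", "?mbox"))
solutions = apply_optional(graph, solutions, ("?x", "foaf:homepage", "?hpage"))

# Unbound variables show up as None in the result rows.
rows = [(b.get("?name"), b.get("?mbox"), b.get("?hpage")) for b in solutions]
```

With this data, "Alice" ends up with an mbox only, "Bob" with a homepage
only and "Carol" with neither, which a single combined optional block
could not express.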

> 
> I actually wonder whether it is not better to define the combination of
> query results in general (one could imagine the sum of two queries,
> being the concatenation of the result lists) and let the individual
> queries have a simpler structure instead. Just a thought. 
> 
> -----------------
> 
> Query patterns. Not surprisingly, this part is a bit vague (and that is
> *no* critique on the editors, it is just the natural status of things).
> My understanding of the CONSTRUCT * is based on an old note of Guha &
> al[3] (thanks to DanBri who drew my attention to it). Is this the right
> interpretation? 

It is like that but not exactly - it wouldn't add extra information as
does one of the examples in the paper you reference.  It is the merge of
the query pattern with the variables substituted into the pattern.
DESCRIBE is there to allow extra server-decided information to be
returned.
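
A minimal Python sketch of that substitute-and-merge reading follows. The
function name construct, the tuple representation and the sample data are
assumptions for illustration; dropping triples that still contain an
unbound variable is likewise an assumption of this sketch, not something
the current draft states.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def construct(template, solutions):
    """Substitute each solution into the template and merge the results."""
    graph = set()  # a set merges duplicate triples automatically
    for binding in solutions:
        for pattern in template:
            triple = tuple(binding.get(term, term) for term in pattern)
            # Assumption: skip triples that still contain a variable.
            if not any(is_var(term) for term in triple):
                graph.add(triple)
    return graph

# Hypothetical template and solutions.
template = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]
solutions = [
    {"?x": "alice", "?name": "Alice", "?mbox": "mailto:alice@example.org"},
    {"?x": "bob", "?name": "Bob"},  # ?mbox left unbound
]
built = construct(template, solutions)
```

Nothing beyond the substituted pattern ends up in the result, which is
the difference from the paper's example that adds extra information.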

> 
> I found it a bit difficult to mentally bind the CONSTRUCT stuff with
> the rest of the document. It stands a bit separate from the rest. My
> abstraction in Python was, instead, that a query (select, where,
> optional, etc), returns in fact a query result object, and then select,
> construct, etc, are just methods on that Object. I wonder whether a
> similar notion may not work better when describing the intentions. 
> 
> -----------------
> 
> Finally, the missing bits. Both in SPARQL[1] and in the requirement
> document[4] I was desperately looking for a way to manage collections
> and containers. SPARQL does *not* give me a way to ask whether '?x' is
> part of the collection 'C', or of the Seq 'S'. For handling any of
> these cases one has either to introduce some form of non-finite query
> or make some special forms for these, specifically. But none of this is
> documented. For example, if I want to use SPARQL to query into the RDF
> graph of a specific OWL ontology, and I want to find out whether a
> specific class 'C' is part of the 'unionOf' describing the class 'D', I
> hit this problem....  

This is on the issues list (issue "accessingCollections").

> 
> -----------------
> 
> I hope these remarks are helpful

Yes, thanks,

	Andy

> 
> Ivan
> 
> P.S. Disclaimer: though I am part of the W3C Team, this SPARQL
> implementation has not been done as part of a 'formal' W3C project, ie,
> it does not reflect some sort of a W3C opinion! Rather, it is a
> spin-off of some other things I did using RDF. 
> 
> [1]http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
> [2]http://www.ivan-herman.net/Python/sparqlDesc.html
> [3]http://www.w3.org/TandS/QL/QL98/pp/enabling.html
> [4]http://www.w3.org/TR/2004/WD-rdf-dawg-uc-20041012/
> 
> 
> 
> 
> --
> 
> Ivan Herman
> W3C Communications Team, Head of Offices
> C/o W3C Benelux Office at CWI, Kruislaan 413
> 1098SJ Amsterdam, The Netherlands
> tel: +31-20-5924163; mobile: +31-641044153;
> URL: http://www.w3.org/People/all?pictures=yes#ivan

Received on Wednesday, 3 November 2004 18:10:51 UTC