RE: Coments on first working draft of SPARQL from Seaborne, Andy on 2004-10-25 (public-rdf-dawg-comments@w3.org from October 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 25 Oct 2004 13:44:59 +0100
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, <public-rdf-dawg-comments@w3.org>
Message-ID: <8D5B24B83C6A2E4B9E7EE5FA82627DC94D2B6B@sdcexcea01.emea.cpqcorp.net>
Peter,

Thank you very much for the comments:

Changes where mentioned are started in v1.121.  As a working draft
document, there will be quite a few changes to come.  Some of the
matters arising can't be completely finished until other documents are
ready.

	Andy

-------- Original Message --------
> From: Peter F. Patel-Schneider <>
> Date: 13 October 2004 17:09
> 
> I took a quick look at
> 
>     SPARQL RDF query language
>     http://www.w3.org/TR/rdf-sparql-query/
> 
> For a first working draft it is quite good.
> 
> Nevertheless, I have a number of things that I think need bringing up.
> 
> 
> First, a few nits:
> 
> - Are query variables disjoint from RDF Terms?  It looks as if they
>   should be.

Yes - they are distinct.  

[[
Definition: Query Variable

Let V be the set of all query variables.  V and RDF-T are disjoint.
]]

where RDF-T is URIrefs, bNodes and literals (we couldn't find a single
piece of terminology in "RDF concepts" to cover all things that go to
make up an RDF graph - we chose "RDF term" to cover this).
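
The disjointness of V and RDF-T can be pictured with a small sketch.
This is only an illustration: the class names (URIRef, BNode, Literal,
Variable) and the helper functions are assumptions made for this
example, not anything taken from the draft.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Variable:
    """A query variable; deliberately NOT one of the RDF term classes."""
    name: str


# RDF-T: everything that can appear in an RDF graph (hypothetical alias).
RDFTerm = Union[URIRef, BNode, Literal]


def is_rdf_term(x) -> bool:
    return isinstance(x, (URIRef, BNode, Literal))


def is_variable(x) -> bool:
    return isinstance(x, Variable)


x_var = Variable("x")
x_lit = Literal("x")
```

Because each kind of thing is its own type, nothing can be both a
variable and an RDF term, even when the name and the lexical form
happen to coincide.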

> 
> - The use of "bound" in the discussion of Optional Values is rather
>   jarring.  You probably don't mean bound, but instead mean something
>   like mentioned.

Agreed, thanks - the "bound" language is too procedural.  The use of
"mentioned" is better for variable occurrence in a query and "unset" when
referring to the variable (non-)occurrence in the results.

> 
> - Why is there an AND keyword?  Wouldn't it just be possible to
>   intersperse triple patterns and constraints?  If the AND keyword is
>   needed, how tightly does it bind to the rest of the query.  For
>   example, does a constraint in an OPTIONAL portion constrain all
>   results?

It is possible to intersperse triple patterns and constraints.  The AND
keyword may turn out to be unnecessary - it is conjunction.

Depending on the final syntax, it may be needed, or just make parsing
and comprehensibility easier, where there is a change from triple
patterns to a constraint in more mathematical style syntax.

> 
> - What happens if an OPTIONAL block has multiple matches?  I assume that
>   multiple bindings will result.

Yes - multiple pattern solutions will be produced.  I will add an
example that shows this for the next working draft.
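
As a rough illustration of the behaviour (not the example planned for
the draft), here is a sketch of an optional match producing several
solutions.  The triple representation, the "?name" convention for
variables, the function names and the data are all assumptions made
for this sketch.

```python
def match(pattern, triple, solution):
    """Extend `solution` so that `pattern` matches `triple`, or return None."""
    extended = dict(solution)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must agree with any value it already has.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def solve(graph, pattern, solutions):
    """Match a required pattern against every triple for every solution."""
    out = []
    for s in solutions:
        for triple in graph:
            extended = match(pattern, triple, s)
            if extended is not None:
                out.append(extended)
    return out


def solve_optional(graph, pattern, solutions):
    """Keep a solution unchanged if the optional pattern has no match."""
    out = []
    for s in solutions:
        extended = solve(graph, pattern, [s])
        out.extend(extended if extended else [s])
    return out


graph = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:mbox", "mailto:a1@example.org"),
    ("ex:alice", "ex:mbox", "mailto:a2@example.org"),
    ("ex:bob", "ex:name", "Bob"),
]

required = solve(graph, ("?p", "ex:name", "?n"), [{}])
results = solve_optional(graph, ("?p", "ex:mbox", "?m"), required)
```

Here ex:alice has two mailboxes, so she contributes two pattern
solutions; ex:bob has none, so he contributes one solution in which
?m is left unset.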

> 
> 
> Now for some more substantive issues:
> 
> 
> SPARQL allows bnodes in triple patterns and in constraints.  This leads
> to a number of thorny issues.
> 
> How are blank nodes handled in constraints?  For example, what does
> 	_:a < 30  (where _:a is a blank node)
> evaluate to?

The exact evaluation will depend on the constraint function but in this
example it would evaluate to an error and hence lead to the rejection of
potential solutions where a bNode is compared.
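
A minimal sketch of that behaviour, assuming a hypothetical less_than
constraint function that raises an error for blank nodes; the names
ConstraintError, less_than and apply_constraint are invented for this
illustration and are not taken from the draft.

```python
class BNode:
    """Stand-in for a blank node; it has no value to compare against."""

    def __init__(self, label):
        self.label = label


class ConstraintError(Exception):
    """Raised when a constraint cannot be evaluated for a given term."""


def less_than(value, limit):
    if isinstance(value, BNode):
        raise ConstraintError("cannot order a blank node against a number")
    return value < limit


def apply_constraint(solutions, var, limit):
    """Keep solutions where the constraint is true; an error rejects them."""
    kept = []
    for s in solutions:
        try:
            if less_than(s[var], limit):
                kept.append(s)
        except ConstraintError:
            pass  # evaluation error: the potential solution is rejected
    return kept


solutions = [{"?age": 25}, {"?age": 40}, {"?age": BNode("a")}]
kept = apply_constraint(solutions, "?age", 30)
```

The solution binding ?age to a blank node is dropped in the same way
as the one where the comparison is simply false.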

The working draft had very little in this area and the editors' version
has added some material, especially the use of (a subset of) the
XQuery/XPath functions and operators.

I have also 

> 
> How are blank nodes handled in triple patterns?  For example, does the
> triple pattern
> 	( ?x ex:r _:v )
> match the RDF graph
> 	ex:a ex:r _:a .
> 	ex:a ex:r _:b .

Your comments suggest that a section devoted to the details around
bNodes would be helpful.  This has been started in the editors' working
draft.

The query syntax does not allow bNodes in queries. bNodes cannot be put
in query requests and that needs to be explained somewhere.

> 
> In general, what is the status of blank nodes in SPARQL?  For example,
> which definition of subgraph does SPARQL use - the standard one from graph
> theory or the expansive one used in RDF semantics in the presence of bnode
> relabelling?
> 
> Even if bnodes do not appear in a query, how are multiple matches that
> differ only with respect to bnodes handled?
> 
> 
> These issues are all a consequence of the following issue:
> 
> SPARQL appears to depend on an unsanctioned extension of RDF, namely that
> bnodes in an RDF graph have identity that can be taken out of the graph and
> transmitted elsewhere.  Is this the case?  If so, how is this extension
> going to work?  If not, how can bnodes be handled reasonably in SPARQL?

In the case where results are serialized, in XML or RDF/XML forms, there
are merely labels (cf. bNode IDs in RDF/XML) that are document
scoped.  The working group is currently actively designing the result
serialization formats.  The labels only enable one bNode in a serialized
result form to be distinguished from another in the same serialized
result.  They cannot be used to get back to the original bNode in the
graph - that would have to be done by reusing a graph pattern that found
it.
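
A sketch of document-scoped labelling, purely for illustration: the
result formats are still being designed, so the serialize function,
the "_:bN" label scheme and the row layout below are assumptions, not
the working group's format.

```python
class BNode:
    """A blank node in the graph; its identity is the object itself."""


def serialize(rows):
    """Render result rows, giving each bNode a label local to this document."""
    labels = {}  # fresh for every serialized result, so labels never carry over
    out = []
    for row in rows:
        rendered = {}
        for var, value in row.items():
            if isinstance(value, BNode):
                # Same bNode gets the same label within this document only.
                value = labels.setdefault(value, f"_:b{len(labels)}")
            rendered[var] = value
        out.append(rendered)
    return out


a, b = BNode(), BNode()
doc1 = serialize([{"?x": a, "?y": b}, {"?x": a}])
doc2 = serialize([{"?x": b}])
```

Within doc1 the two nodes can be told apart and the repeated node
keeps its label, but the node b is labelled "_:b1" in doc1 and "_:b0"
in doc2, so a label on its own says nothing about which node in the
graph it came from.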

In the local case (no serialization of results, directly working with
the graph), the query processor can be working directly with the graph
and can return programming language objects that the graph
implementation uses for bNodes.  bNodes do not leave the graph; the
programming system has whatever mechanisms it uses to pass references
around just like literals and URIs in RDF APIs.

	Andy

> 
> 
> 
> Peter F. Patel-Schneider
> Bell Labs Research
Received on Monday, 25 October 2004 12:46:03 UTC