- From: Holger Knublauch <holger@topquadrant.com>
- Date: Mon, 13 Aug 2012 09:57:30 +1000
- To: public-rdf-dawg-comments@w3.org
Dear WG,
as someone who has supported the inclusion of BIND into SPARQL 1.1,
please allow me to provide some feedback. Overall we (TopQuadrant, Inc)
are happy that BIND has been added, and we and our customers use it a lot.
However, I believe the semantics of BIND need some tweaking, because the
current design is unnecessarily restrictive and counterintuitive. In
terms of the textual syntax, the main problem is that variables bound in
preceding { ... } blocks are not visible in BIND statements, e.g.
GRAPH <...> {
?x rdfs:label ?label .
}
BIND (my:function(?label) AS ?str) .
does not work as expected, because ?label is not bound to the value from
the inner graph block. The example above is artificial and only
illustrates the syntactic issue, but we have many practical use cases
where this is a real-world problem. Here are some simplified examples:
1) Redundant. It becomes hard to reuse BIND sequences:
{
?x rdfs:label ?label .
}
UNION
{
?x skos:prefLabel ?label .
}
BIND (my:stringOperation1(?label) AS ?str) .
BIND (my:stringOperation2(?str) AS ?str2) .
is currently invalid and would need to be changed to
{
?x rdfs:label ?label .
BIND (my:stringOperation1(?label) AS ?str) .
BIND (my:stringOperation2(?str) AS ?str2) .
}
UNION
{
?x skos:prefLabel ?label .
BIND (my:stringOperation1(?label) AS ?str) .
BIND (my:stringOperation2(?str) AS ?str2) .
}
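As a rough illustration of why the duplicated form is redundant, the following Python sketch (not SPARQL, and not taken from any engine) models solutions as dictionaries. The helpers union, extend and bind_both, the sample data, and the bodies of string_operation1 and string_operation2 are all assumptions made up for this example. It shows that applying the two BINDs once to the result of the UNION gives the same solutions as repeating them inside each branch.

```python
from typing import Callable, Dict, List

Solution = Dict[str, str]


def union(left: List[Solution], right: List[Solution]) -> List[Solution]:
    """Model of UNION: concatenate the solutions of both branches."""
    return left + right


def extend(solutions: List[Solution], var: str,
           expr: Callable[[Solution], str]) -> List[Solution]:
    """Model of BIND (expr AS ?var): add one variable to every solution."""
    return [{**s, var: expr(s)} for s in solutions]


# Hypothetical stand-ins for my:stringOperation1 and my:stringOperation2.
def string_operation1(s: Solution) -> str:
    return s["label"].strip()


def string_operation2(s: Solution) -> str:
    return s["str"].upper()


def bind_both(solutions: List[Solution]) -> List[Solution]:
    """The BIND sequence from the example, applied in order."""
    step1 = extend(solutions, "str", string_operation1)
    return extend(step1, "str2", string_operation2)


# Invented sample data for the two branches of the UNION.
rdfs_labels = [{"x": "ex:a", "label": " Alpha "}]
skos_labels = [{"x": "ex:b", "label": "beta"}]

# BINDs written once, after the UNION (the form being asked for).
shared = bind_both(union(rdfs_labels, skos_labels))

# BINDs repeated inside each branch (the form currently required).
duplicated = union(bind_both(rdfs_labels), bind_both(skos_labels))

print(shared == duplicated)
```

In this simplified model the two forms are interchangeable, so the duplication only adds maintenance cost.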
2) Inefficient. We very often need to perform sequences of BINDs
intermixed with FILTERs, e.g.
BIND (ex:firstStep(?x) AS ?a) .
FILTER bound(?a) .
BIND (ex:secondStep(?a) AS ?b) .
FILTER (?b > 10) .
BIND (ex:thirdStep(?b) AS ?c) .
The problem with the above is that SPARQL engines may (or even should)
move the FILTERs to the end, producing effectively
BIND (ex:firstStep(?x) AS ?a) .
BIND (ex:secondStep(?a) AS ?b) .
BIND (ex:thirdStep(?b) AS ?c) .
FILTER bound(?a) .
FILTER (?b > 10) .
However, this is not desirable because the BIND expressions may be
expensive to evaluate, and we certainly don't want them to execute
unnecessarily, or even with very unexpected input values. So our
workaround used to be to group FILTERs and BINDs tightly together,
so that execution stops as early as possible, e.g.
{
{
BIND (ex:firstStep(?x) AS ?a) .
FILTER bound(?a) .
}
BIND (ex:secondStep(?a) AS ?b) .
FILTER (?b > 10) .
}
BIND (ex:thirdStep(?b) AS ?c) .
The above pattern unfortunately doesn't work with the current SPARQL 1.1
spec.
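To make the cost argument concrete, here is a small Python sketch (again not SPARQL, and not how any particular engine is implemented) that counts how often each step runs when the filters are interleaved with the BINDs versus moved to the end. The functions first_step, second_step and third_step stand in for ex:firstStep, ex:secondStep and ex:thirdStep; their bodies and the input values are invented for illustration, and None models an unbound variable.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def make_steps():
    """Create three counted step functions (hypothetical behaviour)."""
    counts: Dict[str, int] = {"first": 0, "second": 0, "third": 0}

    def first_step(x: int) -> Optional[int]:
        counts["first"] += 1
        return x if x % 2 == 0 else None  # unbound for odd input

    def second_step(a: Optional[int]) -> Optional[int]:
        counts["second"] += 1
        return None if a is None else a * 3

    def third_step(b: Optional[int]) -> Optional[int]:
        counts["third"] += 1
        return None if b is None else b + 1

    return counts, first_step, second_step, third_step


def interleaved(xs: Iterable[int]) -> Tuple[List[int], Dict[str, int]]:
    """Each FILTER runs right after its BIND, so later steps are skipped."""
    counts, f1, f2, f3 = make_steps()
    out: List[int] = []
    for x in xs:
        a = f1(x)
        if a is None:          # FILTER bound(?a)
            continue
        b = f2(a)
        if not b > 10:         # FILTER (?b > 10)
            continue
        out.append(f3(b))
    return out, counts


def deferred(xs: Iterable[int]) -> Tuple[List[int], Dict[str, int]]:
    """All BINDs run first; both FILTERs are applied at the end."""
    counts, f1, f2, f3 = make_steps()
    out: List[int] = []
    for x in xs:
        a = f1(x)
        b = f2(a)
        c = f3(b)
        if a is not None and b is not None and b > 10:
            out.append(c)
    return out, counts


early_result, early_counts = interleaved(range(1, 9))
late_result, late_counts = deferred(range(1, 9))
print(early_result == late_result, early_counts, late_counts)
```

Both orderings return the same solutions in this toy model, but the deferred version calls every step for every input, including inputs where an earlier variable is unbound.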
3) Non-intuitive and inconsistent. I do like the mantra that SPARQL is
executed from the inside out, so that variables bound in inner blocks
can generally be used in surrounding blocks. This is how BGPs, FILTERs
etc. work. So why does BIND not follow the same principle? This is hard
to explain to end users. I certainly don't understand the reasons for
this inconsistency, and I don't think I am a SPARQL beginner.
Sorry to raise this problem so late in the process, but we only became
aware of the issue after a very recent bugfix in the SPARQL API that we
are using. The "bug" that was there before was masking the behavior and
was working fine for us. In fact, we successfully used the syntactic
patterns from above for many years, for as long as the "bug" was
present in the API.
Regards,
Holger
Received on Sunday, 12 August 2012 23:58:12 UTC