- From: Rob Shearer <Rob.Shearer@networkinference.com>
- Date: Tue, 11 May 2004 10:33:01 -0700
- To: "RDF Data Access Working Group" <public-rdf-dawg@w3.org>
- Cc: "Pat Hayes" <phayes@ihmc.us>
There has been some discussion of so-called "may-bind" variables, and this discussion has been distilled into candidate requirement 3.6. My understanding all along was that it was Pat Hayes' implementation experience which was driving this, but it became clear to me during today's teleconference that what he has implemented and what this requirement addresses are two very different things.
Several logic-based query languages have the notion of "distinguished" and "undistinguished" variables. Distinguished variables are those for which a binding to a concrete name in the underlying data set must be established, while undistinguished variables don't necessarily need a concrete binding. Undistinguished variables, however, still must be known to exist: their relationships with the distinguished variables are still predicates which need to be fulfilled. My understanding of the DQL implementation is that "may-bind" variables simply allow the system to return a concrete binding for such an undistinguished variable if it happens to come across one.
In plain RDF the details are quite straightforward: "undistinguished" variables may be matched against bnodes or regular nodes, while "distinguished" variables may only match against regular nodes. A "may-bind" variable would be a variable which can match against either bnodes or regular nodes, but which returns a binding in the result set only if the match is against a regular node. Note that in this case the existence of bnodes very much affects the results of a query.
The way requirement 3.6 is written suggests that one may encode parts of the query which may be completely ignored if necessary--they don't impose any constraints at all. I read that requirement as the ability to say "get me all the people, and their email addresses if they have one, but it's all right if they don't have one".
That is quite contrary to my understanding of how Pat's implementation works, and I'm curious whether there is any implementation experience for such a feature. The feature effectively amounts to saying "P AND (Q OR true)", with the condition that you're not allowed to optimize Q away despite it being logically irrelevant. The implications of this feature for both the query language and implementations of that language are unclear and a bit worrying.
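To make the contrast concrete, here is a rough Python sketch of the two readings. It is not taken from DQL or from any working group draft. The triple representation, the "_:" prefix for bnodes, the sample data, and every function name are assumptions made purely for illustration.

```python
# Hypothetical toy data: terms are strings, bnodes start with "_:".
DATA = [
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("carol", "type", "Person"),
    ("alice", "mbox", "mailto:alice@example.org"),
    ("bob", "mbox", "_:b1"),  # bob has a mailbox, but it is only a bnode
]

def is_bnode(term):
    return term.startswith("_:")

def match(patterns, triples, kinds, binding=None):
    """Yield every binding that satisfies all triple patterns (a conjunction)."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    head, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for p, t in zip(head, triple):
            if p.startswith("?"):
                # Distinguished variables may only match regular nodes.
                if kinds[p] == "distinguished" and is_bnode(t):
                    ok = False
                    break
                # A variable must take the same value everywhere it occurs.
                if new.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            yield from match(rest, triples, kinds, new)

def project(binding, kinds):
    """Keep distinguished variables, plus may-bind ones bound to regular nodes."""
    return tuple(sorted(
        (v, t) for v, t in binding.items()
        if kinds[v] == "distinguished"
        or (kinds[v] == "may-bind" and not is_bnode(t))
    ))

def conjunctive(patterns, triples, kinds):
    """May-bind as I understand it: every pattern must still be satisfied."""
    return {project(b, kinds) for b in match(patterns, triples, kinds)}

def optional_reading(required, optional, triples, kinds):
    """Requirement 3.6 as I read it: P AND (Q OR true)."""
    rows = set()
    for b in match(required, triples, kinds):
        extended = list(match(optional, triples, kinds, b))
        for e in extended or [b]:  # keep the row even when Q fails
            rows.add(project(e, kinds))
    return rows

P = [("?p", "type", "Person")]
Q = [("?p", "mbox", "?m")]

def people(rows):
    return {dict(r)["?p"] for r in rows}

def kinds_for(m_kind):
    return {"?p": "distinguished", "?m": m_kind}

distinguished = conjunctive(P + Q, DATA, kinds_for("distinguished"))
undistinguished = conjunctive(P + Q, DATA, kinds_for("undistinguished"))
may_bind = conjunctive(P + Q, DATA, kinds_for("may-bind"))
optional = optional_reading(P, Q, DATA, kinds_for("may-bind"))
```

In this toy data set, a distinguished ?m yields only alice, an undistinguished or may-bind ?m yields alice and bob (with alice's address returned only in the may-bind case), and carol appears only under the optional reading, because there Q imposes no constraint at all.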
Received on Tuesday, 11 May 2004 13:34:48 UTC