Fwd: Blank Node Ordering

Hi all,

FYI.  This is a real-world use case worth considering as we discuss blank nodes.  Although it is mostly a SPARQL issue, I felt this group should be aware of the discussion.

Regards,
Dave




Begin forwarded message:

> From: James Leigh <james@3roundstones.com>
> Subject: Blank Node Ordering
> Date: October 27, 2011 10:05:30 EDT
> To: public-rdf-dawg-comments@w3.org
> Cc: David Wood <david@3roundstones.com>
> 
> Hello,
> 
> We recently ran into some unexpected behaviour that we want to bring to
> this group's attention regarding the ORDER BY clause.
> 
> When ordering RDF literals and URIs, the same literal or the same URI
> will always be arranged together. However, there is no guarantee with
> blank nodes that the same blank nodes will be arranged together.
> 
> The following SPARQL query lists all the vcard addresses in the default
> graph along with their properties. A single address is represented in
> multiple result bindings, one for each property in the data store.
> 
> SELECT ?vcard ?adr ?pred ?obj {
>  ?vcard a vcard:VCard; vcard:adr ?adr .
>  ?adr ?pred ?obj .
> } ORDER BY ?vcard ?adr ?pred
> 
> The (author's) expected result is to have all result bindings ordered
> first by the vcard they belong to and, if there are multiple addresses
> on the vcard, to have the properties of each address ordered together.
> 
> For example, the following binding sets are a valid result set. Notice that
> the entire home address comes before any of the work address properties.
> This order is predictable because of the ORDER BY clause in the query
> above.
> 
> vcard=<me>, adr=<me#home>, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=<me#home>, pred=vcard:locality, obj="WonderCity"
> vcard=<me>, adr=<me#home>, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=<me#home>, pred=vcard:street-address, obj="111 Lake Drive"
> vcard=<me>, adr=<me#work>, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=<me#work>, pred=vcard:locality, obj="WonderCity"
> vcard=<me>, adr=<me#work>, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=<me#work>, pred=vcard:street-address, obj="33 Enterprise Drive"
> 
> However, it would be incorrect (in SPARQL 1.0 and SPARQL 1.1 draft) for
> the author to assume the addresses will always be ordered together like
> this.
> 
> Consider the result set if blank nodes were used for the address node.
> The result might look like the one below.
> 
> vcard=<me>, adr=_:b1, pred=vcard:locality, obj="WonderCity"
> vcard=<me>, adr=_:b1, pred=vcard:street-address, obj="111 Lake Drive"
> vcard=<me>, adr=_:b2, pred=vcard:street-address, obj="33 Enterprise Drive"
> vcard=<me>, adr=_:b2, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=_:b1, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=_:b2, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=_:b1, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=_:b2, pred=vcard:locality, obj="WonderCity"
> 
> Although the results for each vcard are ordered together, because the
> vcard is a URI, the ordering of the adr blank nodes looks random and is
> unpredictable. Sesame 2.x appears to arrange blank node results
> arbitrarily when ordering by blank nodes, as shown above. When the data
> contains blank nodes, there is no way to control the ordering.
> 
> The author would expect that _:b1 is ordered before or after _:b2, but
> the author would not expect the _:b1 results to be mixed among the _:b2
> results. Although there is no defined order between _:b1 and _:b2,
> SPARQL should provide guidance on how to arrange blank nodes.
> 
> Many people still use blank nodes and this issue causes unexpected
> results for SPARQL users.
> 
> My colleagues and I propose that the group seriously consider adding a
> restriction to ORDER BY in SPARQL 1.1 that guarantees that, when
> ordering by any RDF term, identical terms are arranged together.
> 
> Although an order among different blank nodes cannot be fixed, SPARQL
> should require that identical RDF terms be ordered together.
> 
> Thanks,
> James
> 
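
The interleaving described in the message can be reproduced with a small
Python sketch. This is only a toy model and not Sesame's implementation:
it assumes an engine that sorts blank nodes before other terms but treats
any two blank nodes as ties, so the next ORDER BY key (?pred) decides
their relative position. The names rows, is_bnode, compare_terms,
order_by and the group_bnodes flag are hypothetical, and terms are
modelled as plain strings. With group_bnodes=True, an arbitrary but
consistent tie-break on the blank node label is applied, which is one
possible way to satisfy the proposed restriction.

```python
from functools import cmp_to_key

# Each binding is (vcard, adr, pred); blank nodes carry a "_:" prefix.
rows = [
    ("<me>", "_:b1", "vcard:locality"),
    ("<me>", "_:b2", "vcard:street-address"),
    ("<me>", "_:b1", "vcard:street-address"),
    ("<me>", "_:b2", "vcard:country-name"),
    ("<me>", "_:b1", "vcard:country-name"),
    ("<me>", "_:b2", "vcard:locality"),
]

def is_bnode(term):
    return term.startswith("_:")

def compare_terms(a, b, group_bnodes):
    """Compare two terms; blank nodes sort before all other terms."""
    if is_bnode(a) and is_bnode(b):
        if not group_bnodes:
            # No order is defined between blank nodes: treat them as ties.
            return 0
        # Assumed tie-break: fall through and compare the labels.
    elif is_bnode(a) != is_bnode(b):
        return -1 if is_bnode(a) else 1
    return (a > b) - (a < b)

def order_by(rows, group_bnodes):
    """Sort bindings key by key, like ORDER BY ?vcard ?adr ?pred."""
    def cmp(r1, r2):
        for a, b in zip(r1, r2):
            c = compare_terms(a, b, group_bnodes)
            if c:
                return c
        return 0
    return sorted(rows, key=cmp_to_key(cmp))

for row in order_by(rows, group_bnodes=False):
    print("ties:   ", row)
for row in order_by(rows, group_bnodes=True):
    print("grouped:", row)
```

In the first listing the ?adr column alternates between _:b1 and _:b2
because only ?pred separates the rows; in the second, all _:b1 rows come
before all _:b2 rows, with ?pred ordered inside each group.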

Received on Thursday, 27 October 2011 21:40:01 UTC