- From: David Wood <david@3roundstones.com>
- Date: Thu, 27 Oct 2011 17:39:29 -0400
- To: public-rdf-wg WG <public-rdf-wg@w3.org>
- Cc: James Leigh <james@3roundstones.com>
- Message-Id: <B860CD44-93EF-44E5-8284-A55EB7376887@3roundstones.com>
Hi all,

FYI. This is a real-world use case worth considering as we discuss blank nodes. Although it is mostly a SPARQL issue, I felt this group should be aware of the discussion.

Regards,
Dave

Begin forwarded message:

> From: James Leigh <james@3roundstones.com>
> Subject: Blank Node Ordering
> Date: October 27, 2011 10:05:30 EDT
> To: public-rdf-dawg-comments@w3.org
> Cc: David Wood <david@3roundstones.com>
>
> Hello,
>
> We recently ran into some unexpected behaviour that we want to bring to
> this group's attention regarding the ORDER BY clause.
>
> When ordering RDF literals and URIs, the same literal or the same URI
> will always be arranged together. However, there is no guarantee with
> blank nodes that the same blank nodes will be arranged together.
>
> The following SPARQL query lists all the vcard addresses in the default
> graph along with their properties. A single address is represented in
> multiple result bindings, one for each property in the data store.
>
> SELECT ?vcard ?adr ?pred ?obj {
>   ?vcard a vcard:VCard; vcard:adr ?adr .
>   ?adr ?pred ?obj .
> } ORDER BY ?vcard ?adr ?pred
>
> The (author's) expected result is to have all result bindings ordered
> first by the vcard they belong to and, if there are multiple addresses on
> the vcard, to have the properties of each address ordered together.
>
> For example, the following binding sets are a valid result set. Notice that
> the entire home address comes before any of the work address properties.
> This order is predictable because of the ORDER BY clause in the query
> above.
>
> vcard=<me>, adr=<me#home>, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=<me#home>, pred=vcard:locality, obj="WonderCity"
> vcard=<me>, adr=<me#home>, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=<me#home>, pred=vcard:street-address, obj="111 Lake Drive"
> vcard=<me>, adr=<me#work>, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=<me#work>, pred=vcard:locality, obj="WonderCity"
> vcard=<me>, adr=<me#work>, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=<me#work>, pred=vcard:street-address, obj="33 Enterprise Drive"
>
> However, it would be incorrect (in SPARQL 1.0 and the SPARQL 1.1 draft) for
> the author to assume the addresses will always be ordered together like
> this.
>
> Consider the result set if blank nodes were used for the address nodes.
> The result might look like the one below.
>
> vcard=<me>, adr=_:b1, pred=vcard:locality, obj="WonderCity"
> vcard=<me>, adr=_:b1, pred=vcard:street-address, obj="111 Lake Drive"
> vcard=<me>, adr=_:b2, pred=vcard:street-address, obj="33 Enterprise Drive"
> vcard=<me>, adr=_:b2, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=_:b1, pred=vcard:country-name, obj="Australia"
> vcard=<me>, adr=_:b2, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=_:b1, pred=vcard:postal-code, obj="5555"
> vcard=<me>, adr=_:b2, pred=vcard:locality, obj="WonderCity"
>
> Although the results for each vcard are ordered together, because it is a
> URI, the ordering of the adr blank nodes looks random and is
> unpredictable. Sesame 2.x is implemented so that it appears to arrange
> blank node results randomly when ordering by blank nodes, as shown above.
> When the data contains blank nodes, there is no way to control the ordering.
>
> The author would expect that _:b1 is ordered before or after _:b2, but
> the author would not expect that _:b1 is mixed among _:b2. Although
> there is no order between _:b1 and _:b2, SPARQL should provide guidance
> on how to arrange blank nodes.
>
> Many people still use blank nodes and this issue causes unexpected
> results for SPARQL users.
>
> My colleagues and I propose that the group seriously consider adding a
> restriction to ORDER BY in SPARQL 1.1 that will ensure that ordering by any
> RDF term guarantees that identical terms are arranged together.
>
> Although an order among different blank nodes cannot be fixed,
> SPARQL should require that identical RDF terms are ordered together.
>
> Thanks,
> James
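
The grouping guarantee proposed in the forwarded message can be sketched outside of any SPARQL engine. The Python below is only an illustration under stated assumptions, not how Sesame or any other store implements ORDER BY: `Term`, `term_key`, `order_rows`, and `is_grouped` are hypothetical names, and each blank node is assumed to carry a label that is stable within one result set. Blank nodes are ranked below IRIs and IRIs below literals, as in the SPARQL ORDER BY rules; the label is used only as a tie-breaker so that identical blank nodes end up adjacent. Literal comparison is simplified to plain string comparison.

```python
from typing import NamedTuple


class Term(NamedTuple):
    """A simplified RDF term (hypothetical model for this sketch)."""
    kind: str   # "bnode", "iri", or "literal"
    value: str  # blank node label, IRI string, or lexical form


# SPARQL ORDER BY ranks blank nodes before IRIs, and IRIs before literals.
KIND_RANK = {"bnode": 0, "iri": 1, "literal": 2}


def term_key(term):
    """Total-order sort key: equal terms always get equal keys.

    The order among distinct blank nodes is arbitrary (it follows the
    label), but it is consistent within one result set.
    """
    return (KIND_RANK[term.kind], term.value)


def order_rows(rows, variables):
    """Sort solution rows (dicts of variable name -> Term) by the given variables."""
    return sorted(rows, key=lambda row: tuple(term_key(row[v]) for v in variables))


def is_grouped(rows, variable):
    """Return True if equal values of `variable` occupy one contiguous run."""
    seen = set()
    previous = None
    for row in rows:
        current = row[variable]
        if current != previous:
            if current in seen:
                return False  # value reappeared after a different value
            seen.add(current)
            previous = current
    return True
```

Sorting the interleaved `_:b1` / `_:b2` rows from the message with `order_rows(rows, ["vcard", "adr", "pred"])` places all `_:b1` rows in one run and all `_:b2` rows in another. Which run comes first carries no meaning; only the grouping does.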
Received on Thursday, 27 October 2011 21:40:01 UTC