Re: Blank Node Ordering

From: James Leigh <james@3roundstones.com>
Date: Wed, 02 Nov 2011 08:53:15 -0400
Message-ID: <1320238395.1865.3.camel@james-laptop>
To: public-rdf-dawg-comments <public-rdf-dawg-comments@w3.org>

Thanks to everyone for taking the time to discuss this.

Continuing to make SPARQL 1.0 results valid in SPARQL 1.1 makes sense.

After speaking with the Sesame community, I believe new releases of
Sesame will include a stable ordering of blank nodes -- as per Andy's
suggestion.
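
To illustrate what a stable ordering could look like, here is a minimal
Python sketch. It is not Sesame's implementation. It assumes each blank
node carries a store-internal label that stays the same for the duration
of a query (for instance a skolem identifier), and it compares literals
by lexical form only, which is a simplification. The names Term,
stable_key, order_rows and row are hypothetical. The input is the
interleaved blank-node result from the message quoted below.

```python
from typing import NamedTuple

class Term(NamedTuple):
    kind: str   # "bnode", "iri" or "literal"
    value: str  # store-internal blank node label, IRI, or lexical form

# SPARQL's ORDER BY sorts blank nodes before IRIs and IRIs before literals.
KIND_RANK = {"bnode": 0, "iri": 1, "literal": 2}

def stable_key(term):
    # Assumption: the internal label is a stable identifier, so comparing
    # labels gives blank nodes a total (if arbitrary) order.
    return (KIND_RANK[term.kind], term.value)

def order_rows(rows, variables):
    """Sort result rows (dicts of variable name -> Term) by the given variables."""
    return sorted(rows, key=lambda r: tuple(stable_key(r[v]) for v in variables))

def row(adr, pred):
    # Helper that builds one result row for the vcard <me>.
    return {
        "vcard": Term("iri", "me"),
        "adr": Term("bnode", adr),
        "pred": Term("iri", pred),
    }

# The interleaved rows from the blank-node example.
rows = [
    row("b1", "vcard:locality"),
    row("b1", "vcard:street-address"),
    row("b2", "vcard:street-address"),
    row("b2", "vcard:country-name"),
    row("b1", "vcard:country-name"),
    row("b2", "vcard:postal-code"),
    row("b1", "vcard:postal-code"),
    row("b2", "vcard:locality"),
]

ordered = order_rows(rows, ["vcard", "adr", "pred"])
for r in ordered:
    print(r["adr"].value, r["pred"].value)
```

Which blank node comes first is arbitrary here; the point is only that
rows sharing a blank node end up next to each other.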

James

On Wed, 2011-11-02 at 10:55 +0000, Andy Seaborne wrote:
> James,
> 
> The RDF Working Group is defining how blank nodes can be skolemized by a 
> store. This can be used (internally) as a way to order blank nodes.
> 
> In addition, SPARQL 1.1 does provide a way to group related items, blank 
> nodes included, together by using GROUP BY.
> 
> 
>    SELECT ?vcard ?adr ?pred ?obj {
>      ?vcard a vcard:VCard; vcard:adr ?adr .
>      ?adr ?pred ?obj .
>    } GROUP BY ?vcard ?adr ?pred ?obj
> 
> While this clusters RDF terms that are the same, it does not sort them.
> 
> Implementations are also free to provide extensions to "<" or ORDER BY 
> that place rows with the same blank node together in the sort order.
> 
> The working group did not choose to make changes in this area of the 
> specification during the "Features and Requirements" phase and the 
> working group charter discourages any change that would alter the 
> results of a query that was valid according to SPARQL 1.0.
> 
> Therefore the working group has decided not to make a change in this area.
> 
> We would be grateful if you would acknowledge that your comment has been 
> answered by sending a reply to this mailing list.
> 
> Andy, On behalf of the SPARQL WG
> 
> 
> On 27/10/11 15:05, James Leigh wrote:
> > Hello,
> >
> > We recently ran into some unexpected behaviour that we want to bring to
> > this group's attention regarding the ORDER BY clause.
> >
> > When ordering RDF literals and URIs, the same literal or the same URI
> > will always be arranged together. However, there is no guarantee with
> > blank nodes that the same blank nodes will be arranged together.
> >
> > The following SPARQL query lists all the vCard addresses in the default
> > graph along with their properties. A single address is represented in
> > multiple result bindings, one for each property in the data store.
> >
> > SELECT ?vcard ?adr ?pred ?obj {
> >    ?vcard a vcard:VCard; vcard:adr ?adr .
> >    ?adr ?pred ?obj .
> > } ORDER BY ?vcard ?adr ?pred
> >
> > The (author's) expected result is to have all result bindings ordered
> > first by the vcard they belong to and, if there are multiple addresses on
> > the vcard, the properties of each address kept together.
> >
> > For example, the following binding sets are a valid result set. Notice that
> > the entire home address comes before any of the work address properties.
> > This order is predictable because of the ORDER BY clause in the query
> > above.
> >
> > vcard=<me>, adr=<me#home>, pred=vcard:country-name, obj="Australia"
> > vcard=<me>, adr=<me#home>, pred=vcard:locality, obj="WonderCity"
> > vcard=<me>, adr=<me#home>, pred=vcard:postal-code, obj="5555"
> > vcard=<me>, adr=<me#home>, pred=vcard:street-address, obj="111 Lake Drive"
> > vcard=<me>, adr=<me#work>, pred=vcard:country-name, obj="Australia"
> > vcard=<me>, adr=<me#work>, pred=vcard:locality, obj="WonderCity"
> > vcard=<me>, adr=<me#work>, pred=vcard:postal-code, obj="5555"
> > vcard=<me>, adr=<me#work>, pred=vcard:street-address, obj="33 Enterprise Drive"
> >
> > However, it would be incorrect (in SPARQL 1.0 and SPARQL 1.1 draft) for
> > the author to assume the addresses will always be ordered together like
> > this.
> >
> > Consider the result set if blank nodes were used for the address node.
> > The result might look like the one below.
> >
> > vcard=<me>, adr=_:b1, pred=vcard:locality, obj="WonderCity"
> > vcard=<me>, adr=_:b1, pred=vcard:street-address, obj="111 Lake Drive"
> > vcard=<me>, adr=_:b2, pred=vcard:street-address, obj="33 Enterprise Drive"
> > vcard=<me>, adr=_:b2, pred=vcard:country-name, obj="Australia"
> > vcard=<me>, adr=_:b1, pred=vcard:country-name, obj="Australia"
> > vcard=<me>, adr=_:b2, pred=vcard:postal-code, obj="5555"
> > vcard=<me>, adr=_:b1, pred=vcard:postal-code, obj="5555"
> > vcard=<me>, adr=_:b2, pred=vcard:locality, obj="WonderCity"
> >
> > Although the results for each vcard are ordered together, because it is a
> > URI, the ordering of the adr blank nodes looks random and is
> > unpredictable. Sesame 2.x appears to arrange blank node results randomly
> > when ordering by blank nodes, as shown above. When the data contains
> > blank nodes, there is no way to control the ordering.
> >
> > The author would expect that _:b1 is ordered before or after _:b2, but
> > the author would not expect that _:b1 is mixed among _:b2. Although
> > there is no order between _:b1 and _:b2, SPARQL should provide guidance
> > on how to arrange blank nodes.
> >
> > Many people still use blank nodes and this issue causes unexpected
> > results for SPARQL users.
> >
> > My colleagues and I propose that the group seriously consider adding a
> > restriction to ORDER BY in SPARQL 1.1 that guarantees that, when ordering
> > any RDF terms, identical terms are arranged together.
> >
> > Although an order among different blank nodes cannot be fixed, SPARQL
> > should require that identical RDF terms are ordered together.
> >
> > Thanks,
> > James
> >
> >
> >
> 
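
The property requested in the original comment (identical terms arranged
together, with no fixed order among different blank nodes) can be stated
as a check on one column of a result set. The following is a minimal
Python sketch of such a check, not part of any SPARQL implementation;
is_clustered is a hypothetical helper name, and the two lists are the
adr columns of a grouped result and of the interleaved result above.

```python
def is_clustered(values):
    """Return True if every distinct value occupies one contiguous run."""
    seen = set()
    previous = object()  # sentinel that equals no real value
    for value in values:
        if value != previous:
            if value in seen:
                # The value appeared earlier, then something else, then again.
                return False
            seen.add(value)
            previous = value
    return True

# adr column with each blank node kept together.
grouped = ["_:b1"] * 4 + ["_:b2"] * 4

# adr column of the interleaved result set from the original comment.
interleaved = ["_:b1", "_:b1", "_:b2", "_:b2", "_:b1", "_:b2", "_:b1", "_:b2"]

print(is_clustered(grouped))
print(is_clustered(interleaved))
```

The check does not care which blank node comes first, so it accepts
either order of _:b1 and _:b2 as long as their rows are not mixed.
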
Received on Wednesday, 2 November 2011 12:56:17 GMT