Re: Blank Nodes and SPARQL from Ron Alford on 2005-06-28 (public-rdf-dawg-comments@w3.org from June 2005)

From: Ron Alford <ronwalf@umd.edu>
Date: Tue, 28 Jun 2005 19:00:53 -0400
To: andy.seaborne@hp.com
CC: public-rdf-dawg-comments@w3.org, Amy Alford <aloomis@glue.umd.edu>
Message-ID: <42C1D6A5.4000203@umd.edu>

Seaborne, Andy wrote:

> I understand the desire to be able to directly identify blank nodes so
> that exactly the right graph node can be found again by a subsequent
> query.  It would be very helpful in scaling RDF graphs to span machines
> but still using the standard serialization forms to exchange parts of
> the graph.  Exposing blank node labels helps but it is not a general
> solution we can apply to all systems.   For example, if a server
> restarts, rereading a file, are labels maintained?

This is exactly why I mentioned the extension would be session based.
Any of those conditions would either break the session or be handled by
the back end.


> As such for your extension to SPARQL I suggest that such an extension
> includes the handling of blank nodes by whatever your system uses for
> identifiers.  As an extension, it is not SPARQL - but then you were
> extending it anyway.

There is a difference between extending the protocol in a compatible
way, and extending it in a conflicting manner.

>
> Exporting the label and giving this label the characteristics for
> session based browsing or editing is very similar to assigning an
> identifying property so maybe assigning such a label is a better way to
> handle it.

If I'm aggregating data, I have no reliable way of knowing which
properties will be uniquely identifying, or if they even exist. Assigning
identifying properties to each bnode seems heinous, especially since
bnodes are often used for syntax.


> The WG postponed this - one of the reasons was because it is not clear
> that query language support is the best or only approach.  Similar to
> rdfs:member, there could be an inferred property :listMember that
> related a resource which was also a list to the list members.

This would make at least limited inferencing a required component of any
SPARQL implementation.  It also does not preserve the order or structure
of lists.

For example, the menus of http://www.mindswap.org/ are stored as RDF
collections.
(see http://owl.mindswap.org/2003/submit-rdf/mindswap-menus.rdf)
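
To make the point about order concrete, here is a minimal Python sketch.
The triples, the ex: names, and the helper functions are all made up for
illustration (they are not taken from the mindswap file), and the flat
:listMember view is only my reading of the proposal quoted above.

```python
# Hypothetical, hand-written triples for a three-item RDF collection.
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

triples = {
    ("ex:menu", "ex:items", "_:l1"),
    ("_:l1", RDF_FIRST, "ex:Home"),
    ("_:l1", RDF_REST, "_:l2"),
    ("_:l2", RDF_FIRST, "ex:Projects"),
    ("_:l2", RDF_REST, "_:l3"),
    ("_:l3", RDF_FIRST, "ex:People"),
    ("_:l3", RDF_REST, RDF_NIL),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def list_items(head):
    """Walk the rdf:first / rdf:rest chain, preserving list order."""
    items = []
    node = head
    while node != RDF_NIL:
        items.append(objects(node, RDF_FIRST)[0])
        node = objects(node, RDF_REST)[0]
    return items


def inferred_members(head):
    """Assumed shape of a flat, inferred :listMember view: an unordered set."""
    return {(head, ":listMember", item) for item in list_items(head)}


ordered = list_items("_:l1")       # order recovered from the list structure
flat = inferred_members("_:l1")    # same members, but position is gone
```

Walking the chain recovers the menu order, while the flat set only says
which items belong to the list.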


At any rate, this is only a symptom of a larger problem - querying
complex structures without being able to directly reference an entire
class of identifiers.

Take for instance
http://protege.stanford.edu/plugins/owl/owl-library/koala.owl#MaleStudentWith3Daughters
The class is equivalent to four classes, three of which are anonymous
restrictions.

For another example of an RDF application that makes heavy use of bnodes,
see Ian's proposal for OWL Rules.  Specifically, his examples section:
http://www.cs.man.ac.uk/~horrocks/DAML/Rules/#5.1


>
>>
>> == Use Cases Supported by BNode Stability ==
>>

I'm dropping the discussion of these since they clearly do not require
a conflicting extension of SPARQL to implement.  They were there to
motivate the extension of SPARQL with session support.  Thanks for that
illustration of UNION, though.


My argument boils down to a relatively simple point.  As it stands, we
have no way other than context to work with bnodes.  This leads to
fragile and complex queries.

The solution seems simple - drop using bnodes as syntactic sugar for a
limited form of variables, and let implementers research the best way to
deal with these problems.
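
As a rough illustration of what "bnodes as variables" means in practice,
here is a small Python sketch of single-pattern matching.  The store
contents, the ex: property names, and the match() helper are hypothetical,
and the matcher is deliberately simplified (one triple pattern, no joins).

```python
# Hypothetical data: two anonymous restrictions, labelled by the store.
store = [
    ("_:b1", "owl:onProperty", "ex:hasChildren"),
    ("_:b2", "owl:onProperty", "ex:hasGender"),
]


def is_variable(term):
    # In a query pattern, a blank node label behaves like a variable:
    # it matches any term rather than naming a particular node.
    return term.startswith("?") or term.startswith("_:")


def match(pattern, data):
    """Return the triples in data matched by a single triple pattern."""
    return [
        triple
        for triple in data
        if all(is_variable(p) or p == t for p, t in zip(pattern, triple))
    ]


# Writing the store's label in the query does not select that node.
by_label = match(("_:b1", "owl:onProperty", "?p"), store)

# The only way to pick out one restriction is by its surrounding context.
by_context = match(("?r", "owl:onProperty", "ex:hasChildren"), store)
```

Here by_label contains both restrictions, while by_context contains only
the one on ex:hasChildren, which is the context-only access I am
complaining about.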

I believe this issue may become even more important in future versions of
the spec when update issues are considered.


- Ron Alford
Received on Tuesday, 28 June 2005 23:01:07 UTC