Re: [BlankNodeRefs] Questions on blank node refs

Chimezie Ogbuji wrote:
> On 3/19/09 6:05 AM, "Seaborne, Andy" <andy.seaborne@hp.com> wrote:
>>> Here are my questions & concerns:
>>>
>>> 1/ implementation burden on systems that treat blank nodes as
>>> existentials -- in particular here I'm thinking of any SPARQL engine
>>> that has extended SPARQL BGP matching as per 12.6 in the spec [1] with
>>> some level of entailment that looks at blank nodes as existentials.
> 
> Perhaps an example showing this burden would help?

What I had in mind (and again, this is more of a question, since I'm 
not sure) was whether a query evaluated under OWL-DL semantics ever 
returns a binding of a projected variable to a blank node (_:a) to 
express the existence of an individual without naming that individual. 
(I had the "little house" example in mind when I asked, but let me 
repeat: I'm pretty dumb about OWL-DL so I may have this completely 
wrong, and I'd prefer to defer to others around here on this.)

If this did happen, and the query writer came back with <_:a> (a blank 
node ref) the next time around, I was wondering whether maintaining the 
mapping from that label to the unnamed individual would be a hardship 
on the implementation, as in my point #3.
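
To make the cost I'm asking about a bit more concrete, here is a rough 
Python sketch of the bookkeeping I imagine an engine would need. 
Everything in it is hypothetical: the BlankNodeSession class, the 
"b0"-style labels, and the idea that the engine has some hashable 
internal handle for each node are my assumptions, not anything from the 
spec or from an existing implementation.

```python
import itertools


class BlankNodeSession:
    """Hypothetical session-scoped table tying issued labels to internal nodes."""

    def __init__(self):
        self._counter = itertools.count()
        self._label_by_node = {}   # internal handle -> issued label
        self._node_by_label = {}   # issued label -> internal handle

    def label_for(self, internal_node):
        """Return a stable label for an internal node, issuing one if needed."""
        if internal_node not in self._label_by_node:
            label = f"b{next(self._counter)}"
            self._label_by_node[internal_node] = label
            self._node_by_label[label] = internal_node
        return self._label_by_node[internal_node]

    def resolve(self, ref):
        """Map a blank node ref such as '<_:b0>' back to the internal node."""
        if not (ref.startswith("<_:") and ref.endswith(">")):
            raise ValueError(f"not a blank node ref: {ref}")
        label = ref[3:-1]
        if label not in self._node_by_label:
            # The label was never issued in this session, so it cannot be honored.
            raise LookupError(f"unknown blank node label: {label}")
        return self._node_by_label[label]


session = BlankNodeSession()
label = session.label_for("internal-node-17")   # issued in a first result set
node = session.resolve(f"<_:{label}>")          # referenced in a follow-up query
```

Note that the sketch only works when there is an internal handle to key 
on. For an individual whose existence is merely entailed, I'm not sure 
what that handle would be, which is really the heart of my question.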


>>> 2/ interoperability - I question whether this is a key feature to
>>> standardize to promote interoperability. Blank node labels used via this
>>> feature will not, of course, be re-usable across implementations. I
>>> realize that an argument can be made that this promotes interoperability
>>> of applications that rely on re-using blank node refs, but I'm not
>>> thoroughly convinced that that meets my (personal) bar for the need to
>>> promote interoperability.
>> Agreed - they only make sense referring back to a data source that issued the
>> blank node in the first place.
> 
> Yes, plus the motivating use case for this feature is where subsequent
> queries are dispatched to the *same* data source.  So, I'm not sure if the
> argument for interoperability applies here (as stated).
>  
>>> 3/ implementation cost for systems that do not maintain persistent
>>> labels for blank nodes - I have in mind here things like systems that
>>> download static RDF files on the fly to query against, or federated
>>> query approaches that need to re-serialize blank node ids retrieved
>>> from other SPARQL endpoints to avoid label clashes. It seems like a
>>> potentially unnecessary burden to require these sorts of systems to
>>> maintain new persistent state to handle blank node refs.
>> Agreed.  We can't require reading a file twice to preserve bnode labels
>> because that's wrong.  It only applies to persistent data and even then some
>> systems only guarantee the label for the duration of a session or some other
>> system concept.
> 
> There should be some notion of 'compliance levels' (for lack of a better
> phrase) such that systems that do not have an identification mechanism for
> bnodes (or even a system for persisting them) would not be able to provide
> such a feature and the agent dispatching the query would be informed in some
> way.

This concerns me a bit since it's a pretty significant change to how 
SPARQL has been structured so far. At the least, I'd imagine that we'd 
need to define service descriptions for the various compliance levels 
(yes, we may do service descriptions anyway).

>> One possibility here is to write a working group note that documents the usage
>> but does not make it a fully-fledged feature of SPARQL (i.e. in a REC)
> 
> The number of implementations that provide non-standard ways to address
> Bnodes is (for me) an indication of a legitimate need in the community.  It
> would be nice to standardize this capability (as a REC track feature) rather
> than to continue to have implementations do their own thing.

It's been my somewhat limited experience that WG notes in W3C space are 
somewhat effective at drawing implementations of an "optional" 
capability together. This is mainly based on the JSON result format 
that the DAWG published as a note. I'd be interested to hear about 
counter-examples.

The benefits that I see of a REC-track feature are:

1/ (more) guaranteed existence of a capability for users of the 
technology in any particular implementation (though this goes by the 
wayside if we introduce compliance levels, I suppose)

2/ wider community review

Lee

> -- Chimezie

Received on Friday, 20 March 2009 13:58:46 UTC