Re: str() function should also accept blank node argument from David Booth on 2011-10-18 (public-rdf-dawg-comments@w3.org from October 2011)

From: David Booth <david@dbooth.org>
Date: Tue, 18 Oct 2011 17:08:27 -0400
To: public-rdf-dawg-comments@w3.org
Message-ID: <1318972107.2178.51729.camel@dbooth-laptop>

Andy,

That solution works only for fixed values of :foo and :bar.  And moving
the INSERT into the same statement as the DELETE won't work either,
because the INSERT template is instantiated once per solution, so a
fresh bnode would be reinserted for every duplicate that was matched:

> PREFIX : <http://example/>
> 
> DELETE { :x :p ?x .
>           ?x :foo ?a ; :bar ?b . }
> INSERT { :x :p [ :foo ?a ; :bar  ?b ] }
> WHERE { :x :p ?x .
>          ?x :foo ?a ;
>             :bar ?b .
>          FILTER( isBlank(?x) )
>        }
> 

But this is only one use case that I happened to run into.  I'm sure
there are others.  IMO, as a general principle of language design,
*every* value that can be generated should be able to be turned into a
string and compared.  

I agree that it would be ideal if the str() function would return a
string that is unique to the store rather than merely being unique to
the solution set, so please amend my request in this way.  :)  I had
suggested making it unique to the solution set only because I was
concerned that there might be resistance to store-wide values for
implementations that already return bnode serializations that are only
unique within a solution set.
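
To make the request concrete, here is a rough sketch of the kind of
general cleanup query it would allow.  It is hypothetical: it assumes
str() accepts a bnode and returns a string that is stable for the whole
store, which SPARQL 1.1 does not define, so on a conforming engine the
str() filter raises an error and nothing is deleted.  The variable names
(?first, ?q, ?v, ?q2, ?v2) are chosen for illustration only, and the
nested FILTER NOT EXISTS assumes the engine substitutes the outer
bindings of ?x and ?first into the inner patterns.

```sparql
PREFIX : <http://example/>

# HYPOTHETICAL: relies on str() accepting a blank node and returning a
# store-wide stable string.  Standard SPARQL 1.1 defines str() only for
# literals and IRIs, so there the str() FILTER errors and nothing matches.
DELETE { :x :p ?x .
         ?x ?p ?o . }
WHERE  { :x :p ?x , ?first .
         ?x ?p ?o .
         FILTER( isBlank(?x) && isBlank(?first) )
         # ?first sorts before ?x, so ?x is a "non-first" candidate.
         FILTER( str(?first) < str(?x) )
         # Every predicate/object pair of ?x must also be on ?first ...
         FILTER NOT EXISTS { ?x ?q ?v .
                             FILTER NOT EXISTS { ?first ?q ?v } }
         # ... and every pair of ?first must also be on ?x.
         FILTER NOT EXISTS { ?first ?q2 ?v2 .
                             FILTER NOT EXISTS { ?x ?q2 ?v2 } }
       }
```

Within each group of bnodes that carry identical predicate/object pairs,
only the one with the smallest string form would survive, and the set of
predicates would not have to be known in advance.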

The RDF-WG's progress on skolemization is welcome complementary
progress, but this inability to apply the str() function to a bnode is
still a gap in SPARQL that should be addressed.  If it cannot be
addressed now (due to time constraints) I think it should go on an
issues list to be addressed in the next version.

Thanks,
David


> From: Andy Seaborne <andy.seaborne@epimorphics.com>
> Date: Fri, 07 Oct 2011 09:49:08 +0100
> Message-ID: <4E8EBD04.1040603@epimorphics.com>
> To: public-rdf-dawg-comments@w3.org
> 
> David,
> 
> The following should carry out that task - delete everything and put one 
> instance back.  Precede it with a query to check if it's needed.
> 
> ----------------------
> PREFIX : <http://example/>
> 
> DELETE { :x :p ?x .
>           ?x :foo "a" ; :bar "b" . }
> WHERE { :x :p ?x .
>          ?x :foo "a" ;
>             :bar "b" .
>          FILTER( isBlank(?x) )
>        }
> 
> INSERT DATA { :x :p [ :foo "a" ; :bar  "b" ] }
> ----------------------
> 
> Most of the results formats (XML, JSON, TSV) already have a bNode 
> identifier, but it is scoped to the result set and is not the identifier 
> used by the store to identify the bNode internally (if there is a stable 
> one at all).
> 
> Changing str() would only work if it were the store-wide identifier AND 
> that identifier could be used in a later operation.  Some systems do 
> provide this as a non-standard extension and have done since SPARQL 1.0.
> 
> The RDF-WG has published a revised RDF Abstract Syntax document which is 
> relevant here as it describes bNode skolemization:
> 
> http://www.w3.org/TR/rdf11-concepts/#section-blank-nodes
> 
>  Andy
> 
> On 05/10/11 05:26, David Booth wrote:
> > http://www.w3.org/TR/sparql11-query/#func-str
> > The str() function is currently defined to accept only a literal or an
> > IRI  -- not a bnode.  Is there a compelling reason why not?  I think
> > this is a problem because AFAICT it is currently impossible to impose an
> > ordering on bnodes, since the "<" operator also does not compare IRIs or
> > bnodes.  I think the str() function should also accept a bnode argument
> > and return a lexical form of that bnode that is unique to the result
> > set.  This would allow one to write a query that distinguishes the
> > "first" bnode from others (where the choice of first is arbitrary, but
> > holds within the query), thus allowing that query to (for example)
> > delete all "non-first" bnode objects.
> >
> > For example, consider the following RDF:
> >
> >    :x :p [ :foo "a" ; :bar "b" ] .
> >
> > When it is read in repeatedly to a graph, each time it is read in an
> > additional bnode will be created, thus leading to multiple (redundant)
> > statements in the graph, each having a *different* bnode:
> >
> >    :x :p _:b1 .      _:b1 :foo "a" ; :bar "b" .
> >    :x :p _:b2 .      _:b2 :foo "a" ; :bar "b" .
> >    :x :p _:b3 .      _:b3 :foo "a" ; :bar "b" .
> >      . . .
> >
> > I.e., the graph becomes increasingly non-lean.  It would be nice to be
> > able to write a simple DELETE query to get rid of these extra bnode
> > objects, but without the ability to impose an arbitrary ordering on them
> > in the query, I do not see an easy way to distinguish one bnode object
> > from all the others and delete only "all the others".  In the case where
> > there is a known, fixed number of predicates in the bnode object then I
> > could use a nested SELECT DISTINCT, but I want to be able to write the
> > query more generally, where the set of predicates is not fixed (though
> > the depth of the bnode object would be fixed, to avoid getting into the
> > general problem of making graphs lean).
> >
> > There are probably other use cases that would also benefit from the
> > ability to impose an ordering on bnodes -- just as it is useful to be
> > able to impose an ordering on other entities -- but this is the case
> > that happened to motivate me to send in this suggestion.
> >
> > I assume that this suggestion is too late for consideration in SPARQL
> > 1.1, but if it could be added to a wish list for future consideration I
> > would appreciate it.
> >
> > Thanks!
> >
> 
-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Tuesday, 18 October 2011 21:08:57 UTC