Re: Can you query rdf:List easily? (WAS Re: update on vCard edits and The Compromise)

Benjamin Nowack wrote:
> On 27.07.2007 23:47:33, Garret Wilson wrote:
>> Harry Halpin wrote:
>>> Is it indeed difficult or impossible to write a SPARQL query over
> >>> rdf:List, particularly with the use of owl:sameAs as in:
> It's impossible to write a general SPARQL query to retrieve all
> entries of an unknown rdf:List. The use of owl:sameAs doesn't really
> have an additional effect on the complexity, though.
> 
>>> Can someone tell me precisely why and if so, does rdf:Seq help?
> Containers don't form a deep path which has to be traversed. They
> are just, well, containers. Members are at the same level/depth
> which allows retrieving them with a single query pattern. I'll
> re-paste my earlier query. It works for any number/length of 
> container-based "additional-names":
> 
> [[
>    SELECT ?pos ?name WHERE {
>        <#card4711> vcard:additional-names ?seq .
>        ?seq       ?pos ?name .
>    }
>    ORDER BY ?pos
> ]]
> 
> This is impossible for rdf:List. 

"impossible" is a dangerous thing to say :-)


ARQ supports a property list:member, which is like rdfs:member: it's the 
relationship between the list and its elements.

PREFIX list: <http://jena.hpl.hp.com/ARQ/list#>
SELECT ?member
{
   ?list list:member ?member
}

It so happens that the list elements come out in order.  Cycles and acyclic 
lists are detected and all elements returned.

If some level of rules is running underneath the SPARQL engine, there would be 
no need for making it part of the engine.


To get strict order, without relying on the order from the engine, the app 
needs to get the index and the element.

SELECT ?member
{
   ?list list:member (?index ?member)
}
ORDER BY ?index

The index will be somewhat arbitrary if the list is not a strict linear 
rdf:rest chain.
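
For anyone without such a property function, the same walk can be done in the 
application. This is only a rough sketch, not how ARQ implements list:member: 
it assumes the triples are already in memory as plain (subject, predicate, 
object) tuples, and the function name and the zero-based index are my own 
choices for illustration.

```python
# Hypothetical in-memory model: triples are (subject, predicate, object) tuples.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"

def list_members(triples, head):
    """Return [(index, member), ...] by following rdf:first/rdf:rest from head."""
    # If a node has several rdf:first or rdf:rest arcs (a malformed list),
    # the last one seen wins, so the index is then somewhat arbitrary.
    first = {s: o for s, p, o in triples if p == FIRST}
    rest = {s: o for s, p, o in triples if p == REST}
    out, seen, node = [], set(), head
    # Stop at rdf:nil, at a node with no rdf:first, or when a cycle is detected.
    while node != NIL and node in first and node not in seen:
        seen.add(node)
        out.append((len(out), first[node]))
        node = rest.get(node, NIL)
    return out
```

The "seen" set is what keeps a cyclic rdf:rest chain from looping forever; each 
cell is visited at most once.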


Both rdf:List and rdf:Seq do assume a certain degree of well-formedness in the 
datastructure.  For lists, no (a)cycles; for rdf:Seq, no duplicate properties 
like rdf:_2 and no additional properties on the Seq.  The latter is what 
breaks "?seq ?pos ?name": to be robust, the query needs to check that ?pos 
looks like "^rdf:_".
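
As an illustration of that check, here is a small sketch on the application 
side. It assumes the same plain-tuple triple model as any in-memory store 
would give you; the function name is made up. It keeps only properties of the 
form rdf:_N and sorts on the number N rather than on the property name, since 
comparing the names as strings would put rdf:_10 before rdf:_2.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Container membership properties are rdf:_1, rdf:_2, ... (no leading zero).
MEMBERSHIP = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)$")

def seq_members(triples, seq):
    """Return [(position, member), ...] for seq, ignoring any other properties."""
    found = []
    for s, p, o in triples:
        if s != seq:
            continue
        m = MEMBERSHIP.match(p)
        if m:  # skips rdf:type and anything else hung on the Seq node
            found.append((int(m.group(1)), o))
    # Sort numerically on the position only.
    return sorted(found, key=lambda pair: pair[0])
```

Duplicate positions (two rdf:_2 arcs, say) are not rejected here; both members 
are simply returned with the same position.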

ARQ also supports rdfs:member directly, and that does these tests.  It would 
be better done in the rules underneath, but that is too expensive just for 
this one feature.


Overall:

This is the fundamental problem of encoding a datastructure in RDF - there is 
some higher level convention being applied on top of the core RDF.  If the 
convention is broken, there can be chaos.

The popularity of RDF lists presumably comes from the syntax support in 
N3/Turtle.

	Andy

> You need a different query for each
> number of names, e.g. (assuming lists built w/o "parseType=Collection",
> but with an explicit first+literal):
> 
> 2 names:
> [[
>    SELECT ?name1 ?name2 WHERE {
>        <#card4711> vcard:additional-names ?list1 .
>        ?list1      rdf:first ?name1 .
>        ?list1      rdf:rest ?list2 .
>        ?list2      rdf:first ?name2 .
>    }
> ]]
> 
> 3 names:
> [[
>    SELECT ?name1 ?name2 ?name3 WHERE {
>        <#card4711> vcard:additional-names ?list1 .
>        ?list1      rdf:first ?name1 .
>        ?list1      rdf:rest ?list2 .
>        ?list2      rdf:first ?name2 .
>        ?list2      rdf:rest ?list3 .
>        ?list3      rdf:first ?name3 .
>    }
> ]]
> etc.
> 
> It *is* possible to construct a query that would work for,
> say 1-5 additional names, but that would be a really ugly 
> beast with nested OPTIONALs (which aren't supported by all
> RDF stores and are slower than querying a simple container
> that uses a single blank node instead of one for each branch).
> The more practical approach is probably to iteratively
> extend the query until you hit an rdf:nil or no additional
> rdf:rest.
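
The "iteratively extend the query" approach could be driven by something like 
the sketch below, which just generates the query text for a given number of 
names. The function name is hypothetical, prefix declarations are left out as 
in the queries quoted above, and, like those queries, it does not test for 
rdf:nil, so it also matches longer lists.

```python
def fixed_length_query(card, n):
    """Build the SELECT query text for the first n names of the list (n >= 1)."""
    names = " ".join(f"?name{i}" for i in range(1, n + 1))
    lines = [
        f"SELECT {names} WHERE {{",
        f"    <{card}> vcard:additional-names ?list1 .",
    ]
    for i in range(1, n + 1):
        lines.append(f"    ?list{i} rdf:first ?name{i} .")
        if i < n:
            # Step to the next list cell only if another name is wanted.
            lines.append(f"    ?list{i} rdf:rest ?list{i + 1} .")
    lines.append("}")
    return "\n".join(lines)
```

The application would call this with n = 1, 2, 3, ... and stop as soon as a 
query returns no rows.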
> 
> 
> Cheers,
> Benjamin
> 
> --
> Benjamin Nowack
> http://bnode.org/
> 
> 

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Monday, 30 July 2007 08:51:25 UTC