Re: ISSUE-77: Should we mark rdf:Seq as archaic (cf ISSUE-24) from Andy Seaborne on 2011-10-15 (public-rdf-wg@w3.org from October 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sat, 15 Oct 2011 19:35:08 +0100
To: public-rdf-wg@w3.org
Message-ID: <4E99D25C.4030701@epimorphics.com>
On 15/10/11 19:09, Jeremy Carroll wrote:
>
> I think both the Seq and the List constructs present technical issues.
> Basically it is because both present the possibility of 'bad' data and
> no clarity about what one should do in the face of it.

+1

> We can easily form ill-formed lists with rdf:first or rdf:rest either
> missing or multiple.
> We can easily form ill-formed sequences with duplicate or missing rdf:_2

although Seq are very fragile and lists are merely fragile.  The 
duplicate rdf:_2 by merging is really nasty.

> The consumer of such ill-formed data is in a bind
> And what's worse is that formally the ill-formed data is not ill-formed,
> it is just triples.
>
> We could label both with a health warning ...

Sandro said that:
> I think Turtle makes RDF Collections seem quite
> nice, and hopefully that will quickly set the tone (perhaps with a
> little help from us) for APIs and SPARQL 1.2 (?) having nice list
> handling functions that are as efficient as native (non-destructive)
> list handling functions. (I hope some APIs do this already.)

and the point about Turtle syntax, and the convenience of writing, is 
important.

Jena has container and collection APIs to make working with containers, 
collections easier but the details leak out if you can take a triple view.

Ivan:
> But it is a bit of a problem that SPARQL 1.1 still does not cover list handling fully:-(

SPARQL 1.2 will not solve anything I'm afraid.  SPARQL 1.1 Query has 
gone as far as it can, except maybe a little extra syntactic sugar with

{ ?list rdf:rest*/rdf:first ?member }

It's much better than handling Seqs.

SPARQL Update can manuipuate lists but it's ugly:
http://lists.w3.org/Archives/Public/public-rdf-dawg/2011JanMar/0389.html


The fundamental problem in SPARQL is that any order is lost; so this 
list access works for some cases, where the order does not matter.

Even if a special order preserving construct were available, order is 
lost in the rest of the query.  An order-presering QL would not be 
SPARQL 1.2 - it would be have completely different basis, (e.g. no 
chance of implementing use hash joins), would be very hard to have 
parallel implementation (see "big data" graph languages), and still does 
not work when two ordered different subresults need combining.

Fundamentally, there are two problems:

1/ Encoding in triples
2/ Lists aren't the only datastructure.

Reification, containers and collections encode data structure in triples 
but if the app can see "triples" then this leaks through to the 
application.  It also means there can be the possibility of 'bad' data 
as Jeremy says.  Seeing the triples is confusing at best.

The structure we have may not say what you want:
   List(1 2 3) != List(3 2 1)
but if a list is being used to express an unordered collection, a higher 
level convention has to be communicated.

I think the only complete solution will involve putting structural 
literals into RDF itself, so they are not triple-encoded and can't be 
'bad'.  When treated as first-class literals with equality rules, 
accessors, and combining rules, then implementations can store them 
specially, provide good APIs, and application programmer won't have to 
learn about the encoding rules.

 Andy
Received on Saturday, 15 October 2011 18:35:39 UTC