Re: RDF* (an approach for reification) - implementation in rdf-ruby ? from Gregg Kellogg on 2014-06-24 (public-rdf-ruby@w3.org from June 2014)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Tue, 24 Jun 2014 16:22:56 -0700
To: peter@vandenabeele.com
Cc: "public-rdf-ruby@w3.org" <public-rdf-ruby@w3.org>
Message-Id: <19B874E9-3B46-4664-A663-0538D92283DD@greggkellogg.net>

On Jun 21, 2014, at 1:19 PM, Peter Vandenabeele <peter@vandenabeele.com> wrote:

> Hi,

Hi Peter,

> With great interest I was reading this description of RDF* [0], [1], [2]
> (Foundations of an Alternative Approach to Reification in RDF)
> 
> To me, this seems a potentially efficient solution to 3 features that
> I was missing in my current view on the RDF model and that I was
> describing earlier in my ideas about Dbd [3]:
> 
> * possibility for fine grained provenance
>     (on statement level, not graph level)
> 
> * possibility for strict ordering of facts/statements
>     (allowing windowed caching in a log structured system)
> 
> * possibility for statement identity
>     (allowing deprecation/soft delete/hard delete of older statements)

This is interesting stuff, but IMO, should be considered syntactic sugar for creating an rdf:Statement in either Turtle/TriG or SPARQL, rather than doing a full extension of RDF, which I don't see as necessary. In this case, I think that the extension could be more like the following:

[10]  subject  ::= iri | BlankNode | collection | tripleX
[10x] subjectX ::= iri | BlankNode | collection
[12]  object   ::= iri | BlankNode | collection | blankNodePropertyList | literal | tripleX
[12x] objectX  ::= iri | BlankNode | collection | blankNodePropertyList | literal
[30x] tripleX  ::= '<<' subjectX predicate objectX '>>'

Obviously pretty similar to what they suggested, but for "collection". The result of this production would be a BNode, or in SPARQL, an existential variable (which is how a BNode is interpreted). Otherwise, I don't think there are any real issues in actual processor- or query-execution.
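
To make the "syntactic sugar" reading concrete, here is a minimal sketch in plain Ruby (no gems) of how a tripleX might expand into ordinary rdf:Statement reification triples hanging off a fresh BNode. The names Triple, BNodeAllocator and reify, the "_:b0" label scheme, and the :alice/:knows/:bob terms are all made up for illustration; none of this is RDF.rb API, and a real implementation may well choose differently.

```ruby
# Hypothetical sketch: expand << s p o >> into rdf:Statement reification triples.
# Triple, BNodeAllocator and reify are illustrative names, not RDF.rb APIs.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

Triple = Struct.new(:subject, :predicate, :object)

class BNodeAllocator
  def initialize
    @count = 0
  end

  # Return a fresh blank node label such as "_:b0".
  def fresh
    label = "_:b#{@count}"
    @count += 1
    label
  end
end

# Returns [node, triples]: the BNode standing in for the embedded triple,
# plus the four reification triples that describe it.
def reify(inner, bnodes)
  node = bnodes.fresh
  triples = [
    Triple.new(node, "#{RDF_NS}type", "#{RDF_NS}Statement"),
    Triple.new(node, "#{RDF_NS}subject", inner.subject),
    Triple.new(node, "#{RDF_NS}predicate", inner.predicate),
    Triple.new(node, "#{RDF_NS}object", inner.object)
  ]
  [node, triples]
end

# Assumed input, in the proposed syntax:  << :alice :knows :bob >> :source :web .
bnodes = BNodeAllocator.new
inner = Triple.new(":alice", ":knows", ":bob")
node, triples = reify(inner, bnodes)

# The outer triple uses the BNode wherever the tripleX appeared.
triples << Triple.new(node, ":source", ":web")

triples.each { |t| puts "#{t.subject} #{t.predicate} #{t.object} ." }
```

The point of the sketch is that nothing downstream needs to know about tripleX: the store and query engine only ever see the BNode and plain triples.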

> I have 2 questions:
> 
> 1) do others see the RDF* proposal as a proper solution for these questions ?

It may involve repetition, if the statement would otherwise be part of a predicateObjectList, but it seems like a useful thing to try. After the work on Turtle/TriG was finalized, there was some thought about coming back to look at some follow-on to TriG which would normalize some usage within TriG making it a little closer to N3 [4].

> 2) if so, would it be relevant to work on a potential implementation in rdf-ruby ?

Doing this on a branch of rdf-turtle and sparql gems would be useful, as a working proof-of-concept is important in sanctioning standards work.

> If I understand correctly, a possible (naive) implementation could go along:
> 
> * upon storage, assign a unique internal/hidden id to each Statement
> (similar to object id (oid) in postgresql; this could also be the id or uuid of a blank Node [A] associated with each statement ?)
> 
> * also allow a reference to the oid of another statement as the value for
> subject and object (next to IRI's, blank Nodes, Literals), or even simply
> reusing the Blank Node option with the id or uuid of the blank Node created
> in [A] above ?

Looking at the Turtle parser, I think it would be treated similarly to blankNodePropertyList or collection. As you can see in [5], the reader enters the production by allocating a BNode (as :subject) and, after processing the blankNodePropertyList, assigns the result to either :subject or :resource, depending on where it is used within a triple. Doing the same for tripleX would be fairly straightforward (for me, anyway). The work in the sparql gem would be pretty similar.
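
As a rough illustration of that pattern, the following is a toy recursive-descent reader in plain Ruby. It is not the rdf-turtle reader and shares no code with it: MiniReader, its method names, the "_:t0" labels and the whitespace-only tokenizer are assumptions made purely to show the shape of "allocate a BNode on entering the production, emit the inner statements, hand the BNode back to the caller".

```ruby
# Hypothetical sketch of a reader production for tripleX. MiniReader and its
# methods are illustrative only; this is not how rdf-turtle is implemented.
class MiniReader
  RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

  attr_reader :statements

  def initialize(input)
    @tokens = input.split   # assumption: tokens are separated by whitespace
    @statements = []
    @bnode_count = 0
  end

  # triples ::= subject predicate object '.'
  def parse
    until @tokens.empty?
      subject = term
      predicate = @tokens.shift
      object = term
      expect(".")
      @statements << [subject, predicate, object]
    end
    @statements
  end

  private

  # A term is either a tripleX or a single plain token.
  def term
    @tokens.first == "<<" ? triple_x : @tokens.shift
  end

  # tripleX ::= '<<' subjectX predicate objectX '>>'
  # Allocate a BNode on entry, emit the reification triples, and return the
  # BNode so the caller can use it in subject or object position.
  def triple_x
    expect("<<")
    node = "_:t#{@bnode_count}"
    @bnode_count += 1
    subj = @tokens.shift
    pred = @tokens.shift
    obj = @tokens.shift
    expect(">>")
    @statements << [node, "#{RDF_NS}type", "#{RDF_NS}Statement"]
    @statements << [node, "#{RDF_NS}subject", subj]
    @statements << [node, "#{RDF_NS}predicate", pred]
    @statements << [node, "#{RDF_NS}object", obj]
    node
  end

  def expect(token)
    actual = @tokens.shift
    return if actual == token
    raise ArgumentError, "expected #{token.inspect}, got #{actual.inspect}"
  end
end

input = "<< :alice :knows :bob >> :source :web . " \
        ":web :cites << :bob :knows :carol >> ."
reader = MiniReader.new(input)
reader.parse.each { |s| puts s.join(" ") + " ." }
```

The same triple_x method serves both positions because it only returns a node; whether that node ends up as subject or object is decided by the caller, which is the behaviour described above for blankNodePropertyList.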

> If this is relevant, I would be interested to work on a branch with such RDF* extension.

I think that would be pretty cool; I'd be happy to support you in doing this, but don't have any time to commit to it myself. We could potentially include this in a release, enabled via a runtime option.

Gregg

> Peter
> 
> 
> [0] http://arxiv.org/pdf/1406.3399v1.pdf
> [1] http://arxiv.org/abs/1406.3399 (Abstract for [0])
> [2] http://blog.bigdata.com/?p=716  (blog article by Bryan Thompson)
> [3] https://github.com/petervandenabeele/dbd/ (experimental log structured db)
[4] http://lists.w3.org/Archives/Public/public-rdf-wg/2013Nov/0109.html
[5] https://github.com/ruby-rdf/rdf-turtle/blob/develop/lib/rdf/turtle/reader.rb#L162
> 
> -- 
> Peter Vandenabeele
> http://www.linkedin.com/in/petervandenabeele
> https://github.com/petervandenabeele
> https://twitter.com/peter_v
> gsm: +32-478-27.40.69
> e-mail: peter@vandenabeele.com
> skype: peter_v_be
Received on Tuesday, 24 June 2014 23:23:28 UTC