Re: Standards for storing RDF/OWL in a property graph? from Olaf Hartig on 2018-04-06 (semantic-web@w3.org from April 2018)

From: Olaf Hartig <olaf.hartig@liu.se>
Date: Fri, 6 Apr 2018 22:15:18 +0200
To: <semantic-web@w3.org>
CC: Dave Raggett <dsr@w3.org>, Chris Mungall <cjmungall@lbl.gov>
Message-ID: <2182709.pSa1KPizYh@porty3>

Hi Chris, Dave,

I have been working on these topics at a conceptual level for some time now (I
mean topics such as statement-level metadata in RDF and mappings between RDF
graphs and PGs). Most of these efforts are centered around a proposal that I call
RDF* and SPARQL*. The idea of RDF* is to extend RDF with the possibility to 
have triples as the subject or the object of other triples (i.e., nested 
triples), and SPARQL* is a corresponding extension of SPARQL in which triple 
patterns can be nested. You may think of these extensions purely as syntactic 
sugar or, alternatively, as an actual logical extension of the RDF data model. 
In my work I have provided the foundation of both of these perspectives 
[1,2,3,4].
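
To make the idea of nested triples and nested triple patterns concrete, here
is a minimal Python sketch. It is an illustration only and not code from
[1,2,3,4]: the Triple type, the helper names, the "?"-prefixed variables, and
the :bob / :age / :certainty terms are all assumptions made up for this
example. It renders a nested triple in the << >> notation used for RDF* and
matches a nested pattern against it.

```python
from typing import Any, NamedTuple


class Triple(NamedTuple):
    """A triple whose subject or object may itself be a Triple (hypothetical type)."""
    subject: Any
    predicate: str
    object: Any


def term_str(term):
    """Render a term; a nested triple is wrapped in << >>."""
    if isinstance(term, Triple):
        return f"<<{term_str(term.subject)} {term.predicate} {term_str(term.object)}>>"
    return str(term)


def statement_str(triple):
    """Render a top-level triple as one statement."""
    return f"{term_str(triple.subject)} {triple.predicate} {term_str(triple.object)} ."


def match(pattern, term, bindings):
    """Return the extended variable bindings if pattern matches term, else None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        # A variable: bind it, or check that an earlier binding agrees.
        if pattern in bindings:
            return bindings if bindings[pattern] == term else None
        return {**bindings, pattern: term}
    if isinstance(pattern, Triple):
        # A nested pattern only matches a nested triple, position by position.
        if not isinstance(term, Triple):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            bindings = match(sub_pattern, sub_term, bindings)
            if bindings is None:
                return None
        return bindings
    # A constant: must be equal to the term.
    return bindings if pattern == term else None


# "Bob's age is 23", annotated with a certainty value (made-up data).
inner = Triple(":bob", ":age", 23)
outer = Triple(inner, ":certainty", 0.9)

# A nested pattern asking for the age and the certainty attached to it.
pattern = Triple(Triple(":bob", ":age", "?a"), ":certainty", "?c")

print(statement_str(outer))
print(match(pattern, outer, {}))
```

The same Triple value appears both as a statement in its own right (inner) and
as the subject of another statement (outer), which is all that "nesting" means
in this sketch.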

How is this related to statement-level metadata in RDF?
A nested triple can be interpreted to be a statement *about* the triple in its 
subject (or object) position, and SPARQL* is a language in which queries about 
RDF data with statement-level metadata can be expressed in a very concise way. 
In this context I should mention that RDF* data and SPARQL* queries can be 
mapped to standard RDF data and SPARQL queries that use the RDF reification 
vocabulary (or related approaches) [3], which makes it possible to support 
RDF* and SPARQL* via a small wrapper on top of any ordinary triple store.
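
As a rough sketch of what such a wrapper has to do on the data side, the
following Python function unfolds nested triples into plain triples that use
the RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate,
rdf:object). It is one possible unfolding written for illustration and is not
claimed to be the exact mapping defined in [3]. The Triple type, the unfold
function, the _:stmtN blank node labels, and the example terms are
assumptions. Whether the nested triple should also be asserted on its own is
a design choice that this sketch leaves out.

```python
from itertools import count
from typing import Any, NamedTuple


class Triple(NamedTuple):
    """A triple whose subject or object may itself be a Triple (hypothetical type)."""
    subject: Any
    predicate: str
    object: Any


def unfold(triples):
    """Flatten nested triples into plain tuples via standard reification."""
    out = []
    ids = count(1)
    nodes = {}  # nested triple -> blank node that stands for it

    def reify(term):
        # Plain terms are returned unchanged.
        if not isinstance(term, Triple):
            return term
        # Reuse the blank node if this nested triple was already reified.
        if term in nodes:
            return nodes[term]
        s, o = reify(term.subject), reify(term.object)
        node = f"_:stmt{next(ids)}"
        nodes[term] = node
        out.extend([
            (node, "rdf:type", "rdf:Statement"),
            (node, "rdf:subject", s),
            (node, "rdf:predicate", term.predicate),
            (node, "rdf:object", o),
        ])
        return node

    for t in triples:
        s, o = reify(t.subject), reify(t.object)
        out.append((s, t.predicate, o))
    return out


# Made-up example: a certainty value attached to "Bob's age is 23".
nested = Triple(Triple(":bob", ":age", 23), ":certainty", 0.9)
for plain in unfold([nested]):
    print(plain)
```

The query side would need an analogous rewriting of nested triple patterns
into four reification patterns each, which is omitted here.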

How is this related to mappings between RDF and the PG model?
First, notice that nested triples can be understood as a form of edge
properties, which is the main feature of PGs that is missing from RDF. Hence,
RDF* can serve nicely as an intermediate model for reconciling RDF and PGs. As 
a basis of such a reconciliation, I have provided generic mappings between 
RDF* and PGs [2].
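
The correspondence between nested triples and edge properties can be sketched
in a few lines of Python. This is a simplified illustration under stated
assumptions, not the mappings defined in [2]: string terms are treated as
IRIs that become node IDs, non-string values are treated as literals, nesting
is supported one level deep and only in the subject position, and a nested
triple is also treated as an edge of the graph. The Triple type, the function
names, and the example data are hypothetical.

```python
from typing import Any, NamedTuple


class Triple(NamedTuple):
    """A triple whose subject may itself be a Triple (hypothetical type)."""
    subject: Any
    predicate: str
    object: Any


def is_literal(term):
    # Assumption of this sketch: strings are IRIs, everything else is a literal.
    return not isinstance(term, (str, Triple))


def to_property_graph(triples):
    """Return (nodes, edges); both map an ID to a dict of properties."""
    nodes = {}  # node ID -> {property: value}
    edges = {}  # (source, label, target) -> {property: value}

    def add_plain(t):
        """Add a non-nested triple; return the edge's property dict, if any."""
        if isinstance(t.subject, Triple) or isinstance(t.object, Triple):
            raise ValueError("only one level of nesting is supported")
        nodes.setdefault(t.subject, {})
        if is_literal(t.object):
            # Literal object: becomes a node property, no edge is created.
            nodes[t.subject][t.predicate] = t.object
            return None
        nodes.setdefault(t.object, {})
        return edges.setdefault((t.subject, t.predicate, t.object), {})

    for t in triples:
        if isinstance(t.subject, Triple):
            # Metadata about a nested triple: becomes a property of that edge.
            props = add_plain(t.subject)
            if props is None or not is_literal(t.object):
                raise ValueError("unsupported shape for this sketch")
            props[t.predicate] = t.object
        else:
            add_plain(t)
    return nodes, edges


# Made-up example data.
data = [
    Triple(Triple(":alice", ":knows", ":bob"), ":since", 2010),
    Triple(":alice", ":age", 30),
]
nodes, edges = to_property_graph(data)
print(nodes)
print(edges)
```

Shapes that have no direct PG counterpart (metadata on a triple with a
literal object, or deeper nesting) are rejected here rather than mapped, which
is exactly where a generic mapping has to make further design decisions.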

I should also mention that the folks at Blazegraph have played an essential 
part in shaping the RDF* & SPARQL* proposal, and the Blazegraph triple store 
supports it [5] (they have been calling this feature "Reification Done Right",
or RDR).
Other triple store vendors have also expressed their interest in this 
proposal. Additionally, a poster that I presented about it at last year's ISWC
was voted to receive the "people's choice best poster award" at that
conference, which seems to indicate that there is also some community interest
in this proposal.

If you want to read more about the proposal, I suggest you start with the 
short paper (4 pages) written for the poster [4].

Best,
Olaf

 http://olafhartig.de


[1] Olaf Hartig and Bryan Thompson: Foundations of an Alternative Approach to 
Reification in RDF. In CoRR abs/1406.3399, 2014.
http://arxiv.org/pdf/1406.3399

[2] Olaf Hartig: Reconciliation of RDF* and Property Graphs. In CoRR
abs/1409.3288, 2014.
http://arxiv.org/pdf/1409.3288

[3] Olaf Hartig: Foundations of RDF* and SPARQL* - An Alternative Approach to 
Statement-Level Metadata in RDF. In Proceedings of the 11th Alberto Mendelzon 
International Workshop on Foundations of Data Management (AMW), 2017.
http://olafhartig.de/files/Hartig_AMW2017_RDFStar.pdf

[4] Olaf Hartig: RDF* and SPARQL*: An Alternative Approach to Annotate 
Statements in RDF. Poster Session at the 16th International Semantic Web 
Conference (ISWC), 2017.
http://olafhartig.de/files/Hartig_ISWC2017_RDFStarPosterPaper.pdf

[5] https://wiki.blazegraph.com/wiki/index.php/Reification_Done_Right



On Friday, 6 April 2018, at 12:35:18 CEST, Dave Raggett wrote:
> Hi Chris,
> 
> Thanks for raising this question. More broadly, there are plenty of
> different kinds of annotations that could be applied to triples or quads,
> e.g. temporal, spatial, data quality, provenance, trust and so forth. It
> would be interesting to gather more details about the various use cases and
> how people have addressed them, along with the relationship to other graph
> formalisms including property graphs.  I am at an early stage of planning
> for a W3C workshop on this topic and others, with a view to building upon
> the experience of two decades of RDF and Linked Data. This is under my role
> as the W3C staff lead for work on web data standards.
> 
> Best regards,
>      Dave
> 
> > On 5 Apr 2018, at 18:24, Chris Mungall <cjmungall@lbl.gov> wrote:
> > 
> > Graph databases that use a property-graph model such as neo4j have a
> > certain level of popularity. Many people are storing ontologies and
> > knowledge graphs in these.
> > 
> > I'm not really interested in discussing pros/cons here, but am instead
> > wondering if there is interest in standards or best practices for mapping
> > RDF/OWL to PGs (or if there are efforts I am missing). The key
> > mathematical difference between RDF and PGs is edge properties, but there
> > are many other differences in practical implementations, e.g. URIs
> > typically not first-class.
> > 
> > I'm in the position of dealing with multiple neo4js from different groups
> > each with their own interesting ways of tackling this. I'm able to
> > standardize this set but would like this to be part of a broader effort.
> > 
> > Examples of design decisions:
> > 
> > - subClassOf-some-values-from: 4 edges (RDF) vs 1 edge? How to encode the
> >   axiom pattern as edge properties?
> > - Make URIs the node ID, or have a special property?
> > - Bake in CURIEs as properties vs contract/expand as part of surrounding
> >   infrastructure?
> > - Direct reification vs map to edge properties?
> > - Annotation property assertions: edges or node properties?
> > - How to handle reification on triples where the object is a literal and
> >   the PG node properties are simple maps
> > - non-"follow-me" axioms like owl:disjointWith. Direct edges or alternate
> >   representation?
> > - How to map named graphs to a 'flat' graph space. Duplicate nodes vs edge
> >   and node properties?
> > - Store-specific concerns; e.g. populating 'label' in neo4j
> > 
> > (and yes, I know many of these things are arguably problems that go away
> > if you just use RDF directly, but if you want to have that discussion I
> > suggest starting a separate thread).
> > 
> > Of course, there are many assumptions baked in to how we might want to
> > decide on the above. OWL and property graphs serve different use cases.
> > You tend to want to avoid certain design patterns in non-RDF graph
> > databases since there are frequently implicit assumptions involving graph
> > traversal. Yet there is a lot in common, and it seems to make sense to
> > avoid a proliferation of mappings. Even if there are too many use cases
> > to define a standard mapping, a best practices document (a la the n-ary
> > patterns W3C note) would be most welcome.
> > 
> > We have an ontology service layer on top of neo4j
> > (https://github.com/SciGraph/SciGraph) that implements a set of mappings
> > from OWL described here:
> > 
> > https://github.com/SciGraph/SciGraph/wiki/Neo4jMapping (looks a bit
> > ugly, it's all generated from junit tests)
> > 
> > In retrospect there are some things I would do differently. For example,
> > avoiding blank nodes as much as possible, especially for existential
> > restrictions. But I put this up as a strawman.
> > 
> > Are there efforts I am missing here? If not, are others interested and how
> > should we proceed? Does it make sense to aim for a W3C note, or just
> > start with a shared google doc?
> 
> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
> W3C Data Activity Lead & W3C champion for the Web of Things
Received on Friday, 6 April 2018 20:17:05 UTC