Standards for storing RDF/OWL in a property graph?

Graph databases that use a property-graph (PG) model, such as neo4j, 
have a certain level of popularity, and many people are storing 
ontologies and knowledge graphs in them.

I'm not really interested in discussing pros/cons here, but am instead 
wondering if there is interest in standards or best practices for 
mapping RDF/OWL to PGs (or if there are efforts I am missing). The key 
mathematical difference between RDF and PGs is edge properties, but 
there are many other differences in practical implementations, e.g. URIs 
are typically not first-class.
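
To make the edge-property point concrete, here is a minimal sketch of 
collapsing a reified RDF statement into a single PG edge that carries 
the extra assertions as properties. It is plain Python with no graph 
library; the Edge class, the reified_to_edge helper, the example.org 
namespace and the "confidence" property are all made up for 
illustration, not taken from any existing mapping.

```python
from dataclasses import dataclass, field

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


@dataclass
class Edge:
    """A property-graph edge: endpoints, a label, and a property map."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


def reified_to_edge(triples, stmt):
    """Collapse the rdf:Statement node `stmt` into one Edge.

    rdf:subject/predicate/object become the edge itself; every other
    triple about `stmt` (except rdf:type) becomes an edge property.
    """
    core, props = {}, {}
    for s, p, o in triples:
        if s != stmt:
            continue
        if p in (RDF + "subject", RDF + "predicate", RDF + "object"):
            core[p[len(RDF):]] = o
        elif p != RDF + "type":
            props[p] = o
    return Edge(core["subject"], core["predicate"], core["object"], props)


# Five triples in RDF reification ...
triples = [
    ("_:s1", RDF + "type", RDF + "Statement"),
    ("_:s1", RDF + "subject", EX + "geneA"),
    ("_:s1", RDF + "predicate", EX + "interactsWith"),
    ("_:s1", RDF + "object", EX + "geneB"),
    ("_:s1", EX + "confidence", "0.9"),  # made-up annotation
]

# ... become one edge with one property.
edge = reified_to_edge(triples, "_:s1")
```

Note that the URIs end up as plain strings here, which is exactly the 
"not first-class" situation: nothing in the PG model itself knows they 
are identifiers.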

I'm in the position of dealing with multiple neo4j instances from 
different groups, each with its own interesting way of tackling this. 
I'm able to standardize this set, but would like this to be part of a 
broader effort.

Examples of design decisions:

  - subClassOf-some-values-from: 4 edges (RDF) vs 1 edge? How to encode 
the axiom pattern as edge properties?
  - Make URIs the node ID, or have a special property?
  - Bake in CURIEs as properties vs contract/expand as part of 
surrounding infrastructure?
  - Direct reification vs map to edge properties?
  - Annotation property assertions: edges or node properties?
  - How to handle reification on triples where the object is a literal 
and the PG node properties are simple maps?
  - non-"follow-me" axioms like owl:disjointWith. Direct edges or 
alternate representation?
  - How to map named graphs to a 'flat' graph space. Duplicate nodes vs 
edge and node properties?
  - Store-specific concerns; e.g. populating 'label' in neo4j
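
As an illustration of the CURIE question above, this is roughly what 
"contract/expand as part of surrounding infrastructure" could look like 
if the store holds full URIs and CURIEs are computed on the way in and 
out. It is only a sketch: the PREFIXES map, the EX prefix and the 
expand/contract function names are assumptions of mine, not an existing 
API.

```python
# Hypothetical prefix map; a real deployment would load this from
# shared configuration rather than hard-coding it.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "EX": "http://example.org/EX_",  # made-up ontology namespace
}


def expand(curie, prefixes=PREFIXES):
    """Turn 'prefix:local' into a full URI; return the input unchanged
    if the prefix is unknown (e.g. the input is already a URI)."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return curie


def contract(uri, prefixes=PREFIXES):
    """Turn a full URI into a CURIE, preferring the longest matching
    namespace; return the input unchanged if nothing matches."""
    best = None
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            if best is None or len(namespace) > len(prefixes[best]):
                best = prefix
    if best is None:
        return uri
    return best + ":" + uri[len(prefixes[best]):]
```

The alternative in the same bullet, baking CURIEs in as properties, 
would instead run contract() once at load time and store the result on 
the node, at the cost of tying the stored data to one prefix map.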

(and yes, I know many of these things are arguably problems that go away 
if you just use RDF directly, but if you want to have that discussion I 
suggest starting a separate thread).

Of course, there are many assumptions baked into how we might want to 
decide on the above. OWL and property graphs serve different use cases. 
You tend to want to avoid certain design patterns in non-RDF graph 
databases since there are frequently implicit assumptions involving 
graph traversal. Yet there is a lot in common, and it seems to make 
sense to avoid a proliferation of mappings. Even if there are too many 
use cases to define a standard mapping, a best practices document (a la 
the n-ary patterns W3C note) would be most welcome.

We have an ontology service layer on top of neo4j 
(https://github.com/SciGraph/SciGraph) that implements a set of mappings 
from OWL described here:

https://github.com/SciGraph/SciGraph/wiki/Neo4jMapping

(it looks a bit ugly; it's all generated from JUnit tests)

In retrospect there are some things I would do differently. For example, 
I would avoid blank nodes as much as possible, especially for 
existential restrictions. But I put this up as a strawman.
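
For what avoiding the blank node might mean in practice, here is a 
sketch that collapses the four-triple RDF encoding of "A subClassOf 
(p some B)" into one direct edge. This is one possible encoding, not 
what SciGraph currently does; the "quantifier" edge property, the 
function name and the example.org terms are invented for the example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def collapse_some_values_from(triples):
    """Return one (subclass, property, filler, edge_properties) tuple
    per existential restriction that is the object of rdfs:subClassOf."""
    restrictions = {s for s, p, o in triples
                    if p == RDF + "type" and o == OWL + "Restriction"}
    on_property = {s: o for s, p, o in triples if p == OWL + "onProperty"}
    fillers = {s: o for s, p, o in triples if p == OWL + "someValuesFrom"}

    edges = []
    for s, p, o in triples:
        if (p == RDFS + "subClassOf" and o in restrictions
                and o in on_property and o in fillers):
            # The axiom pattern is recorded on the edge, so the blank
            # node never needs to exist in the property graph.
            edges.append((s, on_property[o], fillers[o],
                          {"quantifier": "some"}))
    return edges


# "finger subClassOf (part_of some hand)" as four RDF triples.
triples = [
    (EX + "finger", RDFS + "subClassOf", "_:r1"),
    ("_:r1", RDF + "type", OWL + "Restriction"),
    ("_:r1", OWL + "onProperty", EX + "part_of"),
    ("_:r1", OWL + "someValuesFrom", EX + "hand"),
]

edges = collapse_some_values_from(triples)
```

The result is a single finger -[part_of {quantifier: "some"}]-> hand 
edge, which a traversal can follow directly. Other restriction types 
(allValuesFrom, cardinalities) would need their own property values or 
a different treatment.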

Are there efforts I am missing here? If not, are others interested and 
how should we proceed? Does it make sense to aim for a W3C note, or just 
start with a shared Google doc?

Received on Thursday, 5 April 2018 17:24:54 UTC