- From: Chris Mungall <cjmungall@lbl.gov>
- Date: Thu, 05 Apr 2018 10:24:20 -0700
- To: "W3C Semantic Web IG" <semantic-web@w3.org>
- Message-ID: <FFEDA473-9EDF-4605-B359-F3A22119ABD7@lbl.gov>
Graph databases that use a property-graph model such as neo4j have a certain level of popularity. Many people are storing ontologies and knowledge graphs in these. I'm not really interested in discussing pros/cons here, but am instead wondering if there is interest in standards or best practices for mapping RDF/OWL to PGs (or if there are efforts I am missing).

The key mathematical difference between RDF and PGs is edge properties, but there are many other differences in practical implementations, e.g. URIs typically not first-class.

I'm in the position of dealing with multiple neo4js from different groups each with their own interesting ways of tackling this. I'm able to standardize this set but would like this to be part of a broader effort.

Examples of design decisions:

- subClassOf-some-values-from: 4 edges (RDF) vs 1 edge? How to encode the axiom pattern as edge properties?
- Make URIs the node ID, or have a special property?
- Bake in CURIEs as properties vs contract/expand as part of surrounding infrastructure?
- Direct reification vs map to edge properties?
- Annotation property assertions: edges or node properties?
- How to handle reification on triples where the object is a literal and the PG node properties are simple maps
- non-"follow-me" axioms like owl:disjointWith. Direct edges or alternate representation?
- How to map named graphs to a 'flat' graph space. Duplicate nodes vs edge and node properties?
- Store-specific concerns; e.g. populating 'label' in neo4j

(and yes, I know many of these things are arguably problems that go away if you just use RDF directly, but if you want to have that discussion I suggest starting a separate thread).

Of course, there are many assumptions baked in to how we might want to decide on the above. OWL and property graphs serve different use cases. You tend to want to avoid certain design patterns in non-RDF graph databases since there are frequently implicit assumptions involving graph traversal.
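To make the first design decision above concrete, here is a rough Python sketch of one possible answer: collapsing the four-triple subClassOf-some-values-from pattern into a single property-graph edge. It is only an illustration, not what SciGraph or any particular store does. The `ex:` identifiers, the `Edge` class, and the `axiom_pattern` edge property are all made up for the example, and triples are plain tuples rather than anything from an RDF library.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A property-graph edge: source, label, target, plus a property map."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


# RDF serialization of "Finger subClassOf (part_of some Hand)": 4 triples.
# The ex: terms are hypothetical example identifiers.
triples = [
    ("ex:Finger", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:part_of"),
    ("_:r1", "owl:someValuesFrom", "ex:Hand"),
]


def collapse_existential_restrictions(triples):
    """Turn each subClassOf triple into one edge.

    If the object is a blank node describing an existential restriction,
    emit a single edge labelled with the restricted property; otherwise
    emit a plain subClassOf edge.
    """
    # Index predicate -> object per subject (single-valued, enough for a sketch).
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o

    edges = []
    for s, p, o in triples:
        if p != "rdfs:subClassOf":
            continue
        restriction = by_subject.get(o, {})
        is_existential = (
            o.startswith("_:")
            and restriction.get("rdf:type") == "owl:Restriction"
            and "owl:onProperty" in restriction
            and "owl:someValuesFrom" in restriction
        )
        if is_existential:
            # "axiom_pattern" is a hypothetical property name recording
            # which OWL pattern this edge stands for.
            edges.append(Edge(
                s,
                restriction["owl:onProperty"],
                restriction["owl:someValuesFrom"],
                {"axiom_pattern": "subClassOf-some-values-from"},
            ))
        else:
            edges.append(Edge(s, "rdfs:subClassOf", o))
    return edges


edges = collapse_existential_restrictions(triples)
for edge in edges:
    print(edge)
```

Recording the pattern as an edge property keeps the graph traversable in one hop while still letting you tell an existential edge apart from, say, a universal one, at the cost of the mapping no longer being a plain triple-for-edge translation.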
Yet there is a lot in common, and it seems to make sense to avoid a proliferation of mappings. Even if there are too many use cases to define a standard mapping, a best practices document (a la the n-ary patterns W3C note) would be most welcome.

We have an ontology service layer on top of neo4j (https://github.com/SciGraph/SciGraph) that implements a set of mappings from OWL described here:

https://github.com/SciGraph/SciGraph/wiki/Neo4jMapping

(looks a bit ugly, it's all generated from junit tests)

In retrospect there are some things I would do differently. For example, avoiding blank nodes as much as possible, especially for existential restrictions. But I put this up as a strawman.

Are there efforts I am missing here? If not, are others interested and how should we proceed? Does it make sense to aim for a W3C note, or just start with a shared google doc?
Received on Thursday, 5 April 2018 17:24:54 UTC