Re: Standards for storing RDF/OWL in a property graph? from Olaf Hartig on 2018-04-08 (semantic-web@w3.org from April 2018)

From: Olaf Hartig <olaf.hartig@liu.se>
Date: Sun, 8 Apr 2018 10:45:36 +0200
To: Dave Raggett <dsr@w3.org>
CC: <semantic-web@w3.org>, Chris Mungall <cjmungall@lbl.gov>
Message-ID: <3036933.Pkv1uB7kpp@porty3>

Hi Dave,

Use cases for what?

If you mean use cases for RDF* as an intermediate model for reconciling RDF 
and PGs, Chris mentioned such a use case in his initial email in this thread. 
Other use cases could stem from a desire to manage RDF data in a graph 
database system in order to use some of its query features (e.g., shortest 
paths) while still being able to use SPARQL.
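
A minimal sketch of the "intermediate model" idea, assuming a deliberately
simplified encoding: a triple is a Python tuple (s, p, o), and a nested triple
carries another triple in its subject position. The Edge class and the
rdfstar_to_pg function are hypothetical names used for illustration only; this
is not the formal mapping defined in the papers cited in this thread, and it
ignores details such as literals becoming node properties.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """One property graph edge; 'properties' holds the edge properties."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


def rdfstar_to_pg(triples):
    """Turn plain and nested triples into property graph edges.

    A plain triple (s, p, o) becomes an edge s -[p]-> o.
    A nested triple ((s, p, o), p2, o2) becomes the edge property
    p2 = o2 on the edge for (s, p, o). Only one level of nesting,
    in subject position, is handled here (an assumption of this sketch).
    """
    edges = {}
    for s, p, o in triples:
        if isinstance(s, tuple):
            # Statement about a triple: attach it to that triple's edge.
            edge = edges.setdefault(s, Edge(*s))
            edge.properties[p] = o
        else:
            edges.setdefault((s, p, o), Edge(s, p, o))
    return list(edges.values())


# Usage: one edge annotated with a certainty score.
data = [
    ("ex:alice", "ex:knows", "ex:bob"),
    (("ex:alice", "ex:knows", "ex:bob"), "ex:certainty", 0.9),
]
for e in rdfstar_to_pg(data):
    print(e)
```

Once the data is in this edge-with-properties form, it can be loaded into a
graph database system, while the nested-triple form remains available for
SPARQL-style querying.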

If you mean use cases for RDF* & SPARQL* as an approach to represent and query 
statement-level metadata in the RDF context, then it is exactly that: any type 
of metadata on the level of single triples (rather than sets thereof). For 
instance, it could be certainty statements, trustworthiness scores, temporal 
annotations, and of course provenance information. For the latter, in a 
similar discussion on the PROV mailing list, I have outlined how easy it is to 
use RDF* with the PROV vocabulary:

http://lists.w3.org/Archives/Public/public-prov-comments/2017Oct/0003.html
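
To make "metadata on the level of single triples" concrete, here is a rough
sketch of how a nested triple could be unfolded into ordinary triples using
the standard RDF reification vocabulary (rdf:Statement, rdf:subject,
rdf:predicate, rdf:object). The tuple encoding, the unfold function and the
_:stmtN blank node labels are assumptions made for this illustration, not the
exact transformation defined in the cited papers. Whether the annotated triple
itself is also asserted is left to the input data.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def unfold(triples):
    """Rewrite nested triples into plain triples via rdf:Statement nodes.

    Input triples are tuples (s, p, o); a nested triple has a tuple as
    its subject. Only one level of nesting in subject position is
    handled (an assumption of this sketch).
    """
    nodes = {}        # nested triple -> blank node label
    fresh = count(1)  # counter for generating blank node labels
    out = []

    def statement_node(triple):
        # Emit the four reification triples once per distinct triple.
        if triple not in nodes:
            node = f"_:stmt{next(fresh)}"
            nodes[triple] = node
            s, p, o = triple
            out.extend([
                (node, RDF + "type", RDF + "Statement"),
                (node, RDF + "subject", s),
                (node, RDF + "predicate", p),
                (node, RDF + "object", o),
            ])
        return nodes[triple]

    for s, p, o in triples:
        if isinstance(s, tuple):
            s = statement_node(s)
        out.append((s, p, o))
    return out


# Usage: a certainty score attached to a single triple.
annotated = [(("ex:alice", "ex:knows", "ex:bob"), "ex:certainty", "0.9")]
for t in unfold(annotated):
    print(t)
```

The one-line nested form and the five-triple unfolded form carry the same
annotation, which illustrates why the nested syntax is the more concise way
to write it.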

Best,
Olaf


On Saturday, 7 April 2018 at 12:05:51 CEST, Dave Raggett wrote:
> Hi Olaf,
> 
> That sounds interesting and worthy of further discussion. Do you have any
> information you can share on the use cases you’ve considered, and any
> experience you can share with us?
> 
> > On 6 Apr 2018, at 21:15, Olaf Hartig <olaf.hartig@liu.se> wrote:
> > 
> > Hi Chris, Dave,
> > 
> > I have been working on these topics at a conceptual level for some time
> > now (topics such as statement-level metadata in RDF and mappings between
> > RDF graphs and PGs). Most of these efforts are centered around a proposal
> > that I call RDF* and SPARQL*. The idea of RDF* is to extend RDF with the
> > possibility to have triples as the subject or the object of other triples
> > (i.e., nested triples), and SPARQL* is a corresponding extension of
> > SPARQL in which triple patterns can be nested. You may think of these
> > extensions purely as syntactic sugar or, alternatively, as an actual
> > logical extension of the RDF data model. In my work I have provided the
> > foundation of both of these perspectives [1,2,3,4].
> > 
> > How is this related to statement-level metadata in RDF?
> > A nested triple can be interpreted to be a statement *about* the triple in
> > its subject (or object) position, and SPARQL* is a language in which
> > queries about RDF data with statement-level metadata can be expressed in
> > a very concise way. In this context I should mention that RDF* data and
> > SPARQL* queries can be mapped to standard RDF data and SPARQL queries
> > that use the RDF reification vocabulary (or related approaches) [3],
> > which makes it possible to support RDF* and SPARQL* via a small wrapper
> > on top of any ordinary triple store.
> > 
> > How is this related to mappings between RDF and the PG model?
> > First, notice that nested triples can be understood to be a form of edge-
> > properties, which is the main feature of PGs that is missing from RDF.
> > Hence, RDF* can serve nicely as an intermediate model for reconciling RDF
> > and PGs. As a basis of such a reconciliation, I have provided generic
> > mappings between RDF* and PGs [2].
> > 
> > I should also mention that the folks at Blazegraph have played an
> > essential part in shaping the RDF* & SPARQL* proposal, and the Blazegraph
> > triple store supports it [5] (they have been calling this feature
> > "Reification Done Right", or RDR ;)
> > Other triple store vendors have also expressed their interest in this
> > proposal. Additionally, a poster that I presented about it at last year's
> > ISWC was voted to receive the "people's choice best poster award" at that
> > conference, which seems to indicate that there is also some community
> > interest in this proposal.
> > 
> > If you want to read more about the proposal, I suggest you start with the
> > short paper (4 pages) written for the poster [4].
> > 
> > Best,
> > Olaf
> > 
> > http://olafhartig.de
> > 
> > 
> > [1] Olaf Hartig and Bryan Thompson: Foundations of an Alternative Approach
> > to Reification in RDF. In CoRR abs/1406.3399, 2014.
> > http://arxiv.org/pdf/1406.3399
> > 
> > [2] Olaf Hartig: Reconciliation of RDF* and Property Graphs. In CoRR
> > abs/1409.3288, 2014.
> > http://arxiv.org/pdf/1409.3288
> > 
> > [3] Olaf Hartig: Foundations of RDF* and SPARQL* - An Alternative Approach
> > to Statement-Level Metadata in RDF. In Proceedings of the 11th Alberto
> > Mendelzon International Workshop on Foundations of Data Management (AMW),
> > 2017. http://olafhartig.de/files/Hartig_AMW2017_RDFStar.pdf
> > 
> > [4] Olaf Hartig: RDF* and SPARQL*: An Alternative Approach to Annotate
> > Statements in RDF. Poster Session at the 16th International Semantic Web
> > Conference (ISWC), 2017.
> > http://olafhartig.de/files/Hartig_ISWC2017_RDFStarPosterPaper.pdf
> > 
> > [5] https://wiki.blazegraph.com/wiki/index.php/Reification_Done_Right
> > 
> > On Friday, 6 April 2018 at 12:35:18 CEST, Dave Raggett wrote:
> >> Hi Chris,
> >> 
> >> Thanks for raising this question. More broadly, there are plenty of
> >> different kinds of annotations that could be applied to triples or quads,
> >> e.g. temporal, spatial, data quality, provenance, trust and so forth. It
> >> would be interesting to gather more details about the various use cases
> >> and how people have addressed them, along with the relationship to other
> >> graph formalisms including property graphs. I am at an early stage of
> >> planning for a W3C workshop on this topic and others, with a view to
> >> building upon the experience of two decades of RDF and Linked Data. This
> >> is under my role as the W3C staff lead for work on web data standards.
> >> 
> >> Best regards,
> >> 
> >>     Dave
> >>> 
> >>> On 5 Apr 2018, at 18:24, Chris Mungall <cjmungall@lbl.gov> wrote:
> >>> 
> >>> Graph databases that use a property-graph model such as neo4j have a
> >>> certain level of popularity. Many people are storing ontologies and
> >>> knowledge graphs in these.
> >>> 
> >>> I'm not really interested in discussing pros/cons here, but am instead
> >>> wondering if there is interest in standards or best practices for
> >>> mapping RDF/OWL to PGs (or if there are efforts I am missing). The key
> >>> mathematical difference between RDF and PGs is edge properties, but
> >>> there are many other differences in practical implementations, e.g. URIs
> >>> are typically not first-class.
> >>> 
> >>> I'm in the position of dealing with multiple neo4js from different
> >>> groups, each with their own interesting ways of tackling this. I'm able
> >>> to standardize this set but would like this to be part of a broader
> >>> effort.
> >>> 
> >>> Examples of design decisions:
> >>> 
> >>> - subClassOf-some-values-from: 4 edges (RDF) vs 1 edge? How to encode the
> >>>   axiom pattern as edge properties?
> >>> - Make URIs the node ID, or have a special property?
> >>> - Bake in CURIEs as properties vs contract/expand as part of surrounding
> >>>   infrastructure?
> >>> - Direct reification vs map to edge properties?
> >>> - Annotation property assertions: edges or node properties?
> >>> - How to handle reification on triples where the object is a literal and
> >>>   the PG node properties are simple maps?
> >>> - Non-"follow-me" axioms like owl:disjointWith: direct edges or alternate
> >>>   representation?
> >>> - How to map named graphs to a 'flat' graph space? Duplicate nodes vs edge
> >>>   and node properties?
> >>> - Store-specific concerns, e.g. populating 'label' in neo4j
> >>> 
> >>> (And yes, I know many of these things are arguably problems that go away
> >>> if you just use RDF directly, but if you want to have that discussion I
> >>> suggest starting a separate thread.)
> >>> 
> >>> Of course, there are many assumptions baked into how we might want to
> >>> decide on the above. OWL and property graphs serve different use cases.
> >>> You tend to want to avoid certain design patterns in non-RDF graph
> >>> databases since there are frequently implicit assumptions involving
> >>> graph traversal. Yet there is a lot in common, and it seems to make sense
> >>> to avoid a proliferation of mappings. Even if there are too many use
> >>> cases to define a standard mapping, a best practices document (a la the
> >>> n-ary patterns W3C note) would be most welcome.
> >>> 
> >>> We have an ontology service layer on top of neo4j
> >>> (https://github.com/SciGraph/SciGraph) that implements a set of mappings
> >>> from OWL described here:
> >>> 
> >>> https://github.com/SciGraph/SciGraph/wiki/Neo4jMapping (looks a bit
> >>> ugly, it's all generated from junit tests)
> >>> 
> >>> In retrospect there are some things I would do differently. For example,
> >>> avoiding blank nodes as much as possible, especially for existential
> >>> restrictions. But I put this up as a strawman.
> >>> 
> >>> Are there efforts I am missing here? If not, are others interested and
> >>> how should we proceed? Does it make sense to aim for a W3C note, or just
> >>> start with a shared google doc?
> >> 
> >> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
> >> W3C Data Activity Lead & W3C champion for the Web of things
> 
> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
> W3C Data Activity Lead & W3C champion for the Web of things
Received on Sunday, 8 April 2018 08:46:39 UTC