LPG-RDF1.2 interoperability is already nontrivial due to weakness of LPG edge-property vs RDF1.2 statement-about-statement

(Let me admit first that, for me, "edges" are fundamental, not triples, and, IMHO, not having user-friendly support for edges, and hence edge-properties and parallel edges, is a major deficiency in RDF1.1 that has let LPG zoom past RDF in popularity.)

The debate over whether rdf:reifies should be single-valued or multi-valued is weighing the implications for interoperability between LPG and RDF1.2. In that context, we need to take into account that the fundamental LPG restriction that "edge-properties can only connect an edge to a scalar value" already makes conversion of RDF1.2 to LPG difficult, because RDF1.2 supports an unrestricted "statement about statement" capability. Let me explain.

Based on the latest thoughts on RDF1.2, each "edge" will have an associated id ("reifier") that can be used to say anything about the edge. Besides allowing an edge to be connected to a scalar value, this capability also allows the data creator to create an edge that connects a pair of edges, or a vertex and an edge, without any restructuring of pre-existing data. LPG, with its restriction that an edge-property can only connect an edge to a scalar value, cannot handle connecting a pair of edges, or an edge with a vertex, so easily. It has to restructure the pre-existing data, jeopardizing the validity of pre-existing queries. I have illustrated this with a simple example below. (I had illustrated this deficiency of LPG with a similar example in the "Appendix: A Complete Example" section of [1] as well.)

So, capability-wise, I'll argue that LPG is less capable than RDF1.2, and, regardless of whether we decide to restrict rdf:reifies to being single-valued or not, this deficiency in LPG will affect interoperability unless LPG is extended beyond what it is today.

Example to support my argument that LPG is less capable than RDF1.2:
=========================================================
Consider this "edge connecting two edges" scenario: We know initially the facts that John donated to a campaign (probably millions of dollars :-)) and that John was appointed (later) an ambassador (to some country). Later, some analysts connected the two facts, asserting that the first fact has influenced the second.

In RDF1.2, initially, we store the following two edges (here, I use an edge to represent a named binary relationship instance, not just a reification):
-------------
      :e1 rdf:reifies <<( :john :donatesTo :campaign )>> .
      :e2 rdf:reifies <<( :john :appointedAs :ambassador )>> .

Later, we add the following to represent the "influenced" relationship:
      :e3 rdf:reifies <<( :e1 :influenced :e2 )>> .

Since this involved only addition of a new edge, and no restructuring of the original edges, none of the pre-existing queries are affected.
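
To make the "no pre-existing query is affected" point concrete, here is a minimal Python sketch. It is only a toy model under my own assumptions, not an RDF1.2 implementation: each edge is a (reifier, subject, predicate, object) tuple, and the helper name edges_with_predicate is hypothetical, standing in for a pre-existing query.

```python
# Toy model (an assumption for illustration): a graph is a set of
# (reifier, subject, predicate, object) tuples, loosely mirroring
# ":e rdf:reifies <<( s p o )>>". This is not an RDF 1.2 library.

def edges_with_predicate(graph, predicate):
    """Hypothetical pre-existing query: (subject, object) pairs for a predicate."""
    return sorted((s, o) for (_reifier, s, p, o) in graph if p == predicate)

graph = {
    (":e1", ":john", ":donatesTo", ":campaign"),
    (":e2", ":john", ":appointedAs", ":ambassador"),
}

# Result of the pre-existing query before the new edge is added.
before = edges_with_predicate(graph, ":donatesTo")

# Later: add one edge whose endpoints are the reifiers of the two
# existing edges. Nothing already stored is modified.
graph.add((":e3", ":e1", ":influenced", ":e2"))

# The same query, re-run after the addition.
after = edges_with_predicate(graph, ":donatesTo")
influences = edges_with_predicate(graph, ":influenced")
```

In this sketch, before and after are the same list, because the addition only grows the set and the query filters on a predicate that the new edge does not use.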

In LPG, however, one cannot implement this without a major restructuring, involving "vertexification" of the edges:
---------
Initially, LPG would use the following two edges:
      (john) -[:donatesTo]-> (campaign)
      (john) -[:appointedAs]-> (ambassador)

Later, when the need arises to connect these two edges via the "influenced" relationship, LPG has to "vertexify" the original edges to get this done:
      (donatesToVertex) -[:donator]-> (john)
      (donatesToVertex) -[:receiver]-> (campaign)
      (appointedAsVertex) -[:appointee]-> (john)
      (appointedAsVertex) -[:role]-> (ambassador)

      (donatesToVertex) -[:influenced]-> (appointedAsVertex)

This restructuring, in the LPG case, completely invalidates all pre-existing queries that relied upon the edge labels :donatesTo and :appointedAs. All such queries now have to be redesigned.
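
The breakage can be sketched in a few lines of Python. Again, this is a toy model under my own assumptions, not any real LPG engine: edges are (source, label, target) tuples, and the helper names match_label and donations are hypothetical stand-ins for an old label-based query and its redesigned replacement.

```python
# Toy model (an assumption for illustration): an LPG is a set of
# (source, label, target) tuples; vertices are plain strings.

def match_label(edges, label):
    """Hypothetical pre-existing query: (source, target) pairs for an edge label."""
    return sorted((src, dst) for (src, lbl, dst) in edges if lbl == label)

original = {
    ("john", "donatesTo", "campaign"),
    ("john", "appointedAs", "ambassador"),
}

# The same facts after "vertexification", plus the new influenced edge.
vertexified = {
    ("donatesToVertex", "donator", "john"),
    ("donatesToVertex", "receiver", "campaign"),
    ("appointedAsVertex", "appointee", "john"),
    ("appointedAsVertex", "role", "ambassador"),
    ("donatesToVertex", "influenced", "appointedAsVertex"),
}

old_result = match_label(original, "donatesTo")
new_result = match_label(vertexified, "donatesTo")  # the label no longer exists

def donations(edges):
    """Redesigned query: join donator and receiver edges on the event vertex."""
    donors = {src: dst for (src, lbl, dst) in edges if lbl == "donator"}
    receivers = {src: dst for (src, lbl, dst) in edges if lbl == "receiver"}
    return sorted((donors[v], receivers[v]) for v in donors if v in receivers)

rewritten_result = donations(vertexified)
```

Here the old label-based query finds nothing in the restructured graph, and the same answer can only be recovered by a differently shaped query that joins two edges through the new vertex.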

Thanks,
Souri.

[1] https://blogs.oracle.com/oraclespatial/post/modeling-evolving-data-in-graphs-the-power-of-rdf-quads

Received on Thursday, 11 April 2024 15:42:53 UTC