Re: do Property Graphs always assert annotated arcs?

Martynas,

On Thu, 2019-09-19 at 13:22 +0200, Martynas Jusevičius wrote:
> Hi,
> 
> let me play a devil's advocate a little.
> 
> Rather than having a new data model that accommodates both RDF and
> PGs, would it not suffice to define a round-trip transformation
> between the two? In the spirit of GRDDL, R2RML, and CSV on the Web
> specifications.

If we set aside the aim of also providing an alternative approach to
representing and querying statement-level metadata and triple
annotations in the RDF context, and are interested only in converting
data between RDF and PGs, then, yes, defining such transformations
might be sufficient.

By the way, in a recent paper I introduce a formal definition of a
mapping of Labeled Property Graphs to RDF* graphs [1]. By combining
this mapping with the RDF*-to-RDF mapping that I introduced in one of
my earlier papers [2], you have a formal foundation for the
transformation you are asking for (at least in one direction). For the
other direction, I defined some mappings in a technical report [3].
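
To make the round-trip idea concrete, here is a minimal Python sketch.
It is not the formal mapping defined in [1], [2], or [3]; it is only an
illustration under simplifying assumptions: nodes and property values
are plain strings, every asserted triple is treated as an edge, and at
most one edge exists per (source, label, target) combination. The names
Edge, pg_to_rdf_star, and rdf_star_to_pg are hypothetical. A nested
triple is modelled as a tuple in the subject position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    label: str
    target: str
    properties: tuple = ()  # sorted (key, value) pairs


def pg_to_rdf_star(edges):
    """Emit the base triple of each edge plus one annotation per edge property."""
    triples = set()
    for e in edges:
        base = (e.source, e.label, e.target)
        triples.add(base)  # in a PG, an annotated edge always exists
        for key, value in e.properties:
            triples.add((base, key, value))  # nested triple as subject
    return triples


def rdf_star_to_pg(triples):
    """Rebuild edges; annotations on non-asserted triples are returned separately."""
    asserted = {t for t in triples if not isinstance(t[0], tuple)}
    props, orphans = {}, []
    for s, p, o in triples:
        if isinstance(s, tuple):
            if s in asserted:
                props.setdefault(s, []).append((p, o))
            else:
                # A PG has no way to attach a property to a missing edge.
                orphans.append((s, p, o))
    edges = [Edge(*base, tuple(sorted(props.get(base, []))))
             for base in sorted(asserted)]
    return edges, orphans


edge = Edge("alice", "spouse", "bob",
            (("since", "2001-02-03"), ("until", "2004-05-06")))
edges, orphans = rdf_star_to_pg(pg_to_rdf_star([edge]))
```

In this sketch the PG-to-RDF* direction always round-trips, whereas the
reverse direction can lose information: an annotation on a triple that
is not itself in the graph ends up in orphans, because it has no
counterpart edge on the PG side.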

Olaf

[1] O. Hartig: "Foundations to Query Labeled Property Graphs using
SPARQL*." In Proceedings of the 1st Int. Workshop on Approaches for
Making Data Interoperable (AMAR), Sep. 2019.
http://olafhartig.de/files/Hartig_AMAR2019_Preprint.pdf

[2] O. Hartig: "Foundations of RDF* and SPARQL* - An Alternative
Approach to Statement-Level Metadata in RDF." In Proceedings of the
11th Alberto Mendelzon Int. Workshop on Foundations of Data Management
(AMW), Jun. 2017.
http://olafhartig.de/files/Hartig_AMW2017_RDFStar.pdf

[3] O. Hartig: "Reconciliation of RDF* and Property Graphs." In CoRR
abs/1409.3288, Sep. 2014.
http://arxiv.org/pdf/1409.3288

> Even if as a temporary solution, I think it could provide some clarity
> re. how RDF and PG models map to each other.
> 
> There is a document that touches on these topics, but unfortunately
> does not really describe the process anyway:
> "ETL from RDF to Property Graph-A Field Guide"
> https://www.mitre.org/publications/technical-papers/etl-from-rdf-to-property-graph-a-field-guide
> 
> Martynas
> 
> On Thu, Sep 19, 2019 at 1:02 PM Olaf Hartig <olaf.hartig@liu.se>
> wrote:
> > 
> > Pierre-Antoine,
> > 
> > On Wed, 2019-09-11 at 19:08 +0200, Pierre-Antoine Champin wrote:
> > > Thanks Joshua and Jeff, for your answers.
> > > 
> > > That confirms my hunch, and reveals that bridging between RDF and PG
> > > may be more complex than I expected -- and, I think, other people on
> > > the list expected.
> > > 
> > > As I now see it, the existence of an edge in a PG may or may not
> > > translate to the corresponding triple in RDF. This depends on the
> > > attributes (if any) of that edge.
> > > 
> > > Now, am I right to assume that, in PG land, if the edge has no
> > > attribute, it will typically be considered asserted, but if I add an
> > > attribute 'until: 2000-01-01', it is not asserted anymore?
> > 
> > I think when comparing PGs and RDF/RDF*, it is not so important to
> > distinguish whether an edge in a PG--or, more precisely, whatever the
> > edge is supposed to represent--can be considered to be asserted or not.
> > In PGs, every edge that has attributes (edge properties) exists in the
> > graph. There is no way to associate attributes with a non-existent
> > edge. In contrast, in RDF, and also in RDF* (assuming SA mode), we can
> > make statements about a triple that is not part of the graph itself.
> > 
> > Olaf
> > 
> > 
> > > If that's the case, we have an additional problem to represent this
> > > in RDF land, because we can not infer anything from the *absence* of
> > > a triple (in this case, the 'until' annotation). We work under the
> > > open world assumption...
> > > 
> > > 
> > > On Fri 30 Aug 2019, 20:56 Jeff Lerman, <jeff.lerman@invitae.com>
> > > wrote:
> > > > Ah, that’s more in-line with much of the other discussion so far in
> > > > the group.
> > > > 
> > > > I would prefer a model in which it’s not possible to assert a
> > > > property on a non-existent edge.  RDF/SPARQL provide us at least
> > > > two ways to handle edges that, as a consequence of their
> > > > properties, should not be considered to “exist” from the
> > > > perspective of a query:
> > > > 
> > > >  1. Queries can be written to filter out any edges with properties
> > > >     that indicate that they are not valid (e.g., if we are
> > > >     interested in considering/recognizing only edges about
> > > >     marriages in existence at a given time, we should exclude those
> > > >     that have started after that time, or ended before that time)
> > > > 
> > > >  2. We can segregate those edges to a named graph which we exclude
> > > >     from SPARQL queries.
> > > > 
> > > > I have more to say about ways to handle/leverage named graphs,
> > > > which might make my 2nd suggestion more palatable, but not sure
> > > > that this is the right forum for that.  If the extension I have in
> > > > mind would solve this issue with RDF* though, maybe it is...
> > > > 
> > > >      Jeff Lerman AI Scientist Mobile: 510-495-4621 
> > > > www.invitae.com
> > > > 
> > > > 
> > > > On Fri, Aug 30, 2019 at 11:23 AM Olaf Hartig <olaf.hartig@liu.se>
> > > > wrote:
> > > > > Jeff,
> > > > > 
> > > > > These are great examples for cases in which the properties
> > > > > associated with edges in a graph may change over time without
> > > > > affecting the existence of the edges themselves. However, I think
> > > > > Pierre-Antoine's question was focusing on the opposite: does the
> > > > > existence of an edge property always assume the existence of the
> > > > > edge with which it is associated.
> > > > > 
> > > > > Olaf
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Jeff Lerman <jeff.lerman@invitae.com>
> > > > > To: Joshua Shinavier <joshsh@uber.com>
> > > > > Cc: Pierre-Antoine Champin <pierre-antoine.champin@univ-lyon1.fr>,
> > > > >     public-rdf-star@w3.org
> > > > > Sent: Fri, 30 Aug 2019 18:27
> > > > > Subject: Re: do Property Graphs always assert annotated arcs?
> > > > > 
> > > > > Hi all,
> > > > > 
> > > > > Most of my experience with graphs is with a frame-based approach
> > > > > that most closely resembles a triple-store - not explicitly RDF
> > > > > but close enough. I’ve been exploring both RDF/triple-stores and
> > > > > PGs as candidates to support a new project.  I’ve been following
> > > > > the RDF* discussion with interest.
> > > > > 
> > > > > For what it's worth, I wouldn’t assume that edge-metadata
> > > > > (edge-properties in PG world) must be asserted at the time an
> > > > > edge is asserted. There are a variety of scenarios in which one
> > > > > might wish to update that metadata, and I’m pretty sure there’s
> > > > > nothing technically preventing such updates in existing PG
> > > > > implementations.  For example, one might:
> > > > > 
> > > > >    - update metadata: alter the value of an already-asserted
> > > > >      property:value pair (e.g., a newer model indicates that the
> > > > >      weight of an edge should be adjusted from 0.2 to 0.8)
> > > > >    - add or subtract metadata: assert (or remove) a value for a
> > > > >      property that was previously un-populated (or populated), to
> > > > >      reflect new knowledge we have about a relationship.  The
> > > > >      change could be incremental and need not affect other
> > > > >      properties, so deleting-and-reasserting the edge with all of
> > > > >      the other pre-existing (and unaffected) properties would be
> > > > >      inappropriate.
> > > > > 
> > > > > —Jeff
> > > > > 
> > > > > 
> > > > > 
> > > > > Jeff Lerman
> > > > > 
> > > > > AI Scientist
> > > > > 
> > > > > Mobile: 510-495-4621
> > > > > 
> > > > > www.invitae.com
> > > > > 
> > > > > 
> > > > > On Thu, Aug 29, 2019 at 10:03 AM Joshua Shinavier
> > > > > <joshsh@uber.com> wrote:
> > > > > 
> > > > > > Hi Pierre,
> > > > > > 
> > > > > > Just a quick response from a representative "property graph"
> > > > > > user. I have not been active on this list so far, and actually
> > > > > > mistook your email for a gremlin-users post. So let me just say
> > > > > > what I would have said.
> > > > > > 
> > > > > > First of all, property graph frameworks are usually not
> > > > > > prescriptive about semantics, so your property-qualified edge
> > > > > > "means what you want it to mean". At the same time, it is
> > > > > > generally not the case that an edge qualified with a property
> > > > > > like "since" would be considered to be asserted, independently
> > > > > > of the property. A canonical example is the TinkerPop toy graph
> > > > > > <http://tinkerpop.apache.org/docs/current/reference/#graph-computing>,
> > > > > > which has a "weight" property on each edge. The edge
> > > > > > created{peter, lop} has a weight of 0.2, which basically means
> > > > > > that the statement "Peter is a creator of LOP" is a
> > > > > > non-assertion. I read your :since and :until example exactly as
> > > > > > you do: the statement spouse{alice, bob} is asserted
> > > > > > conditionally on a logical point in time.
> > > > > > 
> > > > > > Josh
> > > > > > 
> > > > > > 
> > > > > > On Thu, Aug 29, 2019 at 8:36 AM Pierre-Antoine Champin <
> > > > > > pierre-antoine.champin@univ-lyon1.fr> wrote:
> > > > > > 
> > > > > > > Hi all,
> > > > > > > 
> > > > > > > > here is a question for those on the list who have
> > > > > > > > discussed more than I have with Property Graph users.
> > > > > > > > 
> > > > > > > > There seems to be a consensus here that in PG, arcs with
> > > > > > > > metadata are asserted at the same time as they are
> > > > > > > > annotated. This is reflected in the PG interpretation of
> > > > > > > > RDF*, where:
> > > > > > > > 
> > > > > > > >     <<:alice :spouse :bob>> :since "2001-02-03"^^xsd:date .
> > > > > > > > 
> > > > > > > > asserts exactly two triples.
> > > > > > > > 
> > > > > > > > But as I understand, PG people are also likely to express
> > > > > > > > things like:
> > > > > > > > 
> > > > > > > >     <<:alice :spouse :bob>> :since "2001-02-03"^^xsd:date ;
> > > > > > > >         :until "2004-05-06"^^xsd:date .
> > > > > > > > 
> > > > > > > > if Alice and Bob eventually got divorced.
> > > > > > > > In that situation, the arc <<:alice :spouse :bob>> should
> > > > > > > > *no longer* be considered asserted in the graph.
> > > > > > > > 
> > > > > > > > Question: is this scenario a plausible one in a PG
> > > > > > > > context?
> > > > > > > 

Received on Friday, 20 September 2019 08:30:06 UTC