Re: do Property Graphs always assert annotated arcs? from Miel Vander Sande (UGent-imec) on 2019-09-23 (public-rdf-star@w3.org from September 2019)

From: Miel Vander Sande (UGent-imec) <Miel.VanderSande@UGent.be>
Date: Mon, 23 Sep 2019 08:02:16 +0000
To: Olaf Hartig <olaf.hartig@liu.se>
CC: Martynas Jusevičius <martynas@atomgraph.com>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Message-ID: <D9B14723-0109-4EED-A787-8B8F517666A8@ugent.be>
Hi all,

From an adoption POV, having transformations only clearly doesn’t cut it. We have had multiple languages already, but the gap remains. If it’s about improving the practicality of RDF, some more semantics and notation should be decided on. Bonus points: the conversion will be easier (as Olaf already formalised).

As for the right forum for Named Graphs, I suggest these discussions be held in the N3 community group: https://www.w3.org/community/n3-dev/.

IMHO, RDF* is completely independent from Named Graphs: the former annotates statements, the latter graphs (whatever the semantics of that may be). But let’s try not to convolute this mailing list with that discussion, because it risks keeping us from actually progressing with RDF*.

Best regards,

Miel Vander Sande
Postdoctoral Researcher at IDLab, Ghent University, in collaboration with imec

AA Tower | Technologiepark 19 9052 Ghent
www.idlab.technology
@Miel_vds




On 20 Sep 2019, at 10:29, Olaf Hartig <olaf.hartig@liu.se> wrote:

Martynas,

On Thu, 2019-09-19 at 13:22 +0200, Martynas Jusevičius wrote:
Hi,

let me play a devil's advocate a little.

Rather than having a new data model that accommodates both RDF and
PGs, would it not suffice to define a round-trip transformation
between the two? In the spirit of GRDDL, R2RML, and CSV on the Web
specifications.

If we ignore the aim to also provide an alternative approach to
represent and to query statement-level metadata and triple annotations
in the RDF context, and are interested only in converting data
between RDF and PGs, then, yes, defining such transformations might be
sufficient.

By the way, in a recent paper I introduce a formal definition of a
mapping of Labeled Property Graphs to RDF* Graphs [1]. Then, by
combining this mapping with the RDF*-to-RDF mapping that I have
introduced in some of my earlier papers [2], you have a formal
foundation for the transformation you are asking for (at least in one
direction). For the other direction, I defined some mappings in a
tech report [3].

Olaf

[1]  O. Hartig: "Foundations to Query Labeled Property Graphs using
SPARQL*." In Proceedings of the 1st Int. Workshop on Approaches for
Making Data Interoperable (AMAR), Sep. 2019.
http://olafhartig.de/files/Hartig_AMAR2019_Preprint.pdf


[2] O. Hartig: "Foundations of RDF* and SPARQL* - An Alternative
Approach to Statement-Level Metadata in RDF." In Proceedings of the
11th Alberto Mendelzon Int. Workshop on Foundations of Data Management
(AMW), Jun. 2017.
http://olafhartig.de/files/Hartig_AMW2017_RDFStar.pdf


[3] Olaf Hartig: "Reconciliation of RDF* and Property Graphs." In CoRR
abs/1409.3288, Sep. 2014.
http://arxiv.org/pdf/1409.3288




Even if only as a temporary solution, I think it could provide some
clarity re. how RDF and PG models map to each other.

There is a document that touches on these topics, but unfortunately
does not really describe the process anyway:
"ETL from RDF to Property Graph-A Field Guide"

https://www.mitre.org/publications/technical-papers/etl-from-rdf-to-property-graph-a-field-guide


Martynas

On Thu, Sep 19, 2019 at 1:02 PM Olaf Hartig <olaf.hartig@liu.se> wrote:

Pierre-Antoine,

On Wed, 2019-09-11 at 19:08 +0200, Pierre-Antoine Champin wrote:
Thanks Joshua and Jeff, for your answers.

That confirms my hunch, and reveals that bridging between RDF and PG
may be more complex than I expected -- and, I think, other people on
the list expected.

As I now see it, the existence of an edge in a PG may or may not
translate to the corresponding triple in RDF. This depends on the
attributes (if any) of that edge.

Now, am I right to assume that, in PG land, if the edge has no
attribute, it will typically be considered asserted, but if I add an
attribute 'until: 2000-01-01', it is not asserted anymore?

I think when comparing PGs and RDF/RDF*, it is not so important to
distinguish whether an edge in a PG--or, more precisely, whatever the
edge is supposed to represent--can be considered to be asserted or
not. In PGs, every edge that has attributes (edge properties) exists
in the graph. There is no way to associate attributes with a
non-existent edge. In contrast, in RDF, and also in RDF* (assuming SA
mode), we can make statements about a triple that is not part of the
graph itself.
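
The contrast drawn here can be sketched in Python. This is only a toy
illustration under simplifying assumptions: the classes PropertyGraph
and RdfStarGraphSA and their methods are hypothetical names invented
for this sketch, not the API of any real PG or RDF* implementation.

```python
class PropertyGraph:
    """Toy property graph: properties live on edges, so the edge must exist."""

    def __init__(self):
        # (source, label, target) -> dict of edge properties
        self.edges = {}

    def add_edge(self, source, label, target):
        self.edges.setdefault((source, label, target), {})

    def set_property(self, edge, key, value):
        # There is no place to store a property for an edge that is absent.
        if edge not in self.edges:
            raise KeyError(f"cannot annotate non-existent edge {edge}")
        self.edges[edge][key] = value


class RdfStarGraphSA:
    """Toy RDF* graph in SA mode: annotating a triple does not assert it."""

    def __init__(self):
        # Asserted triples only; a subject may itself be an embedded triple.
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def is_asserted(self, triple):
        return triple in self.triples


spouse = ("alice", "spouse", "bob")

# Property graph: annotating the missing edge is rejected.
pg = PropertyGraph()
try:
    pg.set_property(spouse, "since", "2001-02-03")
    pg_rejected = False
except KeyError:
    pg_rejected = True

# RDF* (SA mode): the annotation is stored, the embedded triple is not.
rdf = RdfStarGraphSA()
rdf.add(spouse, "since", "2001-02-03")
```

In the toy property graph the annotation is refused because the edge
is missing, while the toy SA-mode graph ends up holding the annotation
without the spouse triple itself being asserted.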

Olaf


If that's the case, we have an additional problem to represent this
in RDF land, because we can not infer anything from the *absence* of
a triple (in this case, the 'until' annotation). We work under the
open world assumption...


On Fri 30 Aug 2019, 20:56 Jeff Lerman, <jeff.lerman@invitae.com> wrote:
Ah, that’s more in-line with much of the other discussion so far in
the group.

I would prefer a model in which it’s not possible to assert a
property on a non-existent edge.  RDF/SPARQL provide us at least two
ways to handle edges that, as a consequence of their properties,
should not be considered to “exist” from the perspective of a query:

1. Queries can be written to filter out any edges with properties
   that indicate that they are not valid (e.g., if we are interested
   in considering/recognizing only edges about marriages in existence
   at a given time, we should exclude those that have started after
   that time, or ended before that time)

2. We can segregate those edges to a named graph which we exclude
   from SPARQL queries.
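
The first option can be sketched in plain Python rather than SPARQL.
This is a minimal illustration only: the edge dictionary, the
carol/dave edge, and the helper names valid_at and edges_valid_at are
assumptions made up for this sketch, and the dates reuse the
alice/bob example from earlier in the thread.

```python
from datetime import date


def valid_at(props, when):
    """Return True if an edge with these properties is valid on `when`.

    An edge is excluded if it started after `when` or ended before it;
    a missing 'since' or 'until' is treated as unbounded.
    """
    since = props.get("since")
    until = props.get("until")
    if since is not None and since > when:
        return False
    if until is not None and until < when:
        return False
    return True


def edges_valid_at(edges, when):
    """Return the sorted list of edges that pass the validity filter."""
    return sorted(edge for edge, props in edges.items() if valid_at(props, when))


# Hypothetical data: edge -> edge properties.
edges = {
    ("alice", "spouse", "bob"): {
        "since": date(2001, 2, 3),
        "until": date(2004, 5, 6),
    },
    ("carol", "spouse", "dave"): {"since": date(2010, 1, 1)},
}

married_in_2003 = edges_valid_at(edges, date(2003, 1, 1))
married_in_2012 = edges_valid_at(edges, date(2012, 1, 1))
```

With this data, only the alice/bob edge passes the filter for 2003 and
only the carol/dave edge passes it for 2012. The edges themselves stay
in the data; it is the query that decides whether they count.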

I have more to say about ways to handle/leverage named graphs, which
might make my 2nd suggestion more palatable, but not sure that this
is the right forum for that.  If the extension I have in mind would
solve this issue with RDF* though, maybe it is...

    Jeff Lerman AI Scientist Mobile: 510-495-4621
www.invitae.com


On Fri, Aug 30, 2019 at 11:23 AM Olaf Hartig <olaf.hartig@liu.se> wrote:
Jeff,

These are great examples for cases in which the properties
associated with edges in a graph may change over time without
affecting the existence of the edges themselves. However, I think
Pierre-Antoine's question was focusing on the opposite: does the
existence of an edge property always assume the existence of the
edge with which it is associated?

Olaf

-----Original Message-----
From: Jeff Lerman <jeff.lerman@invitae.com>
To: Joshua Shinavier <joshsh@uber.com>
Cc: Pierre-Antoine Champin <pierre-antoine.champin@univ-lyon1.fr>,
    public-rdf-star@w3.org
Sent: Fri, 30 Aug 2019 18:27
Subject: Re: do Property Graphs always assert annotated arcs?

Hi all,

Most of my experience with graphs is with a frame-based approach that
most closely resembles a triple-store - not explicitly RDF but close
enough. I’ve been exploring both RDF/triple-stores and PGs as
candidates to support a new project.  I’ve been following the RDF*
discussion with interest.

For what it's worth, I wouldn’t assume that edge-metadata
(edge-properties in PG world) must be asserted at the time an edge is
asserted. There are a variety of scenarios in which one might wish to
update that metadata, and I’m pretty sure there’s nothing technically
preventing such updates in existing PG implementations.  For example,
one might:

  - update metadata: alter the value of an already-asserted
    property:value pair (e.g., a newer model indicates that the
    weight of an edge should be adjusted from 0.2 to 0.8)
  - add or subtract metadata: assert (or remove) a value for a
    property that was previously un-populated (or populated), to
    reflect new knowledge we have about a relationship.  The change
    could be incremental and need not affect other properties, so
    deleting-and-reasserting the edge with all of the other
    pre-existing (and unaffected) properties would be inappropriate.

—Jeff




Jeff Lerman

AI Scientist

Mobile: 510-495-4621

www.invitae.com



On Thu, Aug 29, 2019 at 10:03 AM Joshua Shinavier <joshsh@uber.com> wrote:

Hi Pierre,

Just a quick response from a representative "property graph" user. I
have not been active on this list so far, and actually mistook your
email for a gremlin-users post. So let me just say what I would have
said.

First of all, property graph frameworks are usually not prescriptive
about semantics, so your property-qualified edge "means what you want
it to mean". At the same time, it is generally not the case that an
edge qualified with a property like "since" would be considered to be
asserted, independently of the property. A canonical example is the
TinkerPop toy graph
<http://tinkerpop.apache.org/docs/current/reference/#graph-computing>,
which has a "weight" property on each edge. The edge
created{peter, lop} has a weight of 0.2, which basically means that
the statement "Peter is a creator of LOP" is a non-assertion. I read
your :since and :until example exactly as you do: the statement
spouse{alice, bob} is asserted conditionally on a logical point in
time.

Josh


On Thu, Aug 29, 2019 at 8:36 AM Pierre-Antoine Champin <pierre-antoine.champin@univ-lyon1.fr> wrote:

Hi all,

here is a question for those on the list who have discussed more than
I have with Property Graph users.

There seems to be a consensus here that in PG, arcs with metadata are
asserted at the same time as they are annotated. This is reflected in
the PG interpretation of RDF*, where:

   <<:alice :spouse :bob>> :since "2001-02-03"^^xsd:date .

asserts exactly two triples.
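
As a rough sketch of that PG interpretation, the Python below expands
annotations into the set of asserted triples. The function name
expand_pg_mode and the tuple encoding of triples are hypothetical
choices for this illustration, not part of any RDF* specification or
library.

```python
def expand_pg_mode(annotations):
    """Expand RDF* annotations under the PG interpretation.

    `annotations` is an iterable of (embedded_triple, predicate, object).
    Each annotation asserts both the embedded triple and the annotation
    triple whose subject is that embedded triple.
    """
    triples = set()
    for embedded, predicate, obj in annotations:
        triples.add(embedded)                    # the annotated arc itself
        triples.add((embedded, predicate, obj))  # the metadata about it
    return triples


spouse = (":alice", ":spouse", ":bob")

one_annotation = expand_pg_mode([(spouse, ":since", "2001-02-03")])
```

Because the result is a set, adding a second annotation on the same
arc (such as an :until date) adds one more triple and leaves the arc
itself asserted once.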

But as I understand, PG people are also likely to express things like:

   <<:alice :spouse :bob>> :since "2001-02-03"^^xsd:date ;
       :until "2004-05-06"^^xsd:date .

if Alice and Bob eventually got divorced.
In that situation, the arc <<:alice :spouse :bob>> should *no longer*
be considered asserted in the graph.

Question: is this scenario a plausible one in a PG context?


Received on Monday, 23 September 2019 08:02:46 UTC