Re: [External] : Re: one RDF1.2 "stated" 4-tuple per LPG edge from Souripriya Das on 2024-08-09 (public-rdf-star-wg@w3.org from August 2024)

From: Souripriya Das <souripriya.das@oracle.com>
Date: Fri, 9 Aug 2024 23:02:46 +0000
To: RDF-star WG <public-rdf-star-wg@w3.org>
Message-ID: <CY5PR10MB6071AA21E61376EA937DB58DFAB92@CY5PR10MB6071.namprd10.prod.outlook.com>
Here I argue that supporting rdf:states is important: it allows a one-to-one mapping, which simplifies many of the data conversion situations we will encounter in practice.

Consider the following three types of tuples in RDF1.2 data:
  1) ("asserted") s-p-o triples => :s :p :o .
  2) "stated" id-s-p-o tuples => :id rdf:states <<( :s :p :o )>> .
  3) "reified" id-s-p-o tuples => :id rdf:reifies <<( :s :p :o )>> .

Expected relative frequency of such tuple types, IMO, based on the source for the RDF1.2 data:
A) Relational data: Relationships with properties => "stated" id-s-p-o tuples, plus annotations.
B) LPG data: (Asserted) edges, with properties => "stated" id-s-p-o tuples, plus annotations.
C) Data from scratch: A mix of three types of tuples.
D) RDF1.1 data: Asserted triples, with no annotations => s-p-o triples (may need "reified" id-s-p-o tuples later, to add annotations).

Given this expectation, I'd argue that providing support for "stated" id-s-p-o tuples is important because it allows use of a single stated id-s-p-o tuple to represent individual relationships in A and B, and also (for many cases) in C, as the following example shows:
    # [one-to-one] The following two single stated id-s-p-o tuples are sufficient for representing two LPG parallel edges ...
    :id1 rdf:states <<( :s :p :o )>> .
    :id2 rdf:states <<( :s :p :o )>> .
    # [NOT one-to-one] ... instead of requiring the following three tuples:
    :s :p :o .
    :id1 rdf:reifies <<( :s :p :o )>> .
    :id2 rdf:reifies <<( :s :p :o )>> .
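
To make the tuple counts in the example above concrete, here is a minimal Python sketch that emits both encodings for a list of LPG edges. The `Edge` type and both function names are hypothetical, introduced only for illustration, and `rdf:states` is the property proposed in this message rather than an established vocabulary term. The output is plain text in the notation used above; no RDF library is involved.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    """Hypothetical LPG edge: unique edge-id, source, edge-type (label), destination."""
    edge_id: str
    source: str
    label: str
    target: str


def to_stated(edges: List[Edge]) -> List[str]:
    """One 'stated' id-s-p-o tuple per edge (one-to-one)."""
    return [
        f":{e.edge_id} rdf:states <<( :{e.source} :{e.label} :{e.target} )>> ."
        for e in edges
    ]


def to_asserted_plus_reified(edges: List[Edge]) -> List[str]:
    """Each distinct s-p-o triple asserted once, plus one 'reified' tuple per edge."""
    lines: List[str] = []
    seen = set()
    for e in edges:
        triple = (e.source, e.label, e.target)
        if triple not in seen:  # parallel edges share a single asserted triple
            seen.add(triple)
            lines.append(f":{e.source} :{e.label} :{e.target} .")
    for e in edges:
        lines.append(
            f":{e.edge_id} rdf:reifies <<( :{e.source} :{e.label} :{e.target} )>> ."
        )
    return lines


# Two parallel edges between the same pair of nodes.
parallel = [Edge("id1", "s", "p", "o"), Edge("id2", "s", "p", "o")]
print("\n".join(to_stated(parallel)))
print("\n".join(to_asserted_plus_reified(parallel)))
```

For the two parallel edges, the first function produces two lines and the second produces three, matching the two listings above.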

Thanks,
Souri.

________________________________
From: Souripriya Das <souripriya.das@oracle.com>
Sent: Thursday, August 8, 2024 7:15 AM
To: ddooss@wp.pl <ddooss@wp.pl>; RDF-star WG <public-rdf-star-wg@w3.org>
Subject: Re: [External] : Re: one RDF1.2 "stated" 4-tuple per LPG edge

Hi Dominik,

>I agree that the approach "one PG edge -> one RDF triple" appears to be more natural and intuitive. It is easier to grasp and implement compared to mapping one LPG edge to multiple RDF triples.

> However, I am not in favor of the concept of introducing IDs. This assumes that the graph database has explicitly implemented identifiers, which is not true for all implementations. To the best of my knowledge, no PG standard enforces the use of such identifiers.

I agree that edge-IDs are not required in PG because those are not used as endpoints of other edges. There are other ways of arriving at an edge to add annotations to it or retrieve annotations from it. In RDF1.2, however, triples can be used as endpoints (i.e., subject or object of other triples) and hence we need IDs for them. That is why I used edge-ID for the PG edge in my example to make it easier to illustrate its conversion to RDF1.2.
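
As a rough illustration of the point above, a PG store that exposes no edge identifiers could still have IDs minted at conversion time. The Python sketch below is only one possible approach: the `mint_edge_ids` helper and the sequential `edge1`, `edge2`, ... naming scheme are assumptions made for this example, not something any PG standard or the RDF1.2 drafts prescribe.

```python
from itertools import count
from typing import Iterable, List, Tuple

PlainEdge = Tuple[str, str, str]         # (source, label, target), no edge-id
IdEdge = Tuple[str, str, str, str]       # (edge_id, source, label, target)


def mint_edge_ids(edges: Iterable[PlainEdge], prefix: str = "edge") -> List[IdEdge]:
    """Attach a fresh, sequential identifier to every edge, in input order.

    Parallel edges (same source, label and target) receive distinct IDs,
    so each one can later be the subject of its own annotations.
    """
    counter = count(1)
    return [(f"{prefix}{next(counter)}", s, p, o) for (s, p, o) in edges]


# Two parallel edges without IDs in the source graph.
for edge in mint_edge_ids([("s", "p", "o"), ("s", "p", "o")]):
    print(edge)
```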

Thanks,
Souri.
________________________________
From: ddooss@wp.pl <ddooss@wp.pl>
Sent: Wednesday, August 7, 2024 4:36 PM
To: Souripriya Das <souripriya.das@oracle.com>; RDF-star WG <public-rdf-star-wg@w3.org>
Subject: [External] : Re: one RDF1.2 "stated" 4-tuple per LPG edge

Hi Souri,

I agree that the approach "one PG edge -> one RDF triple" appears to be more natural and intuitive. It is easier to grasp and implement compared to mapping one LPG edge to multiple RDF triples.

However, I am not in favor of the concept of introducing IDs. This assumes that the graph database has explicitly implemented identifiers, which is not true for all implementations. To the best of my knowledge, no PG standard enforces the use of such identifiers.

Best,
Dominik


On 7 August 2024 at 19:24, Souripriya Das <souripriya.das@oracle.com> wrote:

Sorry, I was using a custom notation just to communicate the idea.

So, by (s) -[id:p]-> (o), I meant an edge with
- s and o as the source and destination, respectively
- id as its (unique) edge-id and
- p as its edge-type (or label).
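
The reading of the custom notation given above can be sketched as a small Python parser. This is only an illustration of the ad hoc notation used in this thread, not of GQL or Cypher syntax; the `parse_edge` function, its regular expression, and the restriction to simple word-like names are assumptions made for the example.

```python
import re
from typing import Dict

# Matches the thread's ad hoc notation: (s) -[id:p]-> (o)
_EDGE_PATTERN = re.compile(
    r"\(\s*(?P<source>\w+)\s*\)\s*"                         # (s)
    r"-\[\s*(?P<edge_id>\w+)\s*:\s*(?P<label>\w+)\s*\]->"   # -[id:p]->
    r"\s*\(\s*(?P<target>\w+)\s*\)"                         # (o)
)


def parse_edge(text: str) -> Dict[str, str]:
    """Split '(s) -[id:p]-> (o)' into source, edge_id, label and target."""
    match = _EDGE_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in (s) -[id:p]-> (o) form: {text!r}")
    return match.groupdict()


print(parse_edge("(s) -[id:p]-> (o)"))
```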

________________________________

From: ddooss@wp.pl <ddooss@wp.pl>
Sent: Wednesday, August 7, 2024 12:22 PM
To: Souripriya Das <souripriya.das@oracle.com>; RDF-star WG <public-rdf-star-wg@w3.org>
Subject: [External] : Re: one RDF1.2 "stated" 4-tuple per LPG edge

Hi Souri,


Souripriya Das <souripriya.das@oracle.com>:

LPG edge:
        (s) -[id:p]-> (o)

A request for clarification: what does the notation `id:p` mean in PG? Did you mean (s)-[p:p { "id": p }]->(o) (i.e. in the schema (s)-[p :p { "id" STRING }]->(o)) in GQL and related languages, e.g. Cypher? Or does your notation mean something else?

Best,
Dominik
Received on Friday, 9 August 2024 23:03:11 UTC