Re: [External] : labelled property graphs vs -star extension of RDFn vs -star extension of named graphs

Noticed a mistake in the automatic name generation logic. I have corrected the slides (and attached the corrected PDF here) to now reflect that
1) automatic generation of triple name uses only s, p, o -- it does not use the graph name, and
2) the generated name is unique only within the graph, not the dataset.

In the following example, when use of SPARQL FROM clauses leads to combining the three named graphs :g1, :g2, and :g3 into a default graph, the auto-named RDF triple will appear only once in the resulting default graph used for the query. The same thing would happen during graph merging as well.

Three named graphs in an RDF dataset:
==============================
graph :g1 {
  :Brazil :won :SWC | auto:Bwin .        # the generated auto-name will be something like auto:foo(:Brazil,:won,:SWC)
  auto:Bwin :year 1994 .
}

graph :g2 {
  :Brazil :won :SWC | auto:Bw1 .          # the generated auto-name will be the same as in :g1 (could have used auto:Bwin in this graph too)
  auto:Bw1 :beatInFinal :Italy .
}

graph :g3 {
  auto:Bwin :score :scoreSheet .
  :scoreSheet :winnerScored 3 .
  :scoreSheet :loserScored 2 .
}

If a SPARQL query against this dataset uses FROM :g1 FROM :g2 FROM :g3, then the resulting default graph used for the query would contain:
=================================
  :Brazil :won :SWC | auto:BrazilWin .   # the generated auto-name will be the same as in :g1 (appears once, no duplicates)
  auto:BrazilWin :year 1994 .
  auto:BrazilWin :beatInFinal :Italy .
  auto:BrazilWin :score :scoreSheet .
  :scoreSheet :winnerScored 3 .
  :scoreSheet :loserScored 2 .

In his last email, Ora was probably pointing out that this mistake could produce duplicate s-p-o triples upon graph merging.
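
The merge behaviour described in this message can be sketched in Python. This is only an illustration under stated assumptions, not RDFn's actual implementation: the helper names (auto_name, load, build_dataset, merged_won_count) are hypothetical, a graph is modelled as a set of <s, p, o, n> tuples, and the string format of the generated name is a placeholder. It contrasts naming from s, p, o only with naming that also includes the graph name.

```python
def auto_name(s, p, o, g=None):
    """Hypothetical stand-in for auto:foo(...); g is included only if given."""
    parts = [s, p, o] + ([g] if g is not None else [])
    return "auto:foo(" + ",".join(parts) + ")"

def load(graph, s, p, o, g=None):
    """Add the auto-named tuple <s, p, o, n> to a graph (a set) and return n."""
    n = auto_name(s, p, o, g)
    graph.add((s, p, o, n))
    return n

def build_dataset(use_graph_in_name):
    """Build :g1, :g2, :g3 from the example, with either naming scheme."""
    graphs = {":g1": set(), ":g2": set(), ":g3": set()}

    def add(g, s, p, o):
        return load(graphs[g], s, p, o, g if use_graph_in_name else None)

    bwin = add(":g1", ":Brazil", ":won", ":SWC")
    add(":g1", bwin, ":year", "1994")
    bw1 = add(":g2", ":Brazil", ":won", ":SWC")
    add(":g2", bw1, ":beatInFinal", ":Italy")
    # :g3 refers to the triple name; the name from :g1 is reused here for illustration.
    add(":g3", bwin, ":score", ":scoreSheet")
    add(":g3", ":scoreSheet", ":winnerScored", "3")
    add(":g3", ":scoreSheet", ":loserScored", "2")
    return graphs

def merged_won_count(use_graph_in_name):
    """Count <:Brazil, :won, :SWC, n> tuples after merging all three graphs."""
    graphs = build_dataset(use_graph_in_name)
    default_graph = set().union(*graphs.values())
    return sum(1 for t in default_graph if t[:3] == (":Brazil", ":won", ":SWC"))

print(merged_won_count(False))  # 1: name built from s, p, o only
print(merged_won_count(True))   # 2: name also depends on the graph
```

With names built from s, p, o only, the set union collapses the two copies into one tuple; with the graph name mixed in, the same s-p-o survives under two different names.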

Thanks,
Souri.

________________________________
From: Souripriya Das <souripriya.das@oracle.com>
Sent: Thursday, December 14, 2023 7:58 AM
To: Lassila, Ora <ora@amazon.com>; Peter F. Patel-Schneider <pfpschneider@gmail.com>; RDF-star Working Group <public-rdf-star-wg@w3.org>
Subject: Re: [External] : labelled property graphs vs -star extension of RDFn vs -star extension of named graphs

Good morning, Ora. Thanks for your question.

The answer to your question is: No. Loading the same RDF 1.1 triple multiple times, in a single batch or over multiple batches, results in just a single triple being stored, not multiple triples.

If you look at my LinkedIn post [1] (from about 4am EST this morning 🙂), which outlines the 14-DEC-2023 version of RDFn, slide #3, "Cheat Sheet for Data Loading", shows how data presented to RDFn for loading gets stored in the triplestore.

So, when an RDF triple, :s :p :o ., is presented for loading, it is stored as the tuple <:s, :p, :o, rdft:foo(:s,:p,:o,g)>, where g is the target (default or named) graph. This is true for all auto-named cases. (For custom-named too, this will be the case, unless distinct custom-names are used.)

Example: Slide #6, "RDFn Data Loading: In Batches, Over Time", has a more interesting example, although it uses tokens:
========

Batch 1 has:
  :Brazil :won :SWC | auto:Bwin .
  ...

Batch 2 has:
  :Brazil :won :SWC | auto:Bw1 .                   # I intentionally used a different alias than in Batch 1 to emphasize that this will map to the pre-existing triple
   ...

After the loading of the two batches, over time, the store will have a single auto-named tuple, <:Brazil, :won, :SWC, auto:foo(:Brazil, :won, :SWC)>. This is shown in the accompanying diagram too.
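
As a rough sketch of the batch behaviour just described (an illustration only, not the actual loader): the Store class and load_batch method below are hypothetical, the name format simply mirrors the auto:foo(...) placeholder used above, and only the one statement shown for each batch is loaded, since the rest of each batch is elided.

```python
def auto_name(s, p, o):
    """Hypothetical stand-in for the generated name auto:foo(s, p, o)."""
    return f"auto:foo({s},{p},{o})"

class Store:
    """Toy triplestore holding <s, p, o, n> tuples in a set."""

    def __init__(self):
        self.tuples = set()

    def load_batch(self, statements):
        """Load (s, p, o, alias) statements; return this batch's alias map."""
        aliases = {}
        for s, p, o, alias in statements:
            n = auto_name(s, p, o)
            if alias is not None:
                aliases[alias] = n       # the alias is local to this batch
            self.tuples.add((s, p, o, n))  # adding an existing tuple is a no-op
        return aliases

store = Store()
batch1 = store.load_batch([(":Brazil", ":won", ":SWC", "auto:Bwin")])
batch2 = store.load_batch([(":Brazil", ":won", ":SWC", "auto:Bw1")])

print(len(store.tuples))                          # 1
print(batch1["auto:Bwin"] == batch2["auto:Bw1"])  # True
```

The two aliases differ, but both resolve to the same generated name, so the second batch maps onto the pre-existing tuple.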

[1] https://www.linkedin.com/posts/souripriya-souri-das-ph-d-48801911_rdfn-rdf-extension-for-unasserted-and-named-activity-7140987047795744768-iqi7?utm_source=share&utm_medium=member_desktop



________________________________
From: Lassila, Ora <ora@amazon.com>
Sent: Thursday, December 14, 2023 7:17 AM
To: Souripriya Das <souripriya.das@oracle.com>; Peter F. Patel-Schneider <pfpschneider@gmail.com>; RDF-star Working Group <public-rdf-star-wg@w3.org>
Subject: Re: [External] : labelled property graphs vs -star extension of RDFn vs -star extension of named graphs


Souri,



If I understand this correctly, loading twice a document with a single triple (with no explicit name, so let’s say an RDF 1.1 document) results in two triples in the store? I am just trying to understand the ramifications to graph merging (something I think really sets RDF apart from LPGs).



Ora





From: Souripriya Das <souripriya.das@oracle.com>
Date: Tuesday, December 12, 2023 at 7:09 AM
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, RDF-star Working Group <public-rdf-star-wg@w3.org>
Subject: RE: [EXTERNAL] [External] : labelled property graphs vs -star extension of RDFn vs -star extension of named graphs
Resent-From: <public-rdf-star-wg@w3.org>
Resent-Date: Tuesday, December 12, 2023 at 7:08 AM




Hi Peter,



Since Ted recently commented on my LinkedIn post dated 21-SEP-2021 [1], asking whether I had any updates on the RDFn draft spec, I worked through the weekend to create a new set of slides based on my latest thoughts and posted them on LinkedIn [2] yesterday (dated 11-DEC-2023). Slides 3-5 cover the concepts, 6-7 show examples, and 8-11 compare with RDF-star.



Although the syntax was tweaked a bit in the new version, the basic idea did not change: RDF's <s, p, o> is extended to <s, p, o, n> where n is an IRI representing the name of the triple (or, tName). This easily accommodates the LPG case that you mentioned and goes beyond LPG.



LPG:

:liz :spouse :dick {| :start 1964; :end 1974 |} .
:liz :spouse :dick {| :start 1975; :end 1976 |} .



RDFn (one of the two tNames could be an auto-generated name, but I used custom names for both for simplicity):

:liz :spouse :dick | :term1 .

:liz :spouse :dick | :term2 .

:term1 :start 1964; :end 1974 .

:term2 :start 1975; :end 1976 .



Of course, LPG cannot easily do the following that we can in RDFn:

:term1 :happierThan :term2 .
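
A minimal Python sketch of the <s, p, o, n> idea in this example, assuming a store modelled as a set of 4-tuples. The add helper and the auto:foo(...) name format are hypothetical and only for illustration. Because the two :spouse triples carry distinct custom names, both are kept, and the names can themselves be subjects or objects of further triples.

```python
def auto_name(s, p, o):
    """Hypothetical stand-in for an auto-generated triple name."""
    return f"auto:foo({s},{p},{o})"

store = set()

def add(s, p, o, name=None):
    """Store <s, p, o, n>, using a custom name if given, else an auto-name."""
    n = name if name is not None else auto_name(s, p, o)
    store.add((s, p, o, n))
    return n

# Same s, p, o twice, kept apart by distinct custom names.
add(":liz", ":spouse", ":dick", ":term1")
add(":liz", ":spouse", ":dick", ":term2")

# Statements about each named triple.
for term, start, end in ((":term1", 1964, 1974), (":term2", 1975, 1976)):
    add(term, ":start", start)
    add(term, ":end", end)

# A statement relating the two triple names to each other.
add(":term1", ":happierThan", ":term2")

marriages = sorted(n for s, p, o, n in store
                   if (s, p, o) == (":liz", ":spouse", ":dick"))
print(marriages)  # [':term1', ':term2']
```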



Thanks,

Souri.



[1] https://www.linkedin.com/posts/souripriya-souri-das-ph-d-48801911_rdfn-name-every-triple-or-quad-manually-activity-6846069087068540928-IKdd?utm_source=share&utm_medium=member_desktop



[2] https://www.linkedin.com/posts/souripriya-souri-das-ph-d-48801911_rdfn-rdf-extension-for-unasserted-and-named-activity-7140162410769702912-kuqg?utm_source=share&utm_medium=member_desktop






________________________________

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Sent: Friday, December 8, 2023 9:18 AM
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Subject: [External] : labelled property graphs vs -star extension of RDFn vs -star extension of named graphs



At the teleconference yesterday I mentioned that there could be user-visible
differences between different views of how to proceed, even when there is some
consensus that different views are essentially the same.

Here is one example of a user-visible divergence.  Consider the following
input, written in the community group syntax.

:liz :spouse :dick {| :start 1964; :end 1974 |} .
:liz :spouse :dick {| :start 1975; :end 1976 |} .

In the community group version of RDF-star this results in one asserted triple
with subject :liz that is the subject of four triples.  In SPARQL-star, the BGP

:liz :spouse :dick {| :start 1964; :end 1976 |} .

would match against a graph constructed from this input.

In labelled property graphs this would appear to result in two asserted
triples with subject :liz, each with two property-value pairs.  The above BGP
would not match.
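
The divergence can be sketched in Python under simplifying assumptions: the data structures and function names below are hypothetical and do not come from any RDF-star or LPG implementation. In the first view all annotations attach to the one triple; in the second each input line becomes its own edge with its own property map.

```python
from collections import defaultdict

TRIPLE = (":liz", ":spouse", ":dick")

# The two input lines: the same triple with different annotation blocks.
inputs = [
    (TRIPLE, {":start": 1964, ":end": 1974}),
    (TRIPLE, {":start": 1975, ":end": 1976}),
]

# Community group RDF-star view: one triple, all annotations pooled on it.
annotations = defaultdict(set)
for triple, props in inputs:
    for key, value in props.items():
        annotations[triple].add((key, value))

def rdf_star_matches(triple, pattern):
    """True if every (property, value) in the pattern annotates the triple."""
    return set(pattern.items()) <= annotations.get(triple, set())

# Labelled property graph view: one edge per input line, each with its own map.
edges = [(triple, dict(props)) for triple, props in inputs]

def lpg_matches(triple, pattern):
    """True if a single edge carries every (property, value) in the pattern."""
    return any(
        t == triple and all(props.get(k) == v for k, v in pattern.items())
        for t, props in edges
    )

pattern = {":start": 1964, ":end": 1976}
print(len(annotations[TRIPLE]))           # 4 annotation triples on one subject
print(rdf_star_matches(TRIPLE, pattern))  # True
print(lpg_matches(TRIPLE, pattern))       # False
```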

So there is a decided visible difference between the community group version
of RDF-star and labelled property graphs.

If I am correct in reading the (sparse) information available about RDFn, a
-star extension of RDFn would conform to the community group reading.  So
there would be noticeable differences between an extended RDFn and labelled
property graphs.

I am not aware of any proposal for using named graphs that says what the above
would result in there, so it is unclear which side a named graphs version of
-star would fit into.

peter

Received on Saturday, 16 December 2023 12:30:32 UTC