Re: [External] : Re: A single reifier can reify more than one triple term from Souripriya Das on 2024-03-27 (public-rdf-star-wg@w3.org from March 2024)

From: Souripriya Das <souripriya.das@oracle.com>
Date: Wed, 27 Mar 2024 17:52:52 +0000
To: Gregory Williams <greg@evilfunhouse.com>, Franconi Enrico <franconi@inf.unibz.it>
CC: "Lassila, Ora" <ora@amazon.com>, RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <CY5PR10MB60711490904E3942E4B55FD2FA342@CY5PR10MB6071.namprd10.prod.outlook.com>
The relational model is well suited to n-ary relationships (n foreign keys plus some attribute columns). Since a graph model is necessarily binary at the ground level, translating an n-ary relationship from a relational model to a graph model could use one of two approaches:
    1) create a vertex for the relationship and hang n edges connecting that vertex to the vertices corresponding to the foreign-key targets, respectively (plus any additional attributes for the relationship vertex as LPG-like vertex-properties), or
    2) pick one of the n-permute-2 choices as the pair to create an edge with, and then hang the remaining (n-2) vertices from that edge (plus any additional LPG-like edge-properties).

If using the second approach, there are many (specifically, n-permute-2, i.e. n(n-1)) choices as to how one wants to model it. Using the first approach, on the other hand, can sometimes be overkill, and if it is used a lot, one may question why a graph model was chosen over a relational model.
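
To make the counting concrete, here is a minimal Python sketch of the two approaches. The row id, role names, and targets are hypothetical and chosen only for illustration; the tuples are not meant as RDF or LPG syntax.

```python
from itertools import permutations

# Hypothetical n-ary row: a relationship with n = 3 foreign keys.
# The row id, role names, and targets are made up for illustration.
ROW_ID = "r1"
FOREIGN_KEYS = {"person": "enrico", "city": "rome", "registry": "office7"}


def relationship_vertex(row_id, fks):
    """Approach 1: one vertex per relationship, with n outgoing edges."""
    return [(row_id, role, target) for role, target in fks.items()]


def edge_choices(fks):
    """Approach 2: enumerate every ordered pair of roles that could form the edge.

    The remaining n-2 targets are hung from the chosen edge.
    """
    choices = []
    for src, dst in permutations(fks, 2):
        rest = [fks[r] for r in fks if r not in (src, dst)]
        choices.append({"edge": (fks[src], fks[dst]), "hung_from_edge": rest})
    return choices


hub = relationship_vertex(ROW_ID, FOREIGN_KEYS)
choices = edge_choices(FOREIGN_KEYS)
print(len(hub), "edges from the relationship vertex")
print(len(choices), "possible choices for the main edge")
```

With n = 3 the first approach always yields the same three edges, while the second leaves 3 * 2 = 6 equally defensible ways to pick the main edge.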

I think that, instead of trying to solve the n-ary modeling issue using RDF 1.2, we need to stay focused on supporting "edges" (or "named occurrence of triples" or "atomic reification") in its full generality.

Specifically, we need to ensure that all of the following can be treated as ordinary edges:
    a) vertex-to-vertex connections (i.e., conventional edges in LPG),
    b) edge-to-edge connections (not in LPG),
    c) edge-to-vertex and vertex-to-edge connections (not in LPG),
    d) vertex-properties (basic in LPG but no further annotations allowed),
    e) edge-properties (basic in LPG but no further annotations allowed)
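
As a rough illustration of cases a) through e), the following Python sketch treats every connection as a named edge whose endpoints may be vertices, other edges, or literal values. The `Edge` and `Literal` classes and all names in the sample data are hypothetical; this is not an RDF 1.2 or LPG API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A plain value, as opposed to the name of a vertex or an edge."""
    value: object


@dataclass(frozen=True)
class Edge:
    name: str       # every edge is named, so other edges can refer to it
    source: str     # name of a vertex or of another edge
    label: str
    target: object  # name of a vertex or edge, or a Literal


def kind(term, edge_names):
    if isinstance(term, Literal):
        return "literal"
    return "edge" if term in edge_names else "vertex"


def classify(edge, edge_names):
    """Map an edge onto the categories a) to e) from the list above."""
    src = kind(edge.source, edge_names)
    tgt = kind(edge.target, edge_names)
    if tgt == "literal":
        return f"{src}-property"
    return f"{src}-to-{tgt}"


edges = [
    Edge("e1", "alice", "knows", "bob"),
    Edge("e2", "alice", "age", Literal(30)),
    Edge("e3", "e1", "since", Literal(2020)),
    Edge("e4", "e1", "assertedBy", "carol"),
    Edge("e5", "carol", "doubts", "e1"),
    Edge("e6", "e4", "contradicts", "e5"),
    Edge("e7", "e3", "source", Literal("survey")),  # annotation on an edge-property
]
edge_names = {e.name for e in edges}
kinds = {e.name: classify(e, edge_names) for e in edges}
```

Because every edge has a name, `e7` can annotate the edge-property `e3` with no special casing, which is the "further annotations" capability that d) and e) note is missing in LPG.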

Instead of discussing n-ary modeling using a graph, we need to debate whether a single "reifier" should be allowed to reify (using rdf:reifies, in a hub-and-spokes manner) multiple distinct triple-terms.
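
To pin down what is being debated, here is a small Python sketch of a single reifier used hub-and-spokes style. Triples are modeled as plain tuples, and the identifiers are borrowed from the example quoted further down in this thread; the helper functions are hypothetical and only illustrate the shape of the data.

```python
RDF_REIFIES = "rdf:reifies"

# Two distinct triple terms, modeled as plain tuples.
t1 = (":enrico", ":married-in", ":rome")
t2 = (":enrico", ":married-on", 1962)

# One reifier (:b1) with two rdf:reifies spokes.
graph = {
    (":b1", RDF_REIFIES, t1),
    (":b1", RDF_REIFIES, t2),
}


def triple_terms_of(graph, reifier):
    """All triple terms that the given reifier reifies."""
    return {o for s, p, o in graph if s == reifier and p == RDF_REIFIES}


def reifiers_with_multiple_terms(graph):
    """Reifiers that reify more than one distinct triple term."""
    reifiers = {s for s, p, _ in graph if p == RDF_REIFIES}
    return {r for r in reifiers if len(triple_terms_of(graph, r)) > 1}


many = reifiers_with_multiple_terms(graph)
```

If the group decided to disallow this, a check like `reifiers_with_multiple_terms` returning a non-empty set would flag the graph; if it is allowed, the same graph is simply valid.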

Thanks,
Souri.

________________________________
From: Gregory Williams <greg@evilfunhouse.com>
Sent: Wednesday, March 27, 2024 10:17 AM
To: Franconi Enrico <franconi@inf.unibz.it>
Cc: Lassila, Ora <ora@amazon.com>; RDF-star Working Group <public-rdf-star-wg@w3.org>
Subject: [External] : Re: A single reifier can reify more than one triple term



> On Mar 27, 2024, at 12:41 AM, Franconi Enrico <franconi@inf.unibz.it> wrote:
>
>>> << :b1 | :enrico :married-in :rome >> :date 1962 .
>>> << :b1 | :enrico :married-on 1962 >> :location :rome .
>>> << :b1 | :enrico :married-in :rome >> :location :rome .
>>> << :b1 | :enrico :married-on 1962 >> :date 1962 .
>>
>> It helps with the issue of naming, but it doesn’t address the asymmetry. Now Enrico has married-in and married-on properties, and the reification has date and location properties. Why is this a good model of properties that all come from the same relation where they are all properties of birth certificates?
>
> They are not: married-in and married-on have domain person, while date and location have domain birth certificate.

That is exactly the asymmetry I’m referring to. The *resulting* RDF model you’ve produced has sensible predicates, but the original data had properties that were all, equally, columns of the same table. Modeling them in this asymmetrical way stems from a modeling *decision* – more on that below.

> They NEED to be distinct properties, and depending on what you are talking about (people or birth certificates) you use the former or the latter.
>
>
>> And I still think this is a fundamental problem with this example: “two departments decide to expose this data as LOD, but in different ways.” That would be one thing if they were each exposing LOD using local identifiers, but they’ve both used the universal identifiers (b1, b2, …) for the reification in incompatible ways.
>
> They are not incompatible.
> You are assuming that organisations are rational entities that structure their data in a syntactically uniform and consistent way all over the world. The fact that this assumption is not true is witnessed by the mess that enterprises have in doing data integration, which is the main raison d’être of semantic web technologies: deal with syntactically different ways of representing semantically equivalent information.

As somebody who has done exactly this sort of modeling professionally, I can assure you that I am not making that assumption. The fact that this is a complex modeling choice is all the more reason why I don’t think two different departments can make different modeling decisions while using the same universal identifiers.

I think there are many use cases for RDF-star that are not primarily focused on reification, and I wonder if the tentative agreement over the `rdf:reifies` naming didn't tacitly shift the discussion a bit in this regard. We seem to be talking a lot more about n-ary relations now, rather than “statements about statements.” I'm interested mostly in triple-level provenance/annotation and improved interoperability with LPG data, while you seem primarily focused on reification of n-ary relationships, and I think this discussion is revealing that these varying desires can be at odds with each other.

Thanks,
Greg
Received on Wednesday, 27 March 2024 17:53:10 UTC