Re: A single reifier can reify more than one triple term from Gregory Williams on 2024-03-27 (public-rdf-star-wg@w3.org from March 2024)

From: Gregory Williams <greg@evilfunhouse.com>
Date: Wed, 27 Mar 2024 07:17:06 -0700
To: Franconi Enrico <franconi@inf.unibz.it>
Cc: "Lassila, Ora" <ora@amazon.com>, RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-Id: <1A48E46F-3AF9-4C61-B534-D5BBABF5B019@evilfunhouse.com>

> On Mar 27, 2024, at 12:41 AM, Franconi Enrico <franconi@inf.unibz.it> wrote:
> 
>>> << :b1 | :enrico :married-in :rome >> :date 1962 .
>>> << :b1 | :enrico :married-on 1962 >> :location :rome .
>>> << :b1 | :enrico :married-in :rome >> :location :rome .
>>> << :b1 | :enrico :married-on 1962 >> :date 1962 .
>> 
>> It helps with the issue of naming, but it doesn’t address the asymmetry. Now Enrico has married-in and married-on properties, and the reification has date and location properties. Why is this a good model of properties that all come from the same relation where they are all properties of birth certificates?
> 
> They are not: married-in and married-on have domain person, while date and location have domain birth certificate.

That is exactly the asymmetry I’m referring to. The *resulting* RDF model you’ve produced has sensible predicates, but the original data had properties that were all, equally, columns of the same table. Modeling them in this asymmetrical way stems from a modeling *decision* – more on that below.
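
One way to see the asymmetry is a small Python sketch. Everything in it is assumed for illustration: the column names, the `symmetric` and `asymmetric` helpers, and the tuple encoding of triples are hypothetical and not part of the example data or of any RDF library. It only contrasts treating all columns equally with promoting one column to the reified triple.

```python
# Hypothetical row of the source table; the column names are assumptions.
row = {"certificate": "b1", "person": "enrico", "place": "rome", "year": 1962}

# Assumed mapping from a promoted column to the predicate used in the triple.
MAIN_PREDICATE = {"place": "married-in", "year": "married-on"}


def symmetric(row):
    """Treat every column equally: each one is a property of the certificate."""
    cert = row["certificate"]
    return {(cert, col, val) for col, val in row.items() if col != "certificate"}


def asymmetric(row, main):
    """Promote one column to the reified triple; the rest annotate the reifier."""
    cert = row["certificate"]
    triple = (row["person"], MAIN_PREDICATE[main], row[main])
    annotations = {
        (cert, col, val)
        for col, val in row.items()
        if col not in ("certificate", "person", main)
    }
    return triple, annotations
```

Calling `asymmetric(row, "place")` and `asymmetric(row, "year")` yields two different triples for the same identifier `b1`, which is the kind of divergence that arises when two publishers make this decision independently.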

> They NEED to be distinct properties, and depending on what you are talking about (people or birth certificates) you use the former or the latter.
> 
> 
>> And I still think this is a fundamental problem with this example: “two departments decide to expose this data as LOD, but in different ways.” That would be one thing if they were each exposing LOD using local identifiers, but they’ve both used the universal identifiers (b1, b2, …) for the reification in incompatible ways.
> 
> They are not incompatible.
> You are assuming that organisations are rational entities that structure their data in a syntactically uniform and consistent way all over the world. The fact that this assumption is not true is witnessed by the mess that enterprises have in doing data integration, which is the main raison d’être of semantic web technologies: deal with syntactically different ways of representing semantically equivalent information.

As somebody who has done exactly this sort of modeling professionally, I can assure you that I am not making that assumption. The fact that this is a complex modeling choice is all the more reason why I don’t think two different departments can make different modeling decisions while using the same universal identifiers.

I think there are many use cases for RDF-star that are not primarily focused on reification, and I wonder if the tentative agreement over the `rdf:reifies` naming didn't tacitly shift the discussion a bit in this regard. We seem to be talking a lot more about n-ary relations now, rather than “statements about statements.” I'm interested mostly in triple-level provenance/annotation and improved interoperability with LPG data, while you seem primarily focused on reification of n-ary relationships, and I think this discussion is revealing that these varying desires can be at odds with each other.

Thanks,
Greg

Received on Wednesday, 27 March 2024 14:17:22 UTC