Re: rdf:reifies many-to-many vs. many-to-one from Thomas Lörtsch on 2024-03-28 (public-rdf-star-wg@w3.org from March 2024)

From: Thomas Lörtsch <tl@rat.io>
Date: Thu, 28 Mar 2024 15:02:06 +0100
To: Souripriya Das <SOURIPRIYA.DAS@oracle.com>
Cc: RDF-star WG <public-rdf-star-wg@w3.org>
Message-Id: <80BA0291-FC42-4816-9932-19B92C0EE375@rat.io>

> On 28. Mar 2024, at 13:00, Souripriya Das <SOURIPRIYA.DAS@oracle.com> wrote:
> 
> Wondering if staying with many-to-one for rdf:reifies will keep things simpler for the reader. Consider the following example.
> 
> Assuming that the following should hold in a domain:
>     :Single owl:disjointWith :Married .
> 
> How do the following RDF datasets appear to a reader?
> DS-1 (requires many-to-many)=>
>     :e rdf:reifies <<( :s rdf:type :Married )>>, <<( :s rdf:type :Single )>> .
>     :e :accTo :marriageRegistrar .
> DS-2=>
>     :e1 rdf:reifies <<( :s rdf:type :Married )>> .
>     :e2 rdf:reifies <<( :s rdf:type :Single )>> .
>     :e1 :accTo :marriageRegistrar .
>     :e2 :accTo :marriageRegistrar .
> 
> Would the following be a reasonable assessment, keeping the (naive) reader in mind? 
> - DS-1 is more concise, but could be confusing. 
> - DS-2 is simpler and less confusing.

To me DS-1 feels immediately familiar, whereas DS-2 feels verbose. It is the verbosity of DS-2 that I find confusing, not the simple list of triple terms in DS-1.

Some thoughts:

Grouping reifications by their attribute can probably be considered a very basic use case, a need that will inevitably arise.
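
As a rough illustration of that grouping, here is a minimal Python sketch. It models DS-2 from the quoted message as plain tuples rather than using any RDF library; the function name group_by_annotation and the tuple layout are assumptions made for this example, not part of any proposal.

```python
from collections import defaultdict

# DS-2 from the quoted message, as plain tuples (no RDF library involved).
# A reifies entry is (reifier, triple_term); an annotation is (reifier, property, value).
reifies = [
    (":e1", (":s", "rdf:type", ":Married")),
    (":e2", (":s", "rdf:type", ":Single")),
]
annotations = [
    (":e1", ":accTo", ":marriageRegistrar"),
    (":e2", ":accTo", ":marriageRegistrar"),
]


def group_by_annotation(reifies, annotations):
    """Collect the triple terms that share the same (property, value) annotation."""
    # First index the triple terms by their reifier.
    terms_of = defaultdict(set)
    for reifier, term in reifies:
        terms_of[reifier].add(term)
    # Then merge the terms of all reifiers carrying the same annotation.
    groups = defaultdict(set)
    for reifier, prop, value in annotations:
        groups[(prop, value)] |= terms_of[reifier]
    return dict(groups)


groups = group_by_annotation(reifies, annotations)
```

Here both triple terms end up in the single group keyed by (":accTo", ":marriageRegistrar"), which is the grouping that DS-1 states directly with one reifier.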

Up to now, grouping has been realized with named graphs, but there is strong opposition to basing the annotation mechanism on named graphs. Ergo, we should make sure that what we design works not only on single triple terms but also on sets of them.

That should be done in a way that users don’t need to know upfront whether an annotation targets a reification referring to only a single triple term or to several of them. (That should be easy with SPARQL-star, but it is impossible when single-triple annotations are encoded as triple term reifications while multiples thereof are encoded as named graphs.)
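
To make the "no need to know upfront" point concrete, here is a small Python sketch, not SPARQL-star. It uses plain tuples instead of an RDF library; the helper annotated_terms and the data layout are hypothetical and chosen only for this example.

```python
married = (":s", "rdf:type", ":Married")
single = (":s", "rdf:type", ":Single")

# DS-1 style: one reifier for several triple terms (many-to-many rdf:reifies).
ds1_reifies = [(":e", married), (":e", single)]
ds1_annotations = [(":e", ":accTo", ":marriageRegistrar")]

# DS-2 style: one reifier per triple term (many-to-one rdf:reifies).
ds2_reifies = [(":e1", married), (":e2", single)]
ds2_annotations = [
    (":e1", ":accTo", ":marriageRegistrar"),
    (":e2", ":accTo", ":marriageRegistrar"),
]


def annotated_terms(reifies, annotations, prop, value):
    """Return every triple term whose reifier carries the given annotation."""
    reifiers = {r for r, p, v in annotations if p == prop and v == value}
    return {term for r, term in reifies if r in reifiers}


from_ds1 = annotated_terms(ds1_reifies, ds1_annotations, ":accTo", ":marriageRegistrar")
from_ds2 = annotated_terms(ds2_reifies, ds2_annotations, ":accTo", ":marriageRegistrar")
```

The same lookup works on both encodings and returns the same set of triple terms, because the caller never has to ask how many triple terms a reifier stands for.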

The annotation syntax, e.g. '<< :e1 | :s :p :o >>', should be extended to allow multiple triples as well, e.g. '<< :e2 | :s :p :o . :x :y :z . >>'. Otherwise annotating multiple triple terms always has to resort to the more verbose N-Triples syntax with explicit rdf:reifies statements.

Such a solution would make semantically sound grouping available to RDF proper. The guidance w.r.t. named graphs would be to use them only for application-specific purposes, outside the realm of data sharing and integration. This would mean that we bite the bullet and accept that named graphs cannot be saved for anything other than out-of-band activities. Note that this is not my position, but it is a position that would allow us to move forward.

Keeping named graphs as a (semantically unsound) grouping device and designing triple term annotations as a one-trick pony to enable LPG-style modelling in RDF is not a very elegant or coherent design, and that lack of elegance and coherence will lead to a lot of questions, frustrations, and need for explanations: exactly the thing that Ora fears.

Note also that another need will inevitably appear: the desire to state AND annotate a set of statements in one go, leading to the need for another syntactic device.

Please note as well that the Nested Named Graphs proposal [0] has all those issues and needs covered. However, it ran into a roadblock that so far we haven’t been able to overcome: SPARQL is not really made for querying quads, and annotations can too easily get lost in the course of a query. Overcoming that requires a lot more effort than we can currently muster.

Best,
Thomas

[0] https://github.com/rat10/nng

> Thanks,
> Souri.

Received on Thursday, 28 March 2024 14:02:16 UTC