AW: A few thoughts on RDF-star, Reification, and Labeled Property Graphs from Sasaki, Felix on 2024-04-09 (public-rdf-star-wg@w3.org from April 2024)

From: Sasaki, Felix <felix.sasaki@sap.com>
Date: Tue, 9 Apr 2024 06:33:15 +0000
To: James Anderson <anderson.james.1955@gmail.com>, RDF-star Working Group <public-rdf-star-wg@w3.org>
CC: "Lassila, Ora" <ora@amazon.com>, "Thompson, Bryan" <bryant@amazon.com>, "Bebee, Brad" <beebs@amazon.com>, "Schmidt, Michael" <schmdtm@amazon.com>, "Hartig, Olaf" <ohartig@amazon.com>, "Williams, Gregory" <ngregwil@amazon.com>
Message-ID: <AS8PR02MB6966842AA2B9FD216F2912CE85072@AS8PR02MB6966.eurprd02.prod.outlook.com>
“this third outcome, while valuable, is not one of the chartered tasks.
is it the intent of this note to suggest that the charter should be extended?”

I interpret this rather as a statement about what would be a useful outcome of this group for ensuring the relevance of RDF compared to LPGs. My experience in my day-to-day job matches what Ora states:
“Over the last several years we have seen LPGs increase their popularity thanks to easy-to-understand and easy-to-use features, even when RDF offers more sophisticated features such as (for example) easy graph merging, federated queries, and expressive schema languages.”

On the question:

“This suggests to restrict the more capable model to conform with the limitations of the less capable model, not as a matter of usage or a conventional profile, but as a required characteristic.
why would one do this?”

Standardization history shows that restricting capabilities can be a path to wide adoption. XML is mostly a restriction of SGML that removed features lacking strong use-case needs. One can argue about XML in general; I won’t do that here. I bring up this example only to claim that fewer features, motivated by the strength of use-case needs, can be a good decision in standardization.

Best,

Felix

From: James Anderson <anderson.james.1955@gmail.com>
Date: Tuesday, 9 April 2024 at 01:49
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Cc: Lassila, Ora <ora@amazon.com>, Thompson, Bryan <bryant@amazon.com>, Bebee, Brad <beebs@amazon.com>, Schmidt, Michael <schmdtm@amazon.com>, Hartig, Olaf <ohartig@amazon.com>, Williams, Gregory <ngregwil@amazon.com>
Subject: Re: A few thoughts on RDF-star, Reification, and Labeled Property Graphs

good morning;

> On 8. Apr 2024, at 23:42, Lassila, Ora <ora@amazon.com> wrote:
>
> The Amazon Neptune team is committed to lowering the barriers to the adoption of graph databases and graph-based computing. Our customers benefit when we reduce the conceptual and technological gap between RDF graphs and Labeled Property Graphs (LPGs). Over the last several years we have seen LPGs increase their popularity thanks to easy-to-understand and easy-to-use features, even when RDF offers more sophisticated features such as (for example) easy graph merging, federated queries, and expressive schema languages. The importance and relevance of interoperability between RDF and LPG was established several years ago at the W3C workshop on Web Standardization for Graph Data (Creating Bridges: RDF, Property Graph and SQL) [1]. While its origins are much older, the RDF-star Community Group was established in the wake of this event. We believe that improving the ability for RDF and LPG graphs to interoperate will benefit the entire graph community.
>
> As we see it, the most critical outcomes of the work of the RDF-star working group should include:
>     • Efficient RDF support for “edge properties”, including the ability to have different property sets for otherwise identical edges (LPGs do not have the restriction RDF has where triples are unique in a graph).
>     • Simple and clear RDF support for statements about statements (supporting provenance mechanisms and other identified use cases).
>     • Laying the groundwork for interoperability “in the data” between RDF and LPG languages (e.g., a single database that can expose both LPG and RDF based query languages over the same data).

this third outcome, while valuable, is not one of the chartered tasks.
is it the intent of this note to suggest that the charter should be extended?

>  The alignment of features and capabilities between RDF and LPGs is possible if there are no fundamental incompatibilities between the two graph models. The RDF-star Working Group’s original goal, an easy mechanism for making “statements about statements”, would make the gap between the two models significantly smaller; statements about statements are a feature similar to “edge properties” in LPGs, the lack of which in RDF we often hear cited as the reason users choose LPGs. On the other hand, the current proposal the WG is entertaining, the “single reifier multiple triples” model, has no clear counterpart in LPGs, renders the two graph models even more different than they are today, adds significant complexity (there are more expressive alternatives with simpler semantics), and makes it even more difficult to understand RDF reification rather than offering a conceptually simple framework.
>
> Limiting reifiers to single statements – and classifying scenarios with a single reifier for multiple statements as “non-well formed” – will bring the greatest benefit to the graph community at large. On the other hand, allowing a single reifier for multiple statements will make it very difficult to align the LPG and RDF models. Please see the examples below.

this suggests to restrict the more capable model to conform with the limitations of the less capable model, not as a matter of usage or a conventional profile, but as a required characteristic.
why would one do this?

this discussion conflates two aspects of the model:
- the cardinality of the identified statements
- the cardinality of the annotations on the identified entity

it should be possible to consider them independently.

there is nothing in the examples or commentary below which substantiates any argument beyond that a profile would be expeditious.
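
A minimal Python sketch may help keep the two cardinalities apart. It is only an illustration: the function names and the tuple shapes are assumptions made for this sketch, not part of any RDF-star draft or library, and the data mirrors examples (D) and (E) from the quoted message.

```python
from collections import defaultdict

# rdf:reifies links are modelled as (reifier, triple) pairs and
# annotations as (reifier, property, value) tuples. Both shapes are
# assumptions made for this sketch.

def triples_per_reifier(reifies):
    """First cardinality: number of distinct triples each reifier identifies."""
    seen = defaultdict(set)
    for reifier, triple in reifies:
        seen[reifier].add(triple)
    return {reifier: len(triples) for reifier, triples in seen.items()}

def annotations_per_reifier(annotations):
    """Second cardinality: number of annotations attached to each reifier."""
    counts = defaultdict(int)
    for reifier, _prop, _value in annotations:
        counts[reifier] += 1
    return dict(counts)

# Example (D): one reifier, one triple, one annotation.
reifies_d = [("_:b", (":s", ":p", ":o"))]
annotations_d = [("_:b", ":ep", 1)]

# Example (E): one reifier shared by two triples, with two annotations.
reifies_e = [(":b", (":s1", ":p", ":o")), (":b", (":s2", ":p", ":o"))]
annotations_e = [(":b", ":ep", 1), (":b", ":ep", 2)]

print(triples_per_reifier(reifies_d), annotations_per_reifier(annotations_d))
print(triples_per_reifier(reifies_e), annotations_per_reifier(annotations_e))
```

The two counts are computed from separate inputs, so a rule that bounds the first (triples per reifier) says nothing about the second (annotations per reifier).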

>
> We strongly believe that the continued relevance of RDF depends on establishing interoperability with LPGs. As stated above, RDF brings some tremendous advantages, and we are committed to bringing these advantages to the community of LPG users as well. We believe that this reflects the spirit of the W3C workshop on Web Standardization for Graph Data and resonates with inputs from some other members of the working group.
>
> [1] https://www.w3.org/Data/events/data-ws-2019/
>  Examples:
>
> # (A) An LPG edge with a single edge property.
> (s) - [p {ep: 1}] → (o)
>
> # (B) An interpretation of that in an SPOI model (where I is a statement identifier).
> # The OneGraph model is based on such SPOI tuples.
> s p o :sid1
> :sid1 ep 1 :sid2
>
> # (C) An RDF-star expression consisting of an asserted triple and a statement about
> # that.
> :s :p :o {| :ep 1 |} # with an anonymous identifier for the (s p o) statement.
>
> # (D) The RDF interpretation of that RDF-star expression.
> :s :p :o . # The asserted triple.
> _:b rdf:reifies <<( :s :p :o )>> . # A reifier for that triple.
> _:b :ep 1 . # Using that reifier to make an assertion about a triple occurrence.
>
> # Note that the LPG example (A), the SPOI interpretation (B), and the RDF model (D)
> # can be handled as exactly the same data within possible database implementations
> # such as proposed by Souri or by a OneGraph implementation.  The case where _:b is
> # replaced by an IRI can also be handled under LPG, 1G, etc.
>
> # Now, let us look at the case where different statements are assigned the same
> # reifier:
>
> # (E) Same reifier used in two expressions about different triples.
> :s1 :p :o {| :b | :ep 1 |} # a statement about a statement with reifier ":b".
> :s2 :p :o {| :b | :ep 2 |} # a statement about a different statement, same reifier.
>
> # This last case (E) has no sensible interpretation under LPG.
>
> # If we accept a constraint that using the same reifiers for different TripleTerms
> # is not well-formed, then we can maintain a consistent interpretation with LPG edge
> # properties.  Further, we can use explicit modelling to group statements and retain
> # transparency about the functional or semantic roles in such groupings.
>
> # (F) Two statements are being grouped by an explicit semantic relationship (:partOf).
> :s1 :p :o {| :b1 | :ep 1 |}
> :s2 :p :o {| :b2 | :ep 2 |}
> :b1 :partOf :b
> :b2 :partOf :b
>
> # We submit that this explicit modelling is more useful and preserves the alignment
> # with LPG and RDF which has such great value to the world wide graph community.
>   --
> Dr. Ora Lassila
> Principal Technologist, Amazon Neptune
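
The constraint proposed in the quoted examples (a reifier identifies at most one triple) could be checked mechanically, and data that passes the check maps directly to LPG-style edges with properties. The Python sketch below is a hedged illustration of that idea only. The function `to_lpg_edges` and its tuple shapes are hypothetical, and treating each property as single-valued is a simplifying assumption, not something the quoted message states.

```python
from collections import defaultdict

def to_lpg_edges(reifies, annotations):
    """Turn reifier data into LPG-style edges: (s, p, o, {property: value}).

    reifies:     iterable of (reifier, (s, p, o)) pairs
    annotations: iterable of (reifier, property, value) tuples
    Raises ValueError if one reifier identifies more than one triple.
    Assumption: each edge property holds a single value (later values win).
    """
    triple_of = {}
    for reifier, triple in reifies:
        # setdefault stores the first triple seen for this reifier and
        # returns it; any different triple afterwards is a violation.
        if triple_of.setdefault(reifier, triple) != triple:
            raise ValueError(f"reifier {reifier} identifies more than one triple")
    props = defaultdict(dict)
    for reifier, prop, value in annotations:
        props[reifier][prop] = value
    # One edge per reifier, so otherwise identical edges stay distinct.
    return [(s, p, o, props[r]) for r, (s, p, o) in triple_of.items()]

# Example (D): a single reifier for a single triple.
edges_d = to_lpg_edges([("_:b", (":s", ":p", ":o"))], [("_:b", ":ep", 1)])

# Example (F): two reifiers, each for its own triple.
edges_f = to_lpg_edges(
    [(":b1", (":s1", ":p", ":o")), (":b2", (":s2", ":p", ":o"))],
    [(":b1", ":ep", 1), (":b2", ":ep", 2)],
)

# Example (E): the same reifier for two different triples is rejected.
try:
    to_lpg_edges(
        [(":b", (":s1", ":p", ":o")), (":b", (":s2", ":p", ":o"))],
        [(":b", ":ep", 1), (":b", ":ep", 2)],
    )
    example_e_rejected = False
except ValueError:
    example_e_rejected = True
```

Whether such a check belongs in the data model itself or in a profile is exactly the point under discussion in this thread; the sketch works the same either way.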


---
james anderson | james@dydra.com
Received on Tuesday, 9 April 2024 06:33:22 UTC