Re: Using rdf:asserts (for Truth) and rdf:reifies (for Hypothesis) to be on par with relational and LPG from Andy Seaborne on 2024-08-22 (public-rdf-star-wg@w3.org from August 2024)

From: Andy Seaborne <andy@apache.org>
Date: Thu, 22 Aug 2024 12:01:14 +0100
To: public-rdf-star-wg@w3.org
Message-ID: <dedc8bb8-d8f5-4ea7-b437-06bde6cf650e@apache.org>

On 22/08/2024 05:16, Souripriya Das wrote:
> Suppose there are two kinds of statements that we need to store using 
> our data model: Truth and Hypothesis.
>
> Within an id-based scope (i.e., a scope that contains id-s-p-o tuples), 
> I'd like to specify an s-p-o either as a Truth or as a Hypothesis. If we 
> use a single RDF-blessed property, rdf:reifies, we can do this as follows:
>      :id rdf:reifies <<( :s :p :o )>> ; a :Truth .
>      OR
>      :id rdf:reifies <<( :s :p :o )>> ; a :Hypothesis .
> 
> What's the complexity? An extra triple has to be added to designate an 
> id-s-p-o tuple as a Truth or a Hypothesis. Retrieval of one kind of 
> statements via SPARQL then requires an additional triple-pattern: 
> Example=> { ?id rdf:reifies <<( ?s ?p ?o )>> ; a :Truth } .
>
> Doing this in RDF is more complex than what's needed in the case of 
> relational data. There we do not need to add any extra row to the table 
> to indicate this – we just need an extra column, in the existing row, to 
> be populated. SQL query simply needs an extra condition (e.g., 
> <tableAlias>.kind = 'Truth'), not an extra join.
> 
> Doing this in RDF is more complex than what's needed in the case of LPG 
> data as well. There, assuming edge is actually stored as a struct or 
> record, all that is needed is adding and populating an extra attribute 
> (i.e., edge-property). The query for the extended LPG data only needs an 
> extra filter (e.g., <edge-var>.kind = "Truth").
> 
> So, to avoid this Truth/Hypothesis handling complexity related 
> disadvantage compared to relational data and LPG, my suggestion would be 
> to integrate the Truth or Hypothesis indicator into the id-s-p-o tuple 
> itself by providing two distinct RDF-blessed properties, say rdf:asserts 
> (for Truth) and rdf:reifies (for Hypothesis) – all within the id-based 
> scope. (This does not affect s-p-o triples in any way because those are 
> not part of any id-based scope. Those are in a "default" (no id) scope.)

Truth/Hypothesis are not the only cases.

Consider a trivial vocabulary:

ex:added      -- fact added to the graph
ex:retracted  -- fact retracted
ex:source     -- URL where a triple occurred.

These are orthogonal aspects [1].

There could be a subproperty of rdf:reifies in a vocabulary that implies 
a Truth or Hypothesis class for the reifier, making the extra explicit 
type triple unnecessary. Such usage patterns could be defined as a 
profile. A simple ingestion pipeline can materialize the type triple if 
desired.
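
As a minimal sketch of such a materialization step, assuming two 
hypothetical subproperties of rdf:reifies (ex:assertsTruth and 
ex:statesHypothesis, which are not part of any published vocabulary) and 
representing triples as plain Python tuples rather than using an RDF 
library. In RDFS terms, the mapping below stands in for rdfs:subPropertyOf 
and rdfs:domain declarations on those properties:

```python
# Illustrative only: the ex: names are hypothetical, and a triple term
# <<( :s :p :o )>> is modelled here as a nested (s, p, o) tuple.
RDF_REIFIES = "rdf:reifies"
RDF_TYPE = "rdf:type"

# Each hypothetical subproperty of rdf:reifies and the class it implies
# for the reifier (the subject of the triple).
IMPLIED_CLASS = {
    "ex:assertsTruth": "ex:Truth",
    "ex:statesHypothesis": "ex:Hypothesis",
}


def materialize(triples):
    """Return the input triples plus the triples the subproperties imply."""
    out = set(triples)
    for s, p, o in triples:
        cls = IMPLIED_CLASS.get(p)
        if cls is not None:
            out.add((s, RDF_REIFIES, o))  # from the subproperty relation
            out.add((s, RDF_TYPE, cls))   # implied class of the reifier
    return out


data = {
    (":id1", "ex:assertsTruth", (":s", ":p", ":o")),
    (":id2", "ex:statesHypothesis", (":s2", ":p", ":o")),
}
materialized = materialize(data)
```

After this step, a query written against the generic pattern 
(rdf:reifies plus a type triple) would match the same data, while the 
stored form needs only one triple per reifier.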

At the moment, there is little to no experience of using RDF 1.2 for 
mapping relational data or LPGs. I would hope that a vocabulary and 
profile appear and mature over a longer timescale. Making anything 
RDF-blessed (meaning in the rdf: namespace) seems premature. It can 
emerge later; removing things is harder than adding them.

     Andy

> 
> Thanks,
> Souri.
> 
> 

[1] https://www.w3.org/2009/12/rdf-ws/papers/ws12

Received on Thursday, 22 August 2024 11:01:22 UTC