Re: [External] : Re: handling ("asserted") s-p-o triples and ("stated" or "reified") id-s-p-o 4-tuples in RDF/SPARQL from Thomas Lörtsch on 2024-08-13 (public-rdf-star-wg@w3.org from August 2024)

From: Thomas Lörtsch <tl@rat.io>
Date: Tue, 13 Aug 2024 14:07:20 +0200
To: Souripriya Das <SOURIPRIYA.DAS@oracle.com>
Cc: RDF-star WG <public-rdf-star-wg@w3.org>
Message-Id: <4520D6C7-A1E4-420E-B307-3D72AF27655C@rat.io>

Hi Souri,

thank you for this proposal! It seems to fit well together with my recently updated proposal [0] in which I had elaborated on the semantics of 'rdf:states' as a proper representation of the meaning of the Turtle-star annotation syntax, but only sketched the role of SPARQL.

You say that "entailment does not play any role", from which I take it that you plan to expand the query patterns into regular BGPs. Or do you imagine a rule running in the background that materializes stated triple terms as assertions?

Concerning the patterns you propose: I interpret e.g. [AS] as a join, i.e. it requires ':s :p :o' to be present in the graph as both a statement and a 'stated term'. Do I understand that right?

IMO it is important to support not just joins, but more intricate combinations. The following is a complete set of patterns that differentiates between AND and OR combinations:
- asserted
- stated
- reified                              # OPTION 1
- asserted and stated
- asserted or stated                   # OPTION 2
- asserted and reified
- asserted or reified
- stated and reified
- stated or reified
- asserted and stated and reified
- asserted and stated or reified
- asserted or stated and reified
- asserted or stated or reified        # OPTION 3

("asserted" meaning a regular RDF statement in the graph, "stated" meaning an instantiation via 'rdfs:states', and "reified" meaning an instantiation via 'rdf:reifies')
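
As a rough illustration of the list above, AND can be read as set intersection and OR as set union over the triples found in each of the three forms. The following Python sketch assumes that reading. The tuple encoding of triple terms, the helper names, and the rule that AND binds tighter than OR in the mixed entries are assumptions made for this illustration only; the data mirrors the example graph used for the queries in this message.

```python
# Sketch only: triples are (s, p, o) tuples; a triple term is a nested
# tuple in object position. This encoding is an assumption, not an RDF API.
STATES, REIFIES = "rdfs:states", "rdf:reifies"

graph = {
    (":s", ":p", ":o1"),
    (":s", ":p", ":o2"),
    (":t2", STATES, (":s", ":p", ":o2")),
    (":t2", ":x", ":y"),
    (":t3", REIFIES, (":s", ":p", ":o3")),
    (":t3", ":x", ":z"),
}

def matches(triple):
    """True for triples of the form `:s :p ?o`."""
    return triple[0] == ":s" and triple[1] == ":p"

def terms(prop):
    """Triple terms matching `:s :p ?o` that occur as the object of `prop`."""
    return {o for (_, p, o) in graph
            if p == prop and isinstance(o, tuple) and matches(o)}

asserted = {t for t in graph if matches(t)}
stated = terms(STATES)
reified = terms(REIFIES)

# The 13 combinations from the list; AND = intersection, OR = union.
# Assumption: in the mixed entries, AND binds tighter than OR.
combinations = {
    "asserted": asserted,
    "stated": stated,
    "reified": reified,
    "asserted and stated": asserted & stated,
    "asserted or stated": asserted | stated,
    "asserted and reified": asserted & reified,
    "asserted or reified": asserted | reified,
    "stated and reified": stated & reified,
    "stated or reified": stated | reified,
    "asserted and stated and reified": asserted & stated & reified,
    "asserted and stated or reified": (asserted & stated) | reified,
    "asserted or stated and reified": asserted | (stated & reified),
    "asserted or stated or reified": asserted | stated | reified,
}

for name, triples in combinations.items():
    print(f"{name}: {sorted(triples)}")
```

Read this way, the three OPTION entries are simply the 'reified' set and the two pure unions.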

I expect the OR queries to be at least as important as the regular JOINs of BGPs. 
IMO the query types annotated as OPTION 1-3 are bound to be especially relevant in practice:
1) only reified terms (to address the use case of unstated statements)
2) asserted OR stated (to get everything that considers the statement to be true)
3) everything, no matter if asserted OR stated OR reified (e.g. to get an overview)
I don’t see a way to provide nice syntactic support for all the combinations listed above, but those three should definitely get some syntactic sugar.

I always stress that I’m not good at SPARQL, but I’m trying. The following queries are my attempts at options 1-3 in standard SPARQL, also asking for any annotations. Given the example graph:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://ex.org/> .

:s :p :o1 .
:s :p :o2 .
:t2 rdfs:states <<( :s :p :o2 )>> ;
    :x :y .
:t3 rdf:reifies <<( :s :p :o3 )>> ;
    :x :z .


OPTION 1 (only reified)
=======================

SELECT *
WHERE {
    ?id rdf:reifies <<( :s :p ?o )>> .
    OPTIONAL { ?id ?a ?b . }
}

should return (the OPTIONAL also matches the rdf:reifies triple itself)
?id  ?o   ?a           ?b
:t3  :o3  rdf:reifies  <<( :s :p :o3 )>>
:t3  :o3  :x           :z


OPTION 2 (asserted or stated)
=============================

SELECT *
WHERE {
    { :s :p ?o . }
    UNION 
    { ?id rdfs:states <<( :s :p ?o )>> .
      OPTIONAL { ?id ?a ?b . } }
}

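should return (?id, ?a and ?b stay unbound for the asserted triples; the OPTIONAL also matches the rdfs:states triple itself)
?id  ?o   ?a           ?b
     :o1
     :o2
:t2  :o2  rdfs:states  <<( :s :p :o2 )>>
:t2  :o2  :x           :y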

OPTION 3 (everything)
=====================

SELECT *
WHERE {
    { :s :p ?o . }
    UNION 
    { ?id rdfs:states <<( :s :p ?o )>> .
      OPTIONAL { ?id ?a ?b . } }
    UNION 
    { ?id rdf:reifies <<( :s :p ?o )>> .
      OPTIONAL { ?id ?a ?b . } }
}

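should return (?id, ?a and ?b stay unbound for the asserted triples; the OPTIONAL also matches the rdfs:states and rdf:reifies triples themselves)
?id  ?o   ?a           ?b
     :o1
     :o2
:t2  :o2  rdfs:states  <<( :s :p :o2 )>>
:t2  :o2  :x           :y
:t3  :o3  rdf:reifies  <<( :s :p :o3 )>>
:t3  :o3  :x           :z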
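
To check what the three queries would yield on the example graph, the following Python sketch evaluates them by hand. It is not a SPARQL engine: the tuple encoding of triple terms, the use of None for unbound variables, and the helper names are assumptions made for illustration. Two things stand out when the patterns are evaluated this way: the asserted branch leaves ?id, ?a and ?b unbound, and the OPTIONAL pattern '?id ?a ?b' also matches the rdfs:states or rdf:reifies triple itself.

```python
# Sketch only: triples are (s, p, o) tuples, triple terms are nested tuples,
# and None stands for an unbound variable. All names here are hypothetical.
STATES, REIFIES = "rdfs:states", "rdf:reifies"

GRAPH = {
    (":s", ":p", ":o1"),
    (":s", ":p", ":o2"),
    (":t2", STATES, (":s", ":p", ":o2")),
    (":t2", ":x", ":y"),
    (":t3", REIFIES, (":s", ":p", ":o3")),
    (":t3", ":x", ":z"),
}

def asserted_branch(graph):
    """Branch { :s :p ?o . }: only ?o gets a binding."""
    return [{"id": None, "o": o, "a": None, "b": None}
            for (s, p, o) in graph if s == ":s" and p == ":p"]

def term_branch(graph, prop):
    """Branch { ?id prop <<( :s :p ?o )>> . OPTIONAL { ?id ?a ?b . } }."""
    rows = []
    for (ident, p, term) in graph:
        if p != prop or not isinstance(term, tuple) or term[:2] != (":s", ":p"):
            continue
        # Every triple with subject ?id matches the OPTIONAL part,
        # including the prop triple that bound ?id in the first place.
        extras = [(a, b) for (s, a, b) in graph if s == ident]
        # The fallback keeps the row if nothing matches (never the case here).
        for a, b in extras or [(None, None)]:
            rows.append({"id": ident, "o": term[2], "a": a, "b": b})
    return rows

# UNION is modelled as list concatenation of the branch results.
option1 = term_branch(GRAPH, REIFIES)
option2 = asserted_branch(GRAPH) + term_branch(GRAPH, STATES)
option3 = option2 + term_branch(GRAPH, REIFIES)

for label, rows in (("OPTION 1", option1), ("OPTION 2", option2), ("OPTION 3", option3)):
    print(label)
    for row in sorted(rows, key=repr):
        print("   ", row)
```

If only the annotations are wanted, a FILTER on ?a inside the OPTIONAL (excluding rdfs:states and rdf:reifies) would be one way to drop the extra rows.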


SPARQL SYNTACTIC SUGAR
======================
might look like this (I'm placing the qualifiers you proposed after the triple pattern instead of before it, in the hope of better readability and to align with the recent re-design of the Turtle-star annotation syntax):

OPTION 1 (only reified)

SELECT *
WHERE {
    :s :p ?o ~ ?id
}

OPTION 2 (asserted or stated)

SELECT *
WHERE {
    :s :p ?o | ?id
}

OPTION 3 (everything)

SELECT *
WHERE {
    :s :p ?o * ?id
}

The symbols appear pretty weak and a keyword might be better, and of course the choice of postfix over prefix notation warrants more debate. But it seems to me that it would be pretty good, and a mighty step forward in terms of expressivity, if something more or less like this became possible.
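
A processor could support such sugar by rewriting it into the standard queries shown earlier in this message. The Python sketch below does this as plain string expansion. The function names and the operator table (~ for reified only, | for asserted or stated, * for everything) follow the postfix notation sketched here and are hypothetical, not part of any SPARQL specification or library.

```python
# Sketch only: expands the hypothetical postfix qualifiers into standard
# SPARQL text. Nothing here is an existing API; all names are made up.

def term_branch(prop, s, p, o, id_var):
    """Group pattern for an identifier that points at a triple term via `prop`."""
    return (f"{{ {id_var} {prop} <<( {s} {p} {o} )>> .\n"
            f"      OPTIONAL {{ {id_var} ?a ?b . }} }}")

def expand(s, p, o, op, id_var="?id"):
    """Expand the sugar `s p o <op> id_var` into a standard SELECT query."""
    asserted = f"{{ {s} {p} {o} . }}"
    stated = term_branch("rdfs:states", s, p, o, id_var)
    reified = term_branch("rdf:reifies", s, p, o, id_var)
    table = {
        "~": [reified],                     # OPTION 1: only reified
        "|": [asserted, stated],            # OPTION 2: asserted or stated
        "*": [asserted, stated, reified],   # OPTION 3: everything
    }
    if op not in table:
        raise ValueError(f"unknown qualifier: {op!r}")
    body = "\n    UNION\n    ".join(table[op])
    return f"SELECT *\nWHERE {{\n    {body}\n}}"

print(expand(":s", ":p", "?o", "|"))
```

Plain rewriting of this kind would need no entailment; whether the expansion belongs in the query parser or in a preprocessing step is left open here.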

Best,
Thomas


[0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Aug/0007.html


> On 12. Aug 2024, at 14:27, Souripriya Das <SOURIPRIYA.DAS@oracle.com> wrote:
> 
> Here is a simplified version of the SPARQL pattern extensions needed to retrieve (asserted) s-p-o triples, stated id-s-p-o tuples, and reified id-s-p-o tuples. (See [1] and its followup in [2] for the original version.) Please note that this is purely for SPARQL –  entailment does not play any role.
> 
> Types of tuples :
> ============
>     type [A] => :s :p :o . ==> (asserted) s-p-o triple
>     type [S] => :id rdf:states <<( :s :p :o )>> . ==> "stated" id-s-p-o tuple (s-p-o triple under id)
>     type [R] => :id rdf:reifies <<( :s :p :o )>> . ==> "reified" id-s-p-o tuple (s-p-o triple under id)
> 
> SPARQL pattern extensions: (to allow retrieval of all seven non-empty subsets in the power set: [A], [S], [R], [AS], [SR], [AR], [ASR] )
> =======================
> 0) [A] To find (asserted) s-p-o triples only ==> ?s ?p ?o . 
> 1) [AS] To find (asserted) s-p-o triples and "stated" id-s-p-o tuples ==> ?id ~ ?s ?p ?o .
> 2) [AR] To find (asserted) s-p-o triples and "reified" id-s-p-o tuples ==> ?id | ?s ?p ?o .
> 3) [ASR] To find all – (asserted) s-p-o triples, "stated" id-s-p-o tuples, and "reified" id-s-p-o tuples ==> ?id * ?s ?p ?o .
> 
> Use FILTER bound(?id)=TRUE to exclude the (asserted) s-p-o triples in choices 1, 2, and 3 above to implement retrieval choices [S], [R], [SR], respectively.
> 
> Thanks,
> Souri.
> 
> [1] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Aug/0010.html (Souri, Mon, 5 Aug 2024 19:29:29 +0000)
> [2] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Aug/0011.html (Souri, Mon, 5 Aug 2024 22:37:50 +0000)
> From: Souripriya Das <souripriya.das@oracle.com>
> Sent: Tuesday, August 6, 2024 9:47 AM
> To: Thomas Lörtsch <tl@rat.io>
> Cc: RDF-star WG <public-rdf-star-wg@w3.org>
> Subject: Re: [External] : Re: handling ("asserted") s-p-o triples and ("stated" or "reified") id-s-p-o 4-tuples in RDF/SPARQL
>  
> Hi Thomas,
> >> We can impose the following constraint on RDF graph content:
> >> The 4-tuple, id-s-p-o, must be unique in an RDF graph. If present, an id-s-p-o can either be "stated" or "reified".
> >>
> >> SPARQL INSERT, upon successful completion, will have the following effect on the target RDF graph's content:
> >> - INSERT DATA { :id rdf:states <<( :s :p :o )>> } => the graph will contain id-s-p-o as "stated".
> >> - INSERT DATA { :id rdf:reifies <<( :s :p :o )>> } => the graph will contain id-s-p-o -- as "stated", if it was already present as "stated", and as "reified" otherwise.
> 
> > That misses solid support for annotations on statements without actually stating them, i.e. an annotation cannot be understood as referring to an unstated (maybe even considered not true) triple if a token of that triple is present in the graph. In that respect it doesn’t go beyond the "working baseline".
> >
> > Why don’t you allow
> >
> >     :id rdf:states <<( :s :p :o )>> ;
> >         rdf:reifies <<( :s :p :o )>> .
> Use of two "parallel" 4-tuples (i.e., same s-p-o but distinct identifiers), one using rdf:states and the other using rdf:reifies, would allow annotations to be associated with the right "flavor" – "stated" or "reified" – of the same s-p-o.
>         :id1 rdf:states <<( :s :p :o )>> .
>         :id2 rdf:reifies <<( :s :p :o )>> .
> 
> Thanks,
> Souri.
> 
> From: Thomas Lörtsch <tl@rat.io>
> Sent: Tuesday, August 6, 2024 6:52 AM
> To: Souripriya Das <souripriya.das@oracle.com>
> Cc: RDF-star WG <public-rdf-star-wg@w3.org>
> Subject: [External] : Re: handling ("asserted") s-p-o triples and ("stated" or "reified") id-s-p-o 4-tuples in RDF/SPARQL
>  
> 
> 
> > On 5. Aug 2024, at 21:29, Souripriya Das <souripriya.das@oracle.com> wrote:
> > 
> > Notations I am using below for the three proposed categories of tuples in RDF1.2:
> > ============
> > - A: "Asserted" triples => present in graph as s-p-o
> > - S: "Stated under id" 4-tuples => id rdf:states s-p-o
> > - R: "Reified under id" 4-tuples => id rdf:reifies s-p-o
> > 
> > Issue A: Handling "stated" vs. "reified" for a given id and s-p-o?
> > =============================
> > I think it is better to stay within RDF for this and not involve RDFS rdfs:subPropertyOf.
> 
> I agree. Note that I went a slightly different route in my latest "Updated proposal" [0], replacing the rdfs:subPropertyOf relation with an RDFS entailment pattern, but in any case: I share the goal of staying within RDF.
> 
> > We can impose the following constraint on RDF graph content: 
> > The 4-tuple, id-s-p-o, must be unique in an RDF graph. If present, an id-s-p-o can either be "stated" or "reified".
> > 
> > SPARQL INSERT, upon successful completion, will have the following effect on the target RDF graph's content:
> > - INSERT DATA { :id rdf:states <<( :s :p :o )>> } => the graph will contain id-s-p-o as "stated".
> > - INSERT DATA { :id rdf:reifies <<( :s :p :o )>> } => the graph will contain id-s-p-o -- as "stated", if it was already present as "stated", and as "reified" otherwise.
> 
> That misses solid support for annotations on statements without actually stating them, i.e. an annotation cannot be understood as referring to an unstated (maybe even considered not true) triple if a token of that triple is present in the graph. In that respect it doesn’t go beyond the "working baseline".
> 
> Why don’t you allow
> 
>     :id rdf:states <<( :s :p :o )>> ;
>         rdf:reifies <<( :s :p :o )>> .
> 
> Is your :id a function of (:s :p :o)?
> 
> > Issue B: Should presence of "stated" id-s-p-o guarantee presence of "asserted" s-p-o?
> > ===================================
> > I do not think we should include such a guarantee.
> 
> Well, a "guarantee" would probably involve entailment.
> 
> > This is lossy unless an implementation handles it using reference counts. 
> 
> I don’t follow. What I’m proposing is rather a way of making SPARQL query not only triples but also stated triple terms. This seems to my naive eye like a simple "expansion" (hoping that I use the term correctly).
> 
> > It is better to keep presence or absence of "asserted" s-p-o triple completely independent of the presence or absence of "stated" id-s-p-o 4-tuples (for one or more values of id).
> 
> As hinted above I would find this approach more consistent if it always disambiguated three forms:
> - triple 
> - stated triple term
> - described/documented triple term
> and treated "reified" as a superproperty of all triple terms.
> 
> But it seems to me that this is orthogonal to my issue with "stated", namely that I'm trying to find a way in which an annotation can with some certainty be understood to refer to a statement that is true in the graph - "some certainty" meaning that the certainty guaranteed by entailment is not available, but hopefully can be "emulated" as proposed in my updated proposal [0].
> 
> Best,
> Thomas
> 
> 
> [0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Aug/0007.html
> 
> 
> > SPARQL: Combinations to match:
> > ===================
> > - A =>                ?s ?p ?o .    # Asserted
> > - S =>    ?id    ~    ?s ?p ?o .    # Stated
> > - R =>    ?id   | |   ?s ?p ?o .    # Reified
> > - AS =>   ?id    @    ?s ?p ?o .    # Asserted or Stated
> > - AR =>   ?id   |+|   ?s ?p ?o .    # Asserted or Reified
> > - SR =>   ?id   |~|   ?s ?p ?o .    # Stated or Reified
> > - ASR =>  ?id   |@|   ?s ?p ?o .    # Asserted or Stated or Reified
> > Note: When "asserted" s-p-o is matched, the ?id variable (if any) will not have a binding.
> 
> > Thanks,
> > Souri.
Received on Tuesday, 13 August 2024 12:07:31 UTC