Re: basing RDF-star on RDF reification from Thomas Lörtsch on 2022-12-15 (public-rdf-star-wg@w3.org from December 2022)

From: Thomas Lörtsch <tl@rat.io>
Date: Thu, 15 Dec 2022 16:15:50 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: Pierre-Antoine Champin <pierre-antoine@w3.org>, public-rdf-star-wg@w3.org
Message-Id: <A1C8CA55-CB30-4927-948B-AE4E9D002A32@rat.io>
Peter,


I’d like to discuss a problem. While writing this mail I seem to have convinced myself that it isn’t really a problem, but I still think it’s worth discussing.

RDF-star is defined on types, whereas RDF standard reification specifically refers to occurrences. That difference is important and must somehow be captured. Otherwise we lose the meaning of the link between type and occurrence. The CG report has the following example:

_:a :occurrenceOf << :s :p :o >> ;
    :in <file1.ttl> ;
    dct:creator :alice.
_:b :occurrenceOf << :s :p :o >> ;
    :in <file2.ttl> ;
    dct:creator :bob.

In a recent thread on this list [0], different properties for identifying an occurrence of a type (:occursAs, :mention, :claim etc.) have been discussed.


It is a tricky topic and it gets dangerously close to the set-based semantics of RDF, but it can’t be ignored if we want to avoid a new sort of complication: branching constructs in modelling and querying to account for triples with only one or with multiple (and multi-part) annotations.


I was fearing that the mapping you propose needs to be extended to differentiate between types and occurrences. Re-using, but also modifying your example from 10. Dec 2022, at 20:02 (see also in the quoted section below):

_:x1 :occurrenceOf <<a b c>>;
     e f .
_:x2 :occurrenceOf <<a b c>>;
     g h .

would map to:

_:x2 rdf:subject a .
_:x2 rdf:stated-subject "a"^^xsd:string .
_:x2 rdf:predicate b .
_:x2 rdf:stated-predicate "b"^^xsd:string .
_:x2 rdf:object c .
_:x2 rdf:stated-object "c"^^xsd:string .
_:x2 e f .

_:x3 rdf:subject a .
_:x3 rdf:stated-subject "a"^^xsd:string .
_:x3 rdf:predicate b .
_:x3 rdf:stated-predicate "b"^^xsd:string .
_:x3 rdf:object c .
_:x3 rdf:stated-object "c"^^xsd:string .
_:x3 g h .

to honour the distinction between the two occurrences.
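
To make the per-occurrence reading concrete, here is a minimal Python sketch. It assumes all terms are IRIs written as plain strings, and it follows the examples in this mail in leaving out the rdf:type rdf:Statement triple. The names reify, lex and map_occurrences and the _:b labels are made up for illustration; they are not part of any proposal.

```python
from itertools import count

def lex(term):
    # Simplified stand-in for the mapping L: the term's written form as an xsd:string literal.
    return f'"{term}"^^xsd:string'

def reify(node, triple):
    # Six reification triples for one embedded triple, in the order used in this mail.
    s, p, o = triple
    return [
        (node, "rdf:subject", s),
        (node, "rdf:stated-subject", lex(s)),
        (node, "rdf:predicate", p),
        (node, "rdf:stated-predicate", lex(p)),
        (node, "rdf:object", o),
        (node, "rdf:stated-object", lex(o)),
    ]

def map_occurrences(occurrences):
    # Each occurrence gets its own fresh blank node, even if the embedded triple repeats.
    labels = (f"_:b{i}" for i in count(1))
    out = []
    for triple, annotations in occurrences:
        node = next(labels)
        out.extend(reify(node, triple))
        out.extend((node, p, o) for p, o in annotations)
    return out

graph = map_occurrences([
    (("a", "b", "c"), [("e", "f")]),
    (("a", "b", "c"), [("g", "h")]),
])
```

With this input the two annotations end up on two different blank nodes, so they stay distinguishable after the mapping.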

Now of course the next problem is how to map annotations on the type to RDF: 

<<a b c>> e f .
<<a b c>> g h .

This is the default in RDF-star, but isn't covered by RDF standard reification. 

One could argue that types occur just as well as occurrences do, and that it is just a question of perspective whether one emphasises the commonality - "one more triple of type << a b c >>" - or the distinction - "this << a b c >> comes from a trustworthy source".

Under that assumption your mapping works:

_:x1 rdf:subject a .
_:x1 rdf:stated-subject "a"^^xsd:string .
_:x1 rdf:predicate b .
_:x1 rdf:stated-predicate "b"^^xsd:string .
_:x1 rdf:object c .
_:x1 rdf:stated-object "c"^^xsd:string .
_:x1 e f .
_:x1 g h .

That seems good enough. 
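
The type-level reading can be sketched in Python as follows, again assuming that all terms are IRIs written as plain strings and leaving out the rdf:type rdf:Statement triple, as the examples in this mail do. The names reify, lex and map_types are hypothetical helpers, not part of any proposal.

```python
def lex(term):
    # Simplified stand-in for the mapping L: the term's written form as an xsd:string literal.
    return f'"{term}"^^xsd:string'

def reify(node, triple):
    # Six reification triples for one embedded triple, in the order used in this mail.
    s, p, o = triple
    return [
        (node, "rdf:subject", s),
        (node, "rdf:stated-subject", lex(s)),
        (node, "rdf:predicate", p),
        (node, "rdf:stated-predicate", lex(p)),
        (node, "rdf:object", o),
        (node, "rdf:stated-object", lex(o)),
    ]

def map_types(annotated):
    # One blank node per distinct embedded triple; all annotations on that triple share it.
    nodes = {}
    out = []
    for triple, p, o in annotated:
        if triple not in nodes:
            nodes[triple] = f"_:x{len(nodes) + 1}"
            out.extend(reify(nodes[triple], triple))
        out.append((nodes[triple], p, o))
    return out

graph = map_types([
    (("a", "b", "c"), "e", "f"),
    (("a", "b", "c"), "g", "h"),
])
```

Here both annotations land on the same blank node, which is the difference from a per-occurrence mapping, where every occurrence would get a node of its own.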

Maybe we need another term to differentiate occurrences of the type from occurrences with a stronger identity of their own?

The other problems that the type-focused approach of RDF-star generates remain, but they are hardly the fault of this mapping.


Best,
Thomas


[0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2022Dec/0013.html



> On 13. Dec 2022, at 16:45, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> As far as the Community Group proposal for RDF-star and SPARQL-star goes, there is no need to ever use the mapping unless you are determining RDF-star entailment.   Even then, the mapping just specifies what the results are so any method of getting to these results is fine.  SPARQL-star uses a different mechanism entirely so the mapping is not needed in SPARQL.
> 
> As far as my proposal goes, when determining entailment all that counts is the final result so there is again no need to use the mapping for determining entailment.  Of course, it is possible to use the mapping, as in my proposal embedded triples are purely syntactic sugar.  This allows current software to correctly process embedded triples without any change to the software.
> 
> In my proposal there is no need to extend SPARQL to handle embedded triples - all that is needed is to use the mapping to expand embedded triples and then use SPARQL as it currently exists.  (It could be useful to invert the mapping on the results of CONSTRUCT queries.)  Of course there is no need to implement embedded triples in SPARQL this way - any way that comes up with the same results is allowable - so in the end there is no need to use the mapping in my proposal either.
> 
> 
> The Community Group proposal has a particular form of referential transparency/opacity - blank nodes are transparent and everything else is opaque (even "01"^^xsd:int is different from "1"^^xsd:int).  There is no way to change this.
> 
> In my proposal embedded triples have the same referential transparency/opacity but other stances are possible by constructing the appropriate reification and associated triples.
> 
> 
> peter
> 
> 
> 
> On 12/13/22 08:34, Thomas Lörtsch wrote:
>> I sympathise with Peter’s arguments for this approach, especially that the mapping he proposes ensures backwards compatibility with codebases and datasets that don’t support RDF-star.
>> 
>> However, I don’t understand which mapping is to be employed when. It certainly makes a difference whether I annotate a referentially opaque occurrence or a referentially transparent one. Would the annotation refer to the referentially transparent occurrence, with the rdf:stated-… statements just serving to document the syntactic representation (so that they could be omitted if the use case didn’t ask for such precision)?
>> 
>> 
>> Thomas
>> 
>>> On 10. Dec 2022, at 20:02, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>> 
>>> There are two differences, and they make a difference.
>>> 
>>> First, and much less important, is the use of the RDF reification vocabulary, which reduces the number of new IRIs needed and connects embedded triples to standard RDF reification.
>>> 
>>> Second, and much more important, is that there is no escaping.   This means that embedded triples are strictly syntactic sugar, i.e., there is no semantic difference between:
>>> 
>>> <<a b c>> e f .
>>> 
>>> <<a b c>> g h .
>>> 
>>> and
>>> 
>>> _:x1 rdf:subject a .
>>> 
>>> _:x1 rdf:stated-subject "a"^^xsd:string .
>>> 
>>> _:x1 rdf:predicate b .
>>> 
>>> _:x1 rdf:stated-predicate "b"^^xsd:string .
>>> 
>>> _:x1 rdf:object c .
>>> 
>>> _:x1 rdf:stated-object "c"^^xsd:string .
>>> 
>>> _:x1 e f .
>>> 
>>> _:x1 g h .
>>> 
>>> This makes it possible to infer embedded triples without using the embedded triple syntax and also makes it possible to have variations of opacity, e.g., creating entities that are just like embedded triples but whose predicates are transparent by not incorporating the rdf:stated-predicate triple.
>>> 
>>> 
>>> peter
>>> 
>>> 
>>> 
>>> for a, b, c, e, and f IRIs.
>>> 
>>> On 12/9/22 03:01, Pierre-Antoine Champin wrote:
>>>> Dear Peter,
>>>> 
>>>> I have been meaning to respond to this email for a while -- sorry for the long delay.
>>>> 
>>>> I don't see much difference between your proposal below and the one in the CG-report. More details below:
>>>> 
>>>> On 07/11/2022 15:59, Peter F. Patel-Schneider wrote:
>>>>> I think the working group should consider basing the definition of RDF-star on RDF reification.  Although the semantics of RDF reification are under-constrained they can be used to provide a meaning for embedded triples.  This has several beneficial effects.  First, RDF reification becomes more useful. Second, RDF systems that do not support RDF-star directly can act as if they do by creating RDF graphs that are the mapping of RDF-star graphs.  Third, RDF-star becomes a simple syntactical extension to RDF.  Fourth, only a little machinery is needed to define RDF-star.  Fifth, variations of embedded triples can be created and made to fit correctly with both RDF-star embedded triples and RDF reification even without any extension to RDF.
>>>>> 
>>>>> 
>>>>> 
>>>>> The basis of this definition of RDF-star is that embedded triples are a shorthand for RDF reification, with some added triples to account for their desired meaning.  These additions can be modified if a different desired meaning of embedded triples is used in RDF-star.  Some of this definition is shared with various existing proposals for defining RDF-star.
>>>>> 
>>>>> Start with the abstract syntax of embedded triples and RDF-star graphs as defined in RDF-star documents.
>>>> So, this proposal still relies on an RDF-star abstract syntax that extends the abstract syntax of RDF, right?
>>>>> 
>>>>> Define a mapping L on RDF literals and IRIs as follows:
>>>>> For an RDF literal t with lexical form l, optional language tag t, and datatype d, L(l) is the RDF literal with datatype xsd:string and lexical form "l"^^<d> or "l"@t^^<d>, as appropriate.
>>>>> For an IRI i, L(i) is the RDF literal with datatype xsd:string and lexical form i.
>>>>> 
>>>>> This mapping only works correctly if RDF IRIs cannot be confused with the mappings of RDF literals.  If this is not correct then use instead the lexical form enclosed in angle brackets.
>>>> this is basically the mapping L defined at https://www.w3.org/2021/12/rdf-star.html#mapping
>>>>> 
>>>>> Given a set of recognized datatypes, the mapping * from RDF-star graphs to RDF
>>>>> graphs is defined as follows:
>>>>> 
>>>>> Pick some embedded triple ( s, p, o ) such that none of s, p, and o are triples, replace all occurrences of the triple by a fresh blank node b, and add the triples
>>>>> 
>>>>>   ( b, rdf:type, rdf:Statement )
>>>>>   ( b, rdf:subject, s )
>>>>>   ( b, rdf:stated-subject, L(s) ) if s is not a blank node
>>>>>   ( b, rdf:predicate, p )
>>>>>   ( b, rdf:stated-predicate, L(p) )
>>>>>   ( b, rdf:object, o ) if o is not a malformed literal
>>>>>   ( b, rdf:stated-object, L(o) )  if o is not a blank node
>>>>> 
>>>>> Finish when there are no embedded triples left.
>>>> and this is very similar to the unstar mapping defined at https://www.w3.org/2021/12/rdf-star.html#mapping ,
>>>> with two main differences
>>>> 
>>>> - your proposal reuses the reification vocabulary, while the CG proposal defines a brand new vocabulary for this mapping. I don't think that this is a significant change -- although I get your point about making RDF reification more useful.
>>>> 
>>>> - your proposal does not "escape" the "reification vocabulary" in the original graph -- which the CG proposal does. Did you omit it on purpose? This could have strange cons
>>>> 
>>>> (aside remark: the "if o is not a malformed literal" was found to make the semantics non-monotonic, so we should probably not keep it: https://github.com/w3c/rdf-star/issues/262 )
>>>> 
>>>>> 
>>>>> An RDF-star graph G1 entails an RDF-star graph G2 in RDF-star iff G1* entails G2* in RDF.
>>>> this is also how it is defined in the CG report (modulo the differences between the mappings): https://www.w3.org/2021/12/rdf-star.html#entailment-of-rdf-star-graphs
>>>> 
>>>>   best
>>>> 
>>>>> 
>>>>> 
>>>>> Yours sincerely,
>>>>> 
>>>>> Peter F. Patel-Schneider
>>>>> 
>>>>> 
>>>>> 
>>>>>
Received on Thursday, 15 December 2022 15:16:14 UTC