Re: basing RDF-star on RDF reification from Peter F. Patel-Schneider on 2022-12-16 (public-rdf-star-wg@w3.org from December 2022)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Fri, 16 Dec 2022 12:17:02 -0500
To: Thomas Lörtsch <tl@rat.io>
Cc: Pierre-Antoine Champin <pierre-antoine@w3.org>, public-rdf-star-wg@w3.org
Message-ID: <43c95b77-4d85-8ea2-2718-3ac2517412ef@gmail.com>
The RDF semantics itself does not say anything about whether an instance of 
rdf:Statement is an occurrence or indeed anything else.  There is informative 
wording about the intent of instances of rdf:Statement, but I don't see any 
reason to prevent carving out some instances of rdf:Statement and making them 
act more like abstract triples.


peter


On 12/16/22 09:52, Thomas Lörtsch wrote:
>
>> On 15. Dec 2022, at 18:00, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>
>> As far as I can tell, neither RDF standard reification nor RDF-star embedded triples are tied to either types or occurrences and talking about types vs occurrences is not helpful.
> I could have sworn that the RDF Semantics talks about occurrences, but it does indeed not. However, it refers to a "particular instance or token of a triple". Well, that’s what I meant anyway…
>
>
> It is indeed a good question if it is helpful to differentiate between types and tokens/occurrences. The RDF Semantics does, but it argues with a specific use case "where properties such as dates of composition or provenance information are applied to the reified triple". That is a constraint that has always bugged me as being too tight, but in the perspective of LPG compatibility it is especially problematic.
>
> I understand the semantics of RDF standard reification as an attempt to keep statements and statement annotations separate, treating statement annotations as merely descriptions of some kind of speech acts (describing the event of stating a certain statement), and avoiding non-monotonic expressions, paradoxes and endless recursions. To me this was never quite satisfactory and proper treatment of LPGs seems to demand a less restrained approach.
>
> In LPG land annotations often represent secondary detail, not just metadata like provenance as prevalent in RDF land. I would like to get rid of the distinction between "metadata" annotations like provenance and "qualifying" annotations like further detail or context altogether. What is interpreted as "meta" and what as "context" depends on the use case. Use cases change, the data should not. So in my book everything that doesn’t outright negate a statement adds further detail. I consider qualification not as a problem, but a feature. I do however have the feeling that some people consider it a danger to monotonicity.
>
>
>> What you do get in RDF-star is a kind of uniqueness, namely that the basic idea in RDF-star is that there is only one embedded triple with the syntactically same subject, predicate, and object.   But this is not really supported in the RDF-star semantics, which is based on a mapping to regular RDF.  This mapping does not produce uniqueness or even identity of embedded triples.
>> A semantics that supports the uniqueness of embedded triples would have to extend the RDF semantics in a significant manner.
>> An analogy would be if there was a programming language that had some sort of set construct and stated that two sets with the same members were identical but implemented sets as lists, patching up equality so that within one module two lists representing the same set acted as if they were identical but lists representing the same set would not act as identical if they came from different modules.
>> Standard reification in RDF itself does not provide uniqueness of reified triples at all,
>> of course.  One could extend the RDF semantics to require uniqueness of various kinds.
> Okay, this whole section leaves me with a lot of question marks, but as Pierre-Antoine has already replied to it I will wait for your response to him.
>
>> Once you have any form of uniqueness for statements then you generally need some sort of way to talk about statings of statements.  I'm not keen on using :occurrenceOf for this purpose but I suppose there is no real harm in using this name for the relationship.
> I really wonder where I got the term occurrence from. My Topic Maps background? But that is a long time ago. The advantage of the term "occurrence" is that it also provides the verb "occur". That makes it easier to talk about it. "Token" provides no verb at all - "something is tokenized" sounds rather gruesome - and "instantiating" and "is instantiated" are much longer than "occurring" and "occurs". "Token" would otherwise be my favorite, because "instance" and "instantiation" are also used in relation with "subproperty", "subclass", "subclassing" etc, but that IMO is a topic that should be handled separately. So, going with "occurrence" and "occurs" has some benefits.
>
>
>> I think you have an error in the expansion in your message.  My proposal for embedded triples would treat
>>
>> _:x1 :occurrenceOf <<a b c>>;
>>       e f .
>> _:x2 :occurrenceOf <<a b c>>;
>>       g h .
>>
>> as syntactic shorthand for
>>
>> _:x3 rdf:subject a .
>> _:x3 rdf:stated-subject "a"^^xsd:string .
>> _:x3 rdf:predicate b .
>> _:x3 rdf:stated-predicate "b"^^xsd:string .
>> _:x3 rdf:object c .
>> _:x3 rdf:stated-object "c"^^xsd:string .
>> _:x1 :occurrenceOf _:x3 .
>> _:x1 e f .
>> _:x2 :occurrenceOf _:x3 .
>> _:x2 g h .
>>
>> I don't see how this is significantly different from the situation in RDF-star.
> You are right!
>
>
> Thomas
>
>
>
>> peter
>>
>>
>>
>> On 12/15/22 10:15, Thomas Lörtsch wrote:
>>> Peter,
>>>
>>>
>>> I’d like to discuss a problem. While writing this mail I seem to have convinced myself that it isn’t really a problem, but I still think it’s worth discussing.
>>>
>>> RDF-star is defined on types whereas RDF standard reification specifically refers to occurrences. That difference is important and must somehow be captured. Otherwise we lose the meaning of the link between type and occurrence. The CG report has the following example:
>>>
>>> _:a :occurrenceOf << :s :p :o >> ;
>>>      :in <file1.ttl> ;
>>>      dct:creator :alice.
>>> _:b :occurrenceOf << :s :p :o >> ;
>>>      :in <file2.ttl> ;
>>>      dct:creator :bob.
>>>
>>> In a recent thread on this list [0] different kinds of occurrences - :occursAs, :mention, :claim etc  - have been discussed to identify an occurrence of a type.
>>>
>>>
>>> It is a tricky topic and it gets dangerously close to the set-based semantics of RDF, but it can’t be ignored if we want to avoid a new sort of complications: branching constructs in modelling and querying to account for triples with only one or with multiple (and multi-part) annotations.
>>>
>>>
>>> I was fearing that the mapping you propose needs to be extended to differentiate between types and occurrences. Re-using, but also modifying your example from 10. Dec 2022, at 20:02 (see also in the quoted section below):
>>>
>>> _:x1 :occurrenceOf <<a b c>>;
>>>       e f .
>>> _:x2 :occurrenceOf <<a b c>>;
>>>       g h .
>>>
>>> would map to:
>>>
>>> _:x2 rdf:subject a .
>>> _:x2 rdf:stated-subject "a"^^xsd:string .
>>> _:x2 rdf:predicate b .
>>> _:x2 rdf:stated-predicate "b"^^xsd:string .
>>> _:x2 rdf:object c .
>>> _:x2 rdf:stated-object "c"^^xsd:string .
>>> _:x2 e f .
>>>
>>> _:x3 rdf:subject a .
>>> _:x3 rdf:stated-subject "a"^^xsd:string .
>>> _:x3 rdf:predicate b .
>>> _:x3 rdf:stated-predicate "b"^^xsd:string .
>>> _:x3 rdf:object c .
>>> _:x3 rdf:stated-object "c"^^xsd:string .
>>> _:x3 g h .
>>>
>>> to honour the distinction between the two occurrences.
>>>
>>> Now of course the next problem is how to map annotations on the type to RDF:
>>>
>>> <<a b c>> e f .
>>> <<a b c>> g h .
>>>
>>> This is the default in RDF-star, but isn't covered by RDF standard reification.
>>>
>>> One could argue that types occur just as well as occurrences and that it is just a question of perspective if one interprets the commonality - "one more triple of type << a b c >>" - or the distinction - "this << a b c >> comes from a trustworthy source".
>>>
>>> Under that assumption your mapping works:
>>>
>>> _:x1 rdf:subject a .
>>> _:x1 rdf:stated-subject "a"^^xsd:string .
>>> _:x1 rdf:predicate b .
>>> _:x1 rdf:stated-predicate "b"^^xsd:string .
>>> _:x1 rdf:object c .
>>> _:x1 rdf:stated-object "c"^^xsd:string .
>>> _:x1 e f .
>>> _:x1 g h .
>>>
>>> That seems good enough.
>>>
>>> Maybe we need another term to differentiate occurrences of the type from occurrences with a stronger identity of their own?
>>>
>>> The other problems that the type-focused approach of RDF-star generates remain, but they are hardly the fault of this mapping.
>>>
>>>
>>> Best,
>>> Thomas
>>>
>>>
>>> [0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2022Dec/0013.html
>>>
>>>
>>>
>>>> On 13. Dec 2022, at 16:45, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>>>
>>>> As far as the Community Group proposal for RDF-star and SPARQL-star goes, there is no need to ever use the mapping unless you are determining RDF-star entailment.   Even then, the mapping just specifies what the results are so any method of getting to these results is fine.  SPARQL-star uses a different mechanism entirely so the mapping is not needed in SPARQL.
>>>>
>>>> As far as my proposal goes, when determining entailment all that counts is the final result so there is again no need to use the mapping for determining entailment.  Of course, it is possible to use the mapping, as in my proposal embedded triples are purely syntactic sugar.  This allows current software to correctly process embedded triples without any change to the software.
>>>>
>>>> In my proposal there is no need to extend SPARQL to handle embedded triples - all that is needed is to use the mapping to expand embedded triples and then use SPARQL as it currently exists.  (It could be useful to invert the mapping on the results of CONSTRUCT queries.)  Of course there is no need to implement embedded triples in SPARQL this way - any way that comes up with the same results is allowable - so in the end there is no need to use the mapping in my proposal either.
>>>>
>>>>
>>>> The Community Group proposal has a particular form of referential transparency/opacity - blank nodes are transparent and everything else is opaque (even "01"^^xsd:int is different from "1"^^xsd:int).  There is no way to change this.
>>>>
>>>> In my proposal embedded triples have the same referential transparency/opacity but other stances are possible by constructing the appropriate reification and associated triples.
>>>>
>>>>
>>>> peter
>>>>
>>>>
>>>>
>>>> On 12/13/22 08:34, Thomas Lörtsch wrote:
>>>>> I sympathise with Peter’s arguments pro this approach, especially that the mapping he proposes ensures backwards compatibility to codebases and datasets that don’t support RDF-star.
>>>>>
>>>>> However I don’t understand which mapping is to be employed when. It certainly makes a difference if I annotate a referentially opaque occurrence or a referentially transparent one. Would the annotation refer to the referentially transparent occurrence, while the rdf:stated-… statements would just serve to document the syntactic representation (and could be omitted if the use case didn’t ask for such precision)?
>>>>>
>>>>>
>>>>> Thomas
>>>>>
>>>>>> On 10. Dec 2022, at 20:02, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>>>>>
>>>>>> There are two differences, and they make a difference.
>>>>>>
>>>>>> First, and much less important, is the use of the RDF reification vocabulary, which reduces the number of new IRIs needed and connects embedded triples to standard RDF reification.
>>>>>>
>>>>>> Second, and much more important, is that there is no escaping.   This means that embedded triples are strictly syntactic sugar, i.e., there is no semantic difference between:
>>>>>>
>>>>>> <<a b c>> e f .
>>>>>> <<a b c>> g h .
>>>>>>
>>>>>> and
>>>>>>
>>>>>> _:x1 rdf:subject a .
>>>>>> _:x1 rdf:stated-subject "a"^^xsd:string .
>>>>>> _:x1 rdf:predicate b .
>>>>>> _:x1 rdf:stated-predicate "b"^^xsd:string .
>>>>>> _:x1 rdf:object c .
>>>>>> _:x1 rdf:stated-object "c"^^xsd:string .
>>>>>> _:x1 e f .
>>>>>> _:x1 g h .
>>>>>>
>>>>>> This makes it possible to infer embedded triples without using the embedded triple syntax and also makes it possible to have variations of opacity, e.g., creating entities that are just like embedded triples but whose predicates are transparent by not incorporating the rdf:stated-predicate triple.
>>>>>>
>>>>>>
>>>>>> peter
>>>>>>
>>>>>>
>>>>>>
>>>>>> for a, b, c, e, and f IRIs.
>>>>>>
>>>>>> On 12/9/22 03:01, Pierre-Antoine Champin wrote:
>>>>>>> Dear Peter,
>>>>>>>
>>>>>>> I have been meaning to respond to this email for a while -- sorry for the long delay.
>>>>>>>
>>>>>>> I don't see much difference between your proposal below and the one in the CG-report. More details below:
>>>>>>>
>>>>>>> On 07/11/2022 15:59, Peter F. Patel-Schneider wrote:
>>>>>>>> I think the working group should consider basing the definition of RDF-star on RDF reification.  Although the semantics of RDF reification are under-constrained they can be used to provide a meaning for embedded triples.  This has several beneficial effects.  First, RDF reification becomes more useful. Second, RDF systems that do not support RDF-star directly can act as if they do by creating RDF graphs that are the mapping of RDF-star graphs.  Third, RDF-star becomes a simple syntactical extension to RDF.  Fourth, only a little machinery is needed to define RDF-star.  Fifth, variations of embedded triples can be created and made to fit correctly with both RDF-star embedded triples and RDF reification even without any extension to RDF.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The basis of this definition of RDF-star is that embedded triples are a shorthand for RDF reification, with some added triples to account for their desired meaning.  These additions can be modified if a different desired meaning of embedded triples is used in RDF-star.  Some of this definition is shared with various existing proposals for defining RDF-star.
>>>>>>>>
>>>>>>>> Start with the abstract syntax of embedded triples and RDF-star graphs as defined in RDF-star documents.
>>>>>>> So, this proposal still relies on an RDF-star abstract syntax that extends the abstract syntax of RDF, right?
>>>>>>>> Define a mapping L on RDF literals and IRIs as follows:
>>>>>>>> For an RDF literal x with lexical form l, optional language tag t, and datatype d, L(x) is the RDF literal with datatype xsd:string and lexical form "l"^^<d> or "l"@t^^<d>, as appropriate.
>>>>>>>> For an IRI i, L(i) is the RDF literal with datatype xsd:string and lexical form i.
>>>>>>>>
>>>>>>>> This mapping only works correctly if RDF IRIs cannot be confused with the mappings of RDF literals.  If this is not correct then use instead the lexical form enclosed in angle brackets.
>>>>>>> this is basically the mapping L defined at https://www.w3.org/2021/12/rdf-star.html#mapping
>>>>>>>> Given a set of recognized datatypes, the mapping * from RDF-star graphs to RDF
>>>>>>>> graphs is defined as follows:
>>>>>>>>
>>>>>>>> Pick some embedded triple ( s, p, o ) such that none of s, p, and o are triples, replace all occurrences of the triple by a fresh blank node b, and add the triples
>>>>>>>>
>>>>>>>>    ( b, rdf:type, rdf:Statement )
>>>>>>>>    ( b, rdf:subject, s )
>>>>>>>>    ( b, rdf:stated-subject, L(s) ) if s is not a blank node
>>>>>>>>    ( b, rdf:predicate, p )
>>>>>>>>    ( b, rdf:stated-predicate, L(p) )
>>>>>>>>    ( b, rdf:object, o ) if o is not a malformed literal
>>>>>>>>    ( b, rdf:stated-object, L(o) )  if o is not a blank node
>>>>>>>>
>>>>>>>> Finish when there are no embedded triples left.
>>>>>>> and this is very similar to the unstar mapping defined at https://www.w3.org/2021/12/rdf-star.html#mapping ,
>>>>>>> with two main differences
>>>>>>>
>>>>>>> - your proposal reuses the reification vocabulary, while the CG proposal defines a brand new vocabulary for this mapping. I don't think that this is a significant change -- although I get your point about making RDF reification more useful.
>>>>>>>
>>>>>>> - your proposal does not "escape" the "reification vocabulary" in the original graph -- which the CG proposal does. Did you omit it on purpose? This could have strange cons
>>>>>>>
>>>>>>> (aside remark: the "if o is not a malformed literal" was found to make the semantics non-monotonic, so we should probably not keep it: https://github.com/w3c/rdf-star/issues/262 )
>>>>>>>
>>>>>>>> An RDF-star graph G1 entails an RDF-star graph G2 in RDF-star iff G1* entails G2* in RDF.
>>>>>>> this is also how it is defined in the CG report (modulo the differences between the mappings): https://www.w3.org/2021/12/rdf-star.html#entailment-of-rdf-star-graphs
>>>>>>>
>>>>>>>    best
>>>>>>>
>>>>>>>> Yours sincerely,
>>>>>>>>
>>>>>>>> Peter F. Patel-Schneider
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
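
The mapping * described in the quoted proposal (replace each embedded triple by one fresh blank node and add the reification and rdf:stated-… triples) can be sketched roughly as follows. This is only an illustration in Python, not code from any of the messages. It assumes that terms are plain strings, that blank nodes are strings starting with "_:", that embedded triples are nested 3-tuples, and that the generated labels _:r0, _:r1, … do not clash with labels already in the graph. The names unstar, flatten and L are hypothetical helpers, and the "malformed literal" check is left out.

```python
from itertools import count

def is_blank(term):
    # Assumption: blank nodes are written as strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")

def L(term):
    """Map an IRI or literal (given as text) to an xsd:string literal."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^xsd:string'

def unstar(graph):
    """Expand embedded triples (nested 3-tuples) into reification triples."""
    out = set()
    nodes = {}       # flattened embedded triple -> the blank node standing for it
    fresh = count()  # source of fresh blank node labels (assumed not to clash)

    def flatten(term):
        if not isinstance(term, tuple):
            return term
        # Innermost embedded triples are replaced first.
        s, p, o = (flatten(t) for t in term)
        key = (s, p, o)
        if key in nodes:
            # Every occurrence of the same embedded triple shares one node.
            return nodes[key]
        b = f"_:r{next(fresh)}"
        nodes[key] = b
        out.add((b, "rdf:type", "rdf:Statement"))
        out.add((b, "rdf:subject", s))
        out.add((b, "rdf:predicate", p))
        out.add((b, "rdf:stated-predicate", L(p)))
        out.add((b, "rdf:object", o))
        if not is_blank(s):
            out.add((b, "rdf:stated-subject", L(s)))
        if not is_blank(o):
            out.add((b, "rdf:stated-object", L(o)))
        return b

    for s, p, o in graph:
        out.add((flatten(s), p, flatten(o)))
    return out

# <<a b c>> e f .  <<a b c>> g h .
annotated = unstar({
    (("a", "b", "c"), "e", "f"),
    (("a", "b", "c"), "g", "h"),
})

# _:x1 :occurrenceOf <<a b c>> ; e f .  _:x2 :occurrenceOf <<a b c>> ; g h .
occurrences = unstar({
    ("_:x1", ":occurrenceOf", ("a", "b", "c")),
    ("_:x1", "e", "f"),
    ("_:x2", ":occurrenceOf", ("a", "b", "c")),
    ("_:x2", "g", "h"),
})
```

In both example graphs the embedded triple <<a b c>> ends up as a single blank node, so the two annotations (or the two :occurrenceOf links) point at the same reified statement, as in the expansions quoted above.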
Received on Friday, 16 December 2022 17:17:16 UTC