Re: Consolidating triple/edges -- occurrence set version

> On 11. Jan 2024, at 13:53, Andy Seaborne <andy@apache.org> wrote:
> 
> 
> 
> On 11/01/2024 11:02, Thomas Lörtsch wrote:
>> Hi Andy,
>>> On 10. Jan 2024, at 23:46, Andy Seaborne <andy@apache.org> wrote:
>>> 
>>> This is a variation of the consolidated triple/edges proposal.
>>> 
>>> An issue with the "named occurrences" variation of the consolidated proposal is that given the name part of a named occurrence, there is no way to find out what the subject/predicate/object of the occurrence is.
>> I don’t get this. If
>>     << :s :p :o >>         # an RDF-star CG-report triple term
>> is a triple term as defined in the CG report, and rdfx:occurrenceOf is a relation between such a term and the name of its occurrence in the data (as a regular RDF triple - referentially transparent, naturally)
>>     :s :p :o .             # an RDF triple
>> then
>>     :X rdfx:occurrenceOf << :s :p :o >> .
>> describes an occurrence of that triple term as an actual triple [0].
>> SPARQL-star allows querying for the properties of :X, namely for the subject, predicate and object of the triple it describes. So why do you say that "there is no way to find out what the subject/predicate/object of the occurrence" is?
> 
> In the variant [1] with named occurrences as RDF terms,
> << n | s p o >> the data can use the name, n,
> 
> << n | s p o >> :q1  :z1 ; :q2  :z2 .
> 
> is the same as
> 
> << n | s p o >> :q1  :z1 .
> n :q2  :z2 .
> 
> There's no occurrenceOf triple. Having a virtual one has been suggested.

Okay, now I understand. I mixed up different stages of the proposal. I just produced a very basic proposal in [3]. The purpose of that proposal is to point out all the basic issues: referential transparency vs opacity, the missing link between triple and annotation, facts vs claims. Of course maybe there’s more.

AFAICT the "named occurrence" is syntactic sugar from an orthogonal perspective: it doesn’t interact with the three problems I describe in [3]. It replaces the triple defining the occurrence name, but at the cost of 
- flexibility (note how many properties I define in [3] to capture the different needs and perspectives)
- implementation (it needs a quad index, new operators in SPARQL)
- proliferation (maybe; the NNG proposal is certainly guilty of that, but with graphs I find it more urgent to avoid repetition in stating something. With triples I’m not so sure.)

Maybe the named occurrence is worth it - I don’t want to rule that out prematurely - but currently I’m rather concerned about the more fundamental issues in [3].


Thomas

> The problem is that << n | s p o >> both declares a term with 4 parts and provides the name. The name on its own refers to the "occurrence".
> 
> Now if we have just this triple - n might be a URI -
> 
>   n :q3 :z3 .
> 
> An application might then want to find that the subject is s.
> 
> [1] does not have a real rdf:occurrenceOf triple. The mixing of declaration and use of name is the problem - there is a projection which loses information. A formal set of edges is another approach; one of my points about that is that an RDF graph is then no longer just a set of triples.
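
To make the lost-information point concrete, here is a minimal sketch in Python. It is only one reading of the argument, assuming a graph is a plain set of 3-tuples and that the 4-part term of variant [1] is a value that can sit in a triple position. NamedOccurrenceTerm and subject_for_name are hypothetical names used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedOccurrenceTerm:
    """Hypothetical 4-part term standing in for << n | s p o >>."""
    n: str
    s: str
    p: str
    o: str


def subject_for_name(graph, name):
    """Return s if some term in the graph declares `name`, else None."""
    for triple in graph:
        for term in triple:
            if isinstance(term, NamedOccurrenceTerm) and term.n == name:
                return term.s
    return None


# Data that contains the declaring term: the subject can be recovered.
declared = {
    (NamedOccurrenceTerm("n", "s", "p", "o"), ":q1", ":z1"),
    ("n", ":q2", ":z2"),
}

# Data that only uses the name: nothing links n back to s, p, o.
name_only = {("n", ":q3", ":z3")}

found = subject_for_name(declared, "n")
missing = subject_for_name(name_only, "n")
```

With only the triple that uses the bare name, the lookup has nothing to work from unless the declaration is supplied by other means.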
> 
> [2] explores putting back some kind of RDF term that is the 3-tuple, which is not a "triple term", to put back a real occurrenceOf relationship.
> 
> SPARQL-star has SUBJECT(...), and to me having such a function seems like a natural expectation. A virtual triple does not address this unless the declaration is available out-of-band.
> 
>    Andy
> 
> [1] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0000.html
> 
> [2] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0050.html

[3] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jan/0062.html


>> Best,
>> Thomas
>> [0] There is a problem with the indirection inherent to this approach: we don’t know if the occurrence describes a triple that actually is asserted in the data. Even if the triple referred to via the triple term actually occurs in the data, it just represents the type. The occurrence described by :X might yet (intend to) refer to an unasserted occurrence. The Nested Named Graph approach avoids that problem.
>>> There have been suggestions of having a virtual property rdf:occurrenceOf (it can not appear in data) or adding an edge set into the RDF data model (a graph is no longer just a set of triples).
>>> 
>>> In this variation, the RDF abstract data model has "occurrences sets" as RDF terms.
>>> 
>>> An "occurrence set" for S,P,O is the set of all named occurrences that have S,P,O in those positions. There is one occurrence set for every triple.
>>> 
>>> A named occurrence is a pair of (name, occurrence) where "occurrence" is a member of an occurrence set.
>>> 
>>> Occurrence sets replace triple terms.
>>> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0033.html
>>> 
>>> Named occurrences are not part of the RDF data model (abstract syntax).
>>> 
>>> The syntax <<[ ]>> is used below for now to be different from the triple term syntax <<( )>>. Had that not been used already, it would be better as <<( )>> because () is often used for tuples.
>>> 
>>> An implementation that wishes to have great named occurrence performance can have a data structure for (n,s,p,o) with indexed lookup operations.
>>> 
>>> ## Turtle and N-Triples.
>>> 
>>>    << _:n | s p o >> :q :z .
>>> 
>>> is a syntax form and is equivalent to the N-Triples:
>>> 
>>>    _:n rdf:occurrenceOf <<[ :s :p :o ]>> .
>>>    _:n :q :z .
>>> 
>>> "memberOf" or variants on "member" don't look good because rdfs:member already exists.
>>> 
>>> Now given a name "n" (blank node or URI) found by some means, then
>>> 
>>>    _:n rdf:occurrenceOf ?X .
>>> 
>>> finds the occurrence set term, which has the subject/predicate and object.
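
A minimal sketch of this expansion and lookup, in Python rather than SPARQL, assuming a graph is just a set of 3-tuples and an occurrence set is an ordinary term value. OccurrenceSet, expand, occurrence_set_of and index_by_name are hypothetical names for illustration; none of them comes from the proposal or from any library.

```python
from dataclasses import dataclass

OCCURRENCE_OF = "rdf:occurrenceOf"


@dataclass(frozen=True)
class OccurrenceSet:
    """Hypothetical stand-in for the term written <<[ s p o ]>>."""
    s: str
    p: str
    o: str


def expand(name, s, p, o, q, z):
    """Expand  << name | s p o >> q z .  into two plain triples."""
    return {
        (name, OCCURRENCE_OF, OccurrenceSet(s, p, o)),
        (name, q, z),
    }


def occurrence_set_of(graph, name):
    """Answer  name rdf:occurrenceOf ?X  by scanning the triple set."""
    for subj, pred, obj in graph:
        if subj == name and pred == OCCURRENCE_OF:
            return obj
    return None


def index_by_name(graph):
    """Optional name -> occurrence set index for faster lookup."""
    return {subj: obj for subj, pred, obj in graph if pred == OCCURRENCE_OF}


graph = expand("_:n", ":s", ":p", ":o", ":q", ":z")
x = occurrence_set_of(graph, "_:n")
index = index_by_name(graph)
```

Because the occurrenceOf triple is real data here, the name alone is enough to reach the subject, predicate and object, and the graph stays a plain set of triples. The dictionary is one simple way an implementation might provide the indexed (n,s,p,o) lookup.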
>>> 
>>> Annotation syntax applies as before.
>>> 
>>> An RDF graph is a set of triples.
>>> There is no need for a virtual property.
>>> 
>>> It may work to have a class of occurrences for S/P/O where the occurrence set is the class extension.
>>> 
>>>    Andy
>>> 
>>> 
> 

Received on Thursday, 11 January 2024 13:49:39 UTC