Re: summary un/asserted from Franconi Enrico on 2024-07-10 (public-rdf-star-wg@w3.org from July 2024)

From: Franconi Enrico <franconi@inf.unibz.it>
Date: Wed, 10 Jul 2024 13:57:56 +0000
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
CC: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <5631E074-7069-46B0-BFED-3A9EA6CED975@inf.unibz.it>
OK, I get it - your proposal starts at the level of the abstract data model in the language of sets and triples - not syntax.
And the syntax I provide is adequate for your notion of RDF graphs.
—e.

> On 10 Jul 2024, at 15:50, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> No.  Syntax is syntax.  A proposal for a change to RDF graphs has to instead talk in the language of sets and triples - not syntax.   As an example of the difference, in your abstract syntax a graph is a sequence of triples and there is no notion there that a graph is instead a set of triples.
> 
> This is similar to the difference between the syntax of JSON and the majority meaning of JSON - the syntax doesn't say anything about repeated names in objects but the majority meaning is that an object is a finite map and keeps only the last object member with a given name.
> 
> The syntax you provide looks to be adequate for this notion of RDF graphs (aside from the unused atomicTerm production) but could be used for lots of different structures.
> 
> peter
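The distinction drawn in this message can be illustrated with a small Python sketch. It is only an illustration: the tuple encoding of triples and the names `as_sequence`, `as_graph` and `reordered` are assumptions made here, not part of either proposal. The JSON part relies on Python's standard `json` module, which keeps the last value for a repeated name.

```python
import json

# Hypothetical encoding: a triple is a (subject, predicate, object) tuple.
as_sequence = [(":a", ":b", ":c"), (":a", ":b", ":c"), (":d", ":e", ":f")]

# Read as syntax, the input is a sequence: order and repetition are visible.
# Read as an RDF graph, it is a set: repetition and order disappear.
as_graph = frozenset(as_sequence)

# A different sequence can denote the same graph.
reordered = [(":d", ":e", ":f"), (":a", ":b", ":c")]
same_graph = frozenset(reordered) == as_graph

# JSON analogy: the syntax permits a repeated name, but this parser keeps
# only the last member with that name.
parsed = json.loads('{"name": 1, "name": 2}')
```

The grammar alone does not say which of the two readings is intended; that choice belongs to the definition of the data model.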
> 
> 
> On 7/10/24 09:29, Franconi Enrico wrote:
>> Peter,
>> sure, but would you agree that your idea would be captured exactly by the ABSTRACT syntax for RDF graph I wrote?
>> —e.
>>> On 10 Jul 2024, at 14:02, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>> 
>>> The proposal is not about changing the syntax of RDF but instead about fundamentally changing the nature of RDF graphs (which will, in turn, require a new syntax, but that's not the important part).
>>> 
>>> Here is a quick stab at the required definition, done for generalized RDF-star graphs as that is the simplest version to do.
>>> 
>>> The set of generalized RDF-star triples is the smallest set of triples of the form subject, predicate, object where a subject, predicate, or object is an IRI, a blank node, or a literal, optionally plus a generalized RDF-star triple.
>>> 
>>> The optional triple might have to instead be a set of triples.
>>> 
>>> A generalized RDF-star graph is a set of generalized RDF-star triples.
>>> 
>>> 
>>> peter
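One possible reading of the definition above, sketched as Python data types. The class and field names (`Term`, `Triple`, `kind`, `value`, `triple`) and the helper `iri` are hypothetical and chosen only for this illustration; the sketch covers the "optional single triple" variant, not the "set of triples" variant mentioned as a possible alternative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    kind: str                           # "iri", "bnode" or "literal"
    value: str
    triple: Optional["Triple"] = None   # the optional attached triple


@dataclass(frozen=True)
class Triple:
    subject: Term
    predicate: Term
    object: Term


def iri(value, triple=None):
    """Hypothetical helper: build an IRI term, optionally with a triple."""
    return Term("iri", value, triple)


abc = Triple(iri(":a"), iri(":b"), iri(":c"))

# The same IRI with and without an attached triple gives two distinct terms.
plain = Triple(iri(":x"), iri(":d"), iri(":e"))
attached = Triple(iri(":x", abc), iri(":d"), iri(":e"))

# A graph is a set of triples, so adding a triple twice changes nothing.
graph = frozenset({plain, attached, attached})
```

Frozen dataclasses are used so that terms and triples are hashable and compare by value, which is what a set-based definition needs.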
>>> 
>>> 
>>> 
>>> 
>>> On 7/10/24 04:21, Franconi Enrico wrote:
>>>> If I understand you well, you propose that RDF has the following syntax:
>>>>     graph ::= triple*
>>>>     triple ::= subject predicate object
>>>>     subject ::= NoLiteralTerm
>>>>     predicate ::= iri
>>>>     object ::= term
>>>>     NoLiteralAtomicTerm ::= iri | BlankNode
>>>>     atomicTerm ::= NoLiteralAtomicTerm | literal
>>>>     NoLiteralTerm ::= NoLiteralAtomicTerm | tripleTerm
>>>>     term ::= NoLiteralTerm | literal
>>>>     tripleTerm ::= NoLiteralAtomicTerm triple
>>>> Am I correct?
>>>> —e.
>>>>> On 9 Jul 2024, at 23:09, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>>>> 
>>>>> The point of the proposal is to require that (some) nodes in RDF graphs are of the form IRI x triple or BNode x triple.
>>>>> 
>>>>> Yes, Turtle should be as compact as possible but it is not the thing that most users should see when they view RDF graphs.
>>>>> 
>>>>> peter
>>>>> 
>>>>> 
>>>>> On 7/9/24 15:12, Niklas Lindström wrote:
>>>>>> Hi Peter,
>>>>>> I agree with your initial reply to Thomas. And I agree that your
>>>>>> (strawman) proposal here probably won't hold up.
>>>>>> This form looks like named triples (RDFn). I don't think it would
>>>>>> work, unless RDF graphs are redefined to be `(triple* | (name,
>>>>>> triple))*`. It also imposes some troubling limitations, such as the
>>>>>> impossibility of referring to the relationship between the "name" and
>>>>>> the triple (not only in other triple terms, which may be an edge case;
>>>>>> but, crucially, in vocabulary design; which is needed, as I show in
>>>>>> [4]). And it may lead to the named graphs problem all over again --
>>>>>> what do the names mean in relation to their triple(s)? And indeed,
>>>>>> naming multiple triples like that appears very problematic. (Problems
>>>>>> which the explicit reification of multiple triples by linking them
>>>>>> does not suffer from.)
>>>>>> I suspect that some ongoing confusion is a residual effect of the
>>>>>> original proposal to add triples as subjects. Adding triples as
>>>>>> subjects was *not* reification "done right". It was, IMO, reification
>>>>>> done more wrong. Triples as subjects didn't work at all for real world
>>>>>> LPG uses of many-to-one. With some hyperbole, it was akin to using
>>>>>> literals as subjects naively, with `"20" :currency :USD` to solve the
>>>>>> problem of values with units (structured values), but "with some
>>>>>> limitations" (saying that the integer 20 is in US-dollar currency in
>>>>>> the entire model). But to be more fair, the RDF-star error was far
>>>>>> more subtle.
>>>>>> We've finally all but expunged this error. Now, triples as *objects*
>>>>>> (triple terms) of an appropriate relation on the other hand, have
>>>>>> shown promise of some really powerful benefits.
>>>>>> There is some residue left though, one being some insistence on
>>>>>> allowing it even in non-generalized abstract syntax. But another
>>>>>> problem is sticking to this syntax:
>>>>>>     << <Alice> :bought <SomeComputer> >> :date "2014" .
>>>>>> Which is now a shorthand for:
>>>>>>     _:r1 rdf:reifies <<( <Alice> :bought <SomeComputer> )>> .
>>>>>>     _:r1 :date "2014" .
>>>>>>     _:r1 :cost 20 .
>>>>>>     _:r1 :currency :USD .
>>>>>> and totally fails to make this:
>>>>>>     _:r1 rdf:reifies <<( <Alice> :shoppedAt <ComputerStore> )>> .
>>>>>>     _:r1 rdf:reifies <<( <Alice> :bought <SomeComputer> )>> .
>>>>>>     _:r1 :date "2014" .
>>>>>>     _:r1 :cost 20 .
>>>>>>     _:r1 :currency :USD .
>>>>>> shorten to anything like Turtle, or even legible at all:
>>>>>>     << _:r1 | <Alice> :bought <SomeComputer> >> :date "2014" .
>>>>>>     << _:r1 | <Alice> :shoppedAt <ComputerStore> >> :cost 20 .
>>>>>>     _:r1 :currency :USD .
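The many-to-one case in the example above can be sketched in Python as follows. The encoding is an assumption made for illustration only: plain triples are 3-tuples of strings, a triple term is a 3-tuple used as the object of `rdf:reifies`, and `describe` is a hypothetical helper.

```python
from collections import defaultdict

REIFIES = "rdf:reifies"

# One reifier, two triple terms, and properties stated once on the reifier.
graph = {
    ("_:r1", REIFIES, ("<Alice>", ":shoppedAt", "<ComputerStore>")),
    ("_:r1", REIFIES, ("<Alice>", ":bought", "<SomeComputer>")),
    ("_:r1", ":date", "2014"),
    ("_:r1", ":cost", 20),
    ("_:r1", ":currency", ":USD"),
}


def describe(graph):
    """Group each reifier's triple terms and its other properties."""
    reified = defaultdict(set)
    properties = defaultdict(dict)
    for s, p, o in graph:
        if p == REIFIES:
            reified[s].add(o)
        else:
            properties[s][p] = o
    return dict(reified), dict(properties)


reified, properties = describe(graph)
```

In this encoding the cost, currency and date belong to `_:r1` itself, so nothing has to be repeated per triple term.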
>>>>>> (In case anyone wants to object to my model design choice here ("use
>>>>>> `_:r1 :seller <ComputerStore>`"!), please read my follow-up to Thomas
>>>>>> [1].)
>>>>>> If we're *serious* about the minimal baseline [2], with `rdf:reifies`
>>>>>> working *equally* well for many-to-one and many-to-many (proper N-ary
>>>>>> relationships, relators, general reification), we need to revisit that
>>>>>> in earnest, as I wrote in [3].
>>>>>> That proposal could shorten the above--if the purchase alluded to is
>>>>>> not also true--along the lines of:
>>>>>>     <Alice> << :bought <SomeComputer> >> ^{_:r1} ;
>>>>>>         << :shoppedAt <ComputerStore> >> ^{_:r1} .
>>>>>>     _:r1 :cost 20 ;
>>>>>>         :currency :USD ;
>>>>>>         :date "2014" .
>>>>>> Which might not be *beautiful* (and could be tinkered with some more),
>>>>>> but is at least more "Turtle" (once you get used to reading the quotes
>>>>>> as being for predicate+object). For the possibly (much) more common
>>>>>> case, remove the quotes to have the regular assertions with
>>>>>> annotations:
>>>>>>     <Alice> :bought <SomeComputer> ^{_:r1} ;
>>>>>>         :shoppedAt <ComputerStore> ^{_:r1} .
>>>>>>     _:r1 :cost 20 ;
>>>>>>         :currency :USD ;
>>>>>>         :date "2014" .
>>>>>> This "extra resource" is *crucial*. And it isn't anything mysterious.
>>>>>> Here, it should be typed: `_:r1 a :Purchase`. In other cases, we have
>>>>>> Marriages, Publications, Pipe connections, or good old Statements,
>>>>>> Snaks, Observations, Utterances, Data Sources or Ingests, or whatever
>>>>>> the nature is of the reifying circumstance of one or more abstract
>>>>>> relationships. Regardless of their type, they relate to these
>>>>>> relationships, uniformly, with `rdf:reifies`. And this is what we
>>>>>> should convey.
>>>>>> I very much value what you wrote regarding "the limited sensory and
>>>>>> cognitive capabilities of humans". Even if my proposed form here is
>>>>>> deemed unsatisfactory, this is the condition for which I think Turtle
>>>>>> should cater. Making wikidata more readable is of great interest to me
>>>>>> too [4]. Again, the detailed polish has to wait until we have a solid,
>>>>>> agreed upon baseline. (There is some interaction though, unless
>>>>>> someone can transmit the pure qualia of the RDF abstract syntax...)
>>>>>> Best regards,
>>>>>> Niklas
>>>>>> [1]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jul/0038.html>
>>>>>> [2]: <https://github.com/w3c/rdf-star-wg/wiki/RDF-star-%22minimal-baseline%22>
>>>>>> [3]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Jul/0011.html>
>>>>>> [4]: <https://github.com/Kungbib/wikidatalab/>
>>>>>> On Tue, Jul 9, 2024 at 5:02 PM Peter F. Patel-Schneider
>>>>>> <pfpschneider@gmail.com> wrote:
>>>>>>> 
>>>>>>> Here is a proposal that I don't think will go anywhere, and I might not
>>>>>>> totally believe in, but does connect to the working group's activities.
>>>>>>> 
>>>>>>> THESIS:  embedded triples are not a good solution to the use cases of the
>>>>>>> working group
>>>>>>> 
>>>>>>> EVIDENCE:
>>>>>>> 
>>>>>>> The use cases of the working group do not use embedded triples directly but
>>>>>>> instead require a separate resource that is connected to a triple.   These
>>>>>>> separate resources are needed because the information about an embedded triple
>>>>>>> from one use of it has to be separated from the information from other uses.
>>>>>>> Otherwise there is a mix-and-match problem, as shown in representing
>>>>>>> provenance where source from one provenance cannot be combined with time or
>>>>>>> access from another.  This problem affects the "seminal example", all kinds of
>>>>>>> provenance, and nearly all uses of embedded triples in the encoding of n-ary
>>>>>>> predicates.  The need for this extra resource and new linking predicate adds to
>>>>>>> the complexity of just about any use of embedded triples in RDF and requires
>>>>>>> extra shorthands in Turtle to partly hide this complexity from users.
>>>>>>> 
>>>>>>> SOLUTION:
>>>>>>> 
>>>>>>> The solution is to do away with the uniqueness of embedded triples and base
>>>>>>> the extension of RDF proposed by the working group instead on non-unique
>>>>>>> occurrences of triples.   If we leave the proposed syntax alone, we get an
>>>>>>> extension of RDF where
>>>>>>>    << :a :b :c >> :d :e ; :f :g .
>>>>>>>    << :a :b :c >> :h :i ; :j :k .
>>>>>>> does *not* entail
>>>>>>>    << :a :b :c >> :d :e ; :h :i .
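The difference between unique embedded triples and non-unique occurrences can be sketched in Python. This is a rough illustration under stated assumptions: occurrences are modelled with made-up integer ids, and the hypothetical check `some_node_has` (does a single node carry all the given predicate-object pairs?) stands in for entailment, which it only approximates.

```python
from itertools import count

ABC = (":a", ":b", ":c")

# Unique embedded triples: the triple itself is the node, so every
# statement about << :a :b :c >> lands on the same node.
unique = {
    (ABC, ":d", ":e"), (ABC, ":f", ":g"),
    (ABC, ":h", ":i"), (ABC, ":j", ":k"),
}

# Non-unique occurrences: each written << ... >> gets a fresh, hypothetical
# occurrence id, so the two source lines describe two different nodes.
_fresh = count(1)


def occurrence(triple):
    return (next(_fresh), triple)


occ1, occ2 = occurrence(ABC), occurrence(ABC)
occurrences = {
    (occ1, ":d", ":e"), (occ1, ":f", ":g"),
    (occ2, ":h", ":i"), (occ2, ":j", ":k"),
}


def some_node_has(graph, *pairs):
    """True if a single subject node carries every (predicate, object) pair."""
    subjects = {s for s, _, _ in graph}
    return any(all((s, p, o) in graph for p, o in pairs) for s in subjects)


wanted = ((":d", ":e"), (":h", ":i"))
```

Under the first model `some_node_has(unique, *wanted)` holds, which is the mix-and-match problem; under the second model `some_node_has(occurrences, *wanted)` does not.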
>>>>>>> 
>>>>>>> There are problems with this version of occurrences of triples.   Without some
>>>>>>> way of referencing a particular occurrence of a triple it is not possible to
>>>>>>> represent the above graphs in N-triples and all information about the
>>>>>>> occurrence has to use a shorthand syntax in Turtle, making what used to be a
>>>>>>> convenience a necessity.   The solution to this problem is to in effect give
>>>>>>> these resources an identifier, so that a particular occurrence of a triple is
>>>>>>> no longer "anonymous" and can be referred to.
>>>>>>> 
>>>>>>> The way to do this is to allow IRIs and blank nodes in RDF to also be a triple
>>>>>>> occurrence, with syntax something like (this syntax probably not good at all
>>>>>>> but you should get the idea)
>>>>>>>    <:x< :a :b :c >> :d :e .
>>>>>>>    <_:x< :a :b :c >> :d :e .
>>>>>>> in both N-triples and Turtle.  This is a variation of a recent syntax proposal
>>>>>>> but is not just syntax and instead is the extension to the RDF data model to
>>>>>>> support quoted triples.
>>>>>>> 
>>>>>>> A big problem (and one reason that I don't totally believe this proposal) is
>>>>>>> using the same IRI or blank node for multiple triple occurrences as in
>>>>>>>    <:x< :a :b :c >> :d :e .
>>>>>>>    <:x< :f :g :h >> :d :e .
>>>>>>> has to be handled by either forbidding it or allowing a node to have multiple
>>>>>>> triple occurrences.
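The two options named above (forbid reuse of a name, or let a node have multiple triple occurrences) can be sketched as a policy switch. The function `add_occurrence`, its `allow_multiple` flag and the dictionary layout are hypothetical and only meant to make the choice concrete.

```python
def add_occurrence(table, name, triple, allow_multiple=False):
    """Record that `name` is an occurrence of `triple`.

    `table` maps a name (IRI or blank node label) to a set of triples.
    With allow_multiple=False, a second, different triple is rejected.
    """
    existing = table.setdefault(name, set())
    if existing and triple not in existing and not allow_multiple:
        raise ValueError(f"{name} already names an occurrence of another triple")
    existing.add(triple)
    return table


# Option 1: forbid using :x for a second triple.
strict = add_occurrence({}, ":x", (":a", ":b", ":c"))
try:
    add_occurrence(strict, ":x", (":f", ":g", ":h"))
    rejected = False
except ValueError:
    rejected = True

# Option 2: allow :x to have multiple triple occurrences.
lenient = add_occurrence({}, ":x", (":a", ":b", ":c"), allow_multiple=True)
add_occurrence(lenient, ":x", (":f", ":g", ":h"), allow_multiple=True)
```

The sketch says nothing about what a name with several triples would mean, which is the open question raised in the thread.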
>>>>>>> 
>>>>>>> peter
>>>>>>> 
>>>>>
Received on Wednesday, 10 July 2024 13:58:04 UTC