Re: [External] : Re: RDF is a framework, not a vocabulary from Gregg Kellogg on 2024-07-12 (public-rdf-star-wg@w3.org from July 2024)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Fri, 12 Jul 2024 13:46:14 -0700
To: Kurt Cagle <kurt.cagle@gmail.com>
Cc: Souripriya Das <souripriya.das@oracle.com>, Thomas Lörtsch <tl@rat.io>, RDF-star WG <public-rdf-star-wg@w3.org>
Message-Id: <69797F10-0A01-4E7F-A89C-6BB26131C674@greggkellogg.net>
> On Jul 12, 2024, at 1:22 PM, Kurt Cagle <kurt.cagle@gmail.com> wrote:
> 
> I'm going to continue the discussion I started last Thursday.
> 
> A reification is the definition of an anonymous node of the form:
> 
> (1) [ rdf:subject :s ; rdf:predicate :p ;  rdf:object :o ; rdf:graph :g ]
> 
> Which can be expanded as:
> 
> (2) _:b1 rdf:subject  :s ;
>          rdf:predicate :p ;
>          rdf:object :o ;
>          rdf:graph :g ;
>          .
> 
> For ease of typing, I would recommend that rdf:subject, etc., be shortened to rdf:s, rdf:p, etc.
> 
> (3) _:b1 rdf:s  :s ;
>          rdf:p :p ;
>          rdf:o :o ;
>          rdf:g :g ;
>          .
> 
> My argument for named node expressions is that this type of Turtle expression (using bracket notation, as in (1)) represents a type of node called an anonymous node. An anonymous node differs from a blank node in that a blank node is in essence locally defined to a given graph, but is still named relative to that particular graph (the system in essence creates an arbitrary substitute name when the expression is parsed), whereas an anonymous node, as defined, is not named or referenceable in Turtle (though it can be in SPARQL via indirection).

The abstract syntax [1] defines Blank Nodes only. Some (all?) concrete syntaxes have a way to associate an identifier with a blank node, which is locally scoped. So, your notion of an anonymous node would be a blank node that was not given an identifier in the concrete syntax. Certainly `[]` (the ANON terminal in Turtle, or blankNodePropertyList) is one such unidentified blank node, and in RDFa `_:` is also unidentified.

There have been previous proposals for being able to identify a blankNodePropertyList with some identified blank node or IRI. Your proposed syntax shown below is certainly suitable for doing this. It also addresses some problems with the annotation syntax (e.g., `:s :p :o {| :namedNode => :p1, :o1 |}`) where we were using `|` previously and creating a confusion with SPARQL Property Path syntax.

> This assumes that there are no specific semantics associated here - it's a parsing issue with Turtle that you have a node that is in fact not referenceable at all, and I believe that it underlies many of the issues that we are currently facing. 
> 
> In my proposal, I recommend a syntax (covered here) that makes it possible to name an anonymous node, whether with a local identifier (a bnode) or with a global identifier (an IRI). The syntax looks as follows:
> 
> :s :p [ :namedNode => :p1 :o1; :p2 :o2 ]
> 
> This has the effect of creating a new node :namedNode
> :s :p :namedNode .
> :namedNode :p1 :o1  .
> :namedNode :p2 :o2 .
> 
> Why is this important?
> 
> We have a similar notation for reifications:
> 
> :s :p << :s1 :p1 :o1 >>. 
> 
> Here the reification is an anonymous node - it has no formal identifier, whether global or local, and consequently can only be represented positionally within Turtle. You can, under the current recommendations, create an alternate name for that reification:
> 
> :s :p << :r | :s1 :p1 :o1 >>.
> 
> This can also be expressed with a named node expression:
> 
> :s :p  [ :r => rdf:s :s1 ; rdf:p :p1 ; rdf:o :o1] .
> 
> Both are equivalent to 
> 
> :s :p  :r .
> :r rdf:s :s1 ; rdf:p :p1 ; rdf:o :o1 .
> 
> where :r is the reifier name.

Whether << :r | :s1 :p1 :o1 >> and [ :r => rdf:s :s1 ; rdf:p :p1 ; rdf:o :o1] are “equivalent” is debatable, but certainly the latter can be considered as being entailed by the former. The arguments for using the (more) atomic version of a triple term <<( :s :p :o )>> were based on considerations of getting complete results from SPARQL queries, where some parts of the expression might not be included due to the effect of LIMIT, or similar. That’s somewhat weakened by the use of a reifier. We do need to be able to derive a version of a graph which does not use triple terms, which we’ve called “Classic conformance” via an “unstar” mapping. For example, this will be necessary for RDF Dataset Canonicalization.
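
To make the expansion discussed above concrete, here is a minimal Python sketch of how a named node expression, and the reifier form built on it, could be flattened into plain triples. It is only an illustration under stated assumptions: the function names `expand_named_node` and `expand_reifier` are hypothetical, terms are kept as plain strings rather than parsed RDF terms, and `rdf:s`, `rdf:p`, `rdf:o` are the abbreviations proposed in the quoted message, not existing RDF vocabulary. It is not the WG's “unstar” mapping.

```python
from typing import Iterable, List, Tuple

# A triple is modelled as three strings; no real RDF term parsing is done.
Triple = Tuple[str, str, str]


def expand_named_node(
    subject: str,
    predicate: str,
    name: str,
    properties: Iterable[Tuple[str, str]],
) -> List[Triple]:
    """Expand `subject predicate [ name => p1 o1 ; p2 o2 ]` into plain triples.

    `name` may be an IRI (":namedNode") or a blank node label ("_:b1").
    """
    triples: List[Triple] = [(subject, predicate, name)]
    triples.extend((name, p, o) for p, o in properties)
    return triples


def expand_reifier(
    subject: str, predicate: str, reifier: str, inner: Triple
) -> List[Triple]:
    """Expand `subject predicate [ reifier => rdf:s s1 ; rdf:p p1 ; rdf:o o1 ]`.

    The rdf:s / rdf:p / rdf:o names are the proposed abbreviations (assumption).
    """
    s1, p1, o1 = inner
    return expand_named_node(
        subject,
        predicate,
        reifier,
        [("rdf:s", s1), ("rdf:p", p1), ("rdf:o", o1)],
    )


if __name__ == "__main__":
    # Print the expansion of the reifier example, one triple per line.
    for s, p, o in expand_reifier(":s", ":p", ":r", (":s1", ":p1", ":o1")):
        print(f"{s} {p} {o} .")
```

Note that the inner triple (:s1 :p1 :o1) never appears in the output as a triple of its own, which matches the point that the reified triple is described rather than asserted.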

Gregg

> There are many advantages to this notation (as described in my proposal at https://github.com/w3c/rdf-star-wg/wiki/Proposal:-Named-Node-Expressions ).
> 
> It provides a way of identifying a statement that can be either local to the graph or in a global space. 
> It sidesteps the whole semantics of treating a reification as a specialized object: it is, but it's one that can be expressed as an extant RDF structure (graph) without changing RDF itself. It does require a new extension to the Turtle language, however.
> A reification of a triple becomes a simple annotation, and that triple represented by the reification does not have to exist within the associated graph.
> It simplifies the notation, making it possible to create structures that emulate Neo4J specific statements.
> It makes possible local predicates (another feature of Neo4J).
> The same (reified triple) can be referenced by multiple reifiers for different needs, without getting into the whole issue of hypergraph structures (multiple subjects)
> Put another way, the reified triple structure is not itself a triple - it makes no assertions within the graph but rather describes a hypothetical triple.
> Given a reified triple expression in this form, a SPARQL script could trivially create a construct statement that converts the expression into a triple in the graph and vice-versa - turning a triple into an RT Expression.
> It also expands readily to accommodate graph structures:
> Examples and use cases can be seen at https://github.com/w3c/rdf-star-wg/wiki/Proposal:-Named-Node-Expressions. 
> 
> Kurt Cagle
> Editor in Chief
> The Cagle Report
> kurt.cagle@gmail.com
> 443-837-8725
> 
> On Fri, Jul 12, 2024 at 6:49 AM Souripriya Das <souripriya.das@oracle.com> wrote:
>> Hi Thomas,
>> 
>> > The technical argument is: it would be non-monotonic if you could annotate a triple with the remark that it is not asserted.
>> 
>> I'd say that that "not asserted" in the "remark" is only interpreted in the context of the domain or application that the data creator is modeling ... it has nothing to do with RDF's notion of assertion of a triple.
>> 
>> Thanks,
>> Souri.
>> 
>> 
>> From: Thomas Lörtsch <tl@rat.io>
>> Sent: Friday, July 12, 2024 4:27 AM
>> To: Souripriya Das <souripriya.das@oracle.com>
>> Cc: RDF-star WG <public-rdf-star-wg@w3.org>
>> Subject: [External] : Re: RDF is a framework, not a vocabulary
>>  
>> 
>> 
>> > On 12. Jul 2024, at 10:18, Souripriya Das <souripriya.das@oracle.com> wrote:
>> > 
>> > I am just wondering if the recent wave of discussions is taking us beyond the "framework" focus of RDF over to the territory of new vocabularies that can potentially be created on top of the enhanced framework of RDF. 
>> > 
>> > Since the goal of our WG is to determine the essential extensions to the framework in current RDF that will be critical for enabling and simplifying the target capabilities -- statements about statements and support for duplicate triples, concerns about issues that are more pertinent to development of interesting vocabularies on top of RDF1.2 (similar to SKOS on top of RDF1.1) should be avoided, IMHO. 
>> > 
>> > As long as RDF1.2 allows association of a term with a triple (or a block of triples, in case of many-to-many), data creators can designate such a term to belong to custom classes – :Relation, :Reification, :Myth, :Nonsense, etc. –
>> 
>> Those are very different terms. I agree that :Myth, :Nonsense, :Reported, :Endorsed, :etc are concepts that should be treated in ontologies on top of RDF. But whether a statement is part of the graph, i.e. a triple, or is only described but not asserted, i.e. a reification, is an essential aspect that can’t be handled outside the core. 
>> That seems very intuitive to me. The technical argument is: it would be non-monotonic if you could annotate a triple with the remark that it is not asserted.
>> 
>> Best,
>> Thomas
>> 
>> 
>> >  that make sense in their domain. If there is a common set of such classes that are found to be important in many domains, enthusiasts can create vocabularies to capture those. Whether such classification determines if a "named" (put your favorite term here) triple (or block of triples) should be considered as "asserted" or not -- should be up to the vocabulary designers, not our WG.
>> > 
>> > Let us focus on the "framework" improvement part only and leave the vocabulary aspects to data creators and enthusiasts.
>> > 
>> > Hoping for timely and successful completion of RDF1.2 spec,
>> > Souri.
>> 
>>
Received on Friday, 12 July 2024 20:46:31 UTC