Re: A proposal for basing quotation and annotation upon blank graphs from Pierre-Antoine Champin on 2023-10-09 (public-rdf-star-wg@w3.org from October 2023)

From: Pierre-Antoine Champin <pierre-antoine@w3.org>
Date: Mon, 9 Oct 2023 18:32:34 +0200
To: Niklas Lindström <lindstream@gmail.com>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <18d40b80-2d97-4337-89b2-e35d80f5223a@w3.org>
Dear Niklas, all,

After reading Niklas' document, and following the conversations during 
the last semantic TF, I had a few ideas that I would like to submit to 
the group. I believe that these proposals provide a good basis for:

* addressing the issues raised by Niklas in his documents, as well as by 
others in the past, about how RDF-star was defined by the CG
* addressing the question of the relationship between named graphs and 
quoted triples
* addressing TimBL's suggestion to move towards N3, without violating 
our charter :-)


*Proposal 1 : change the meaning of the annotation syntax*

TL;DR: quoted triples are still "unique" (types), but the annotation 
syntax now introduces tokens.

We would introduce a new property (named `rdf:tokenOf` in this email, 
but I would rather find a better name).
The annotation syntax in Turtle :
     S P O {| A B |}.
would now translate to
     S P O.
     << S P O >> rdf:tokenOf [ A B ].

Niklas convinced me here 
<https://gist.github.com/niklasl/c22994e664663b6730613ecc1321c418#conflating-the-type-and-the-token> 
that the different kinds of tokens/occurrences could be captured by 
providing types to the blank node(s), so we can have a single implicit 
predicate linking the quoted triple to its multiple tokens. As such, the 
annotation syntax becomes less error-prone, but the abstract syntax 
remains unchanged.
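
As a rough illustration of the translation above (a sketch under assumptions, not a normative definition): terms are modelled as plain strings and a quoted triple as a nested tuple. The helper name `expand_annotation` and the blank node label `_:b0` are made up for this example, and the direction of `rdf:tokenOf` is the one written in the translation above.

```python
TOKEN_OF = "rdf:tokenOf"  # placeholder property name used in this email


def expand_annotation(s, p, o, annotations, token="_:b0"):
    """Expand `S P O {| A B |}` into plain triples.

    `annotations` is a list of (A, B) pairs; `token` is the label of the
    blank node standing for this particular token of the quoted triple.
    """
    quoted = (s, p, o)  # the quoted triple << S P O >>, modelled as a nested tuple
    triples = [
        (s, p, o),                  # the triple itself is asserted
        (quoted, TOKEN_OF, token),  # << S P O >> rdf:tokenOf [ ... ]
    ]
    # The annotations are attached to the token, not to the quoted triple.
    triples.extend((token, a, b) for a, b in annotations)
    return triples


expanded = expand_annotation(":s", ":p", ":o", [(":a", ":b")])
```

Annotating the same triple twice with different `token` labels would yield two distinct tokens of a single quoted triple, which is the point of the proposal.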


*Proposal 2 : propose a profile for datasets with semantics*

There is a good reason why the previous WG refrained from defining a 
semantics for datasets: they are used in heterogeneous ways in the wild, 
so any specific semantics would make many deployments non-compliant.

However, we can introduce a /profile/ for RDF 1.2 where datasets have a 
formal semantics. Any SPARQL endpoint or TriG file that complies with 
this profile could advertise it (otherwise, all bets are off). 
Content-types based on JSON-LD (e.g. application/vc+ld+json) could 
specify that they comply with this profile.

NB: contrary to the proposals I wrote a few weeks ago 
<https://hackmd.io/_kG54skVQle0KT2S85Tj1Q>, there is no restriction on 
the dataset that can comply with this profile.

See proposal 3 below about how we would define that semantics.
Note however that Proposal 2 could be accepted without accepting Proposal 3.


*Proposal 3: a possible way to define the semantics of datasets*

This proposal has proposals 1 and 2 above as prerequisites.

First, we need to extend the abstract syntax : instead of allowing 
single triples to be used as terms, we now allow graphs (i.e. sets of 
triples) to be used as terms (graph terms). We must assume that the 
semantics of RDF 1.2 graphs extend naturally to this new abstract syntax.

Second, we define the semantics of any dataset in the "semantic dataset" 
profile (from prop 2 above) as follows:
replace every named graph (name, graph) by the triple `name rdf:tokenOf 
graph` (where rdf:tokenOf is the property defined in prop 1 above).
The result is a graph (possibly using graph terms as defined by the new 
abstract syntax), and can be interpreted using the semantics for graphs.

Note that the graph terms introduced in the abstract syntax represent 
"graph types", while named graphs represent "graph tokens".
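
The replacement step above can be sketched as follows. This is only an illustration under assumptions: terms are plain strings, a graph term is modelled as a `frozenset` of triples, and the function name `flatten_dataset` is invented for this example.

```python
TOKEN_OF = "rdf:tokenOf"  # placeholder property name from prop 1


def flatten_dataset(default_graph, named_graphs):
    """Turn a dataset into a single graph that may contain graph terms.

    `default_graph` is a set of triples; `named_graphs` maps each graph
    name to its set of triples. Every named graph (name, graph) is
    replaced by the triple (name, rdf:tokenOf, graph).
    """
    result = set(default_graph)
    for name, graph in named_graphs.items():
        # frozenset makes the graph hashable, so it can be used as a term
        result.add((name, TOKEN_OF, frozenset(graph)))
    return result


default_graph = {(":s", ":p", ":o"), ("_:g", ":a", ":b")}
named_graphs = {"_:g": {(":s", ":p", ":o")}}
flat = flatten_dataset(default_graph, named_graphs)
```

Here the name `_:g` plays the role of a graph token, and the `frozenset` object plays the role of the graph type.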

Notice that the following Turtle
```
:s :p :o {| :a :b |}.
```
is now equivalent to the following TriG (with the semantic-dataset profile):
```
:s :p :o.
_:g :a :b.
_:g { :s :p :o }.
```
But also that some uses of quoted triples can not be "flattened" to 
named graphs, e.g.:
```
:a :b << :s :p :o >>.
```
Conversely, some semantic datasets can not be expressed in Turtle, e.g.
```
:g1 { :s :p :o1, :o2 }
```

Finally, there are graphs in the abstract syntax that neither Turtle nor 
TriG can represent.
Here are two examples of such graphs, expressed in N3:
```
:a :b { :c :d :e, :f }.
```
and, trickier:
```
:a :b { :c :d { :e :f :g } }.
```
I don't quite like that, but then again, we already have a similar 
situation in RDF 1.1, where some consequences of the semantics can not 
be represented in RDF itself (e.g. `1 rdf:type rdfs:Resource`). So I can 
live with that -- at least until RDF 1.3 :->

   pa


On 05/10/2023 14:02, Niklas Lindström wrote:
> I have written a follow-up (titled "Prerequisites and Requirements for
> Quotation in RDF") to explore some fundamental questions raised by
> these proposals:
>
>      https://gist.github.com/niklasl/c22994e664663b6730613ecc1321c418
>
> I look forward to analyzing this further with you.
>
> All the best,
> Niklas
>
> On Thu, Sep 21, 2023 at 5:50 PM Niklas Lindström<lindstream@gmail.com>  wrote:
>> Pierre-Antoine,
>>
>> This is great! Your proposal is indeed less radical, which provides a
>> great balance here. I think we're converging on something.
>>
>> I did intentionally go the conservative route, staying as close to RDF
>> 1.1 as I could. I care deeply for existing RDF usage, including
>> JSON-LD (which is more convenient in handling named (including blank)
>> graphs; something the proposal would bring to TriG). At the same time,
>> I push forward on the semantics for graphs question, because, as I
>> also said during TPAC, even if it's beyond the Charter to *define* it,
>> we must ensure we do not make it *harder* to define it in the future.
>> And my proposal *attempts* to leave that door open without forcing us
>> to go through it...
>>
>> Taking momentum and uptake into consideration, I do think my proposal
>> also provides the leverage that the "PG-crowd" is after (with
>> "occurrences"). Admittedly, it does not cater that much for the
>> "N3-crowd" and their use of graph terms as "types". Herein lies a crux
>> and the key difference (which you pointed out to me yesterday): my
>> proposal claims that named graphs are (token) occurrences of graphs.
>>
>> ## Graphs as Types or Tokens
>>
>> You did also point out that in Pat Hayes' BLogic talk [1] (a seminal
>> talk for many of us, I believe), he specifically talks about the
>> missing type/token distinction (slide 20; look at slides 18-23 for
>> more context)! I too think that illuminates something essential.
>>
>> Because I do think graphs, in RDF documents, are "depictions". In our
>> last Semantics TF telecon, where I tried to defend the "graphs are
>> types" position, Enrico pointed out the open world; and at first I
>> didn't see it (thinking "a belief is a belief"), but unless I
>> misinterpret that point, it's about "some claims from two different
>> bigger sets of claims". (And I think this view is possible to hold
>> *without* requiring opacity; but I digress.) Or, perhaps: two
>> identical descriptions from two sources may have different *senses*;
>> see the Gettier problem [2], which:
>>
>>> illustrate[s] by means of two counterexamples that there are cases where individuals can have a justified, true belief regarding a claim but still fail to know it because the reasons for the belief, while justified, turn out to be false
>> Or: two graphs may carry the identical set of triples but have come
>> about in two different ways. (Or is it that two sets of triples may
>> represent one graph?)
>>
>> The point is: when we "describe graphs", do we talk about the
>> reference ("une pipe") or the picture? With Named Graphs, we talk
>> about the picture! Even more so with blank graphs (*some* picture).
>> But the "picture itself" depicts the value/meaning of the graph (which
>> is why the empty graph is True).
>>
>> ## Graph Terms
>>
>> I do think perhaps one problem in my proposal is that graphs in the
>> object position *look* like types, at least for N3 users; just as the
>> "<< ... >>" form of quoted triples most definitely *looks* like some
>> kind of compound IRI! I try to cautiously approach the possibility of
>> "talking about the type" of compound statements in RDF. I say that
>> fully standing behind the fact that IRIs are references, *not* senses
>> (again, Frege [3]). But *use* of them is.
>>
>> It would be very odd to introduce types for triples but continue to
>> leave the graphs in type/token limbo! *But* I think we're circling
>> around something we can agree on.
>>
>> If I indulge the "type position", starting with the N3 form naming a
>> "graph type":
>>
>>      _:ex1 sem:quotedGraph { <p1> :birthDate "1902" } .
>>
>> I can see how proposing that exact notation in TriG and saying that
>> the object itself here is a blank graph *token* can be a problem. I
>> seem to "co-opt" what is a type in N3 as a blank graph *token* in
>> TriG? Of course, this depends on the relationship as well. ("Type or
>> token enabling properties" ...)
>>
>> Instead of sem:quotedGraph we could imagine using skos:broadMatch, or
>> some ex:broaderInstantive. Or be audaciously bold and (you may want to
>> put on sunglasses when viewing this one) say:
>>
>>      _:ex2 rdf:type { <p1> :birthDate "1902" } .
>>
>> (I need to think hard about that of course, and test it out thoroughly.)
>>
>> In any case, we should probably not define bnode names and IRI names
>> for graphs on opposite sides of the type/token distinction. If graph
>> names denote graph *tokens*, that would be a clear explanation for why
>> names don't denote the graphs "themselves" (as types).
>>
>> (While it is another kind of conflation to equate graph tokens with
>> their serializations, that is a conflation we perhaps do all over RDF
>> with literals. We "get out" of that by adding D-entailment? I thus
>> still think that in TriG (and JSON-LD), graphs are indeed tokens, but
>> that inference may "enter the value space", and thus, possibly, go
>> from token to type.)
>>
>> ## Triple Terms
>>
>> That I excluded the "<< ... >>" form was mainly since it is not needed
>> in my proposal. It introduces a difference without a (yet defined)
>> difference. Building on the above, we could bring that back, perhaps
>> to denote the triple type itself; but not before the above is
>> resolved. The *biggest* problem with it is its "recursive token"
>> notion. (As I wrote in the proposal, I think it is adding complexity
>> in the very substrate of RDF, which is supposed to be simple.)
>>
>> But if this:
>>
>>      << <p1> :birthDate "1902" >> :date "2023-09-21" .
>>
>> simply means this:
>>
>>      <urn:tdb:2014:urn%3Amd5%3Aabc6d701a21e16840530bd64d23d56ba> :date
>> "2023-09-21" .
>>      <urn:tdb:2014:urn%3Amd5%3Aabc6d701a21e16840530bd64d23d56ba>
>> owl:sameAs { <p1> :birthDate "1902" } .
>>
>> Where that URI  *denotes* the graph type (made from a checksum of the
>> canonical ntriples-form of that triple); that's another thing. Then we
>> can relate it to occurrences:
>>
>>      <urn:tdb:2014:urn%3Amd5%3Aabc6d701a21e16840530bd64d23d56ba>
>> :hasInstance _:ex1, _:ex2 .
>>
>> This is off the top of my head; needing time to assess whether it
>> makes sense and is cohesive and usable.
>>
>> Cheers,
>> Niklas
>>
>> [1]: https://www.slideshare.net/PatHayes/blogic-iswc-2009-invited-talk
>> [2]: https://en.wikipedia.org/wiki/Gettier_problem
>> [3]: https://en.wikipedia.org/wiki/Sense_and_reference
>>
>>
>>
>>
>> On Wed, Sep 20, 2023 at 10:06 PM Pierre-Antoine Champin
>> <pierre-antoine@w3.org>  wrote:
>>> As an amusing coincidence, Tim's rant during the TPAC F2F (and the
>>> discussions that followed) also got me thinking, and I wrote down a few
>>> things here:
>>>
>>> https://hackmd.io/_kG54skVQle0KT2S85Tj1Q
>>>
>>> The approach is slightly less radical than Niklas' . I tried to explore
>>> various ways to support N3-like graph-terms, by breaking more or less
>>> (and hopefully less...) what we already have.
>>>
>>>
>>> On 20/09/2023 19:41, Niklas Lindström wrote:
>>>> Dear all,
>>>>
>>>> I have written a proposal for basing quotation and annotation upon
>>>> "blank graphs" (graphs named by bnodes):
>>>>
>>>>       https://gist.github.com/niklasl/4f52c32ef2d888c172c8584e36c24610
>>>>
>>>> While I am rather convinced (by the various concerns raised) that this
>>>> is the right direction, it represents a fairly drastic change (for
>>>> RDF-star, that is; not for RDF 1.1). I did not approach it lightly,
>>>> having thought long and hard about where we are, and (carefully, I
>>>> hope) weighing alternatives, for many months, if not years.
>>>>
>>>> If there is interest, I can introduce it during the WG or Semantics TF
>>>> telecon this week.
>>>>
>>>> Best regards,
>>>> Niklas
>>>>
Received on Monday, 9 October 2023 16:32:39 UTC