Re: [External] : Re: A proposal for basing quotation and annotation upon blank graphs from Souripriya Das on 2023-09-28 (public-rdf-star-wg@w3.org from September 2023)

From: Souripriya Das <souripriya.das@oracle.com>
Date: Thu, 28 Sep 2023 15:15:51 +0000
To: Niklas Lindström <lindstream@gmail.com>, Pierre-Antoine Champin <pierre-antoine@w3.org>
CC: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-ID: <SN4PR10MB56229ECE95D2D9A4DB10971EFAC1A@SN4PR10MB5622.namprd10.prod.outlook.com>
[resending after fixing some typos]

Trying to understand the new proposal which looks quite promising.

1) Suppose that we have the following data to start with:

@prefix : <tag:>.
graph :kg { :a :knows :b . :b :knows :c . :c :knows :d . }

The following SPARQL query returns ?cnt = 3:
PREFIX : <tag:>
SELECT (count(*) as ?cnt) { graph :kg { ?x :knows ?y } }

2) Next, we want to say something about the set of triples consisting of the second and third triple above. In RDFn, if we extend its naming concept from naming just individual triples to allow naming of triple-sets as well, we can use the name :n for the target triple-set:

@prefix : <tag:>.
graph :kg { :a :knows :b .
  :b :knows :c | :n .
  :c :knows :d | :n .
  :n :basedOn :webSearch .
}

Even with this changed data, the above SPARQL query would still return the correct result ?cnt = 3.
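
To make the expected behaviour concrete, here is a minimal sketch in plain Python (not a real RDF or SPARQL engine). It assumes that the "| :n" annotation only attaches a name to an existing triple and never adds a triple that matches ?x :knows ?y. The names Triple, count_knows and triples_named are hypothetical helpers for illustration only.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    s: str
    p: str
    o: str
    # Assumption: an RDFn-style name attached with "| :n"; None if unnamed.
    name: Optional[str] = None


def count_knows(graph):
    """Mimic SELECT (count(*) as ?cnt) { graph :kg { ?x :knows ?y } }."""
    # The pattern only looks at the predicate; attached names are ignored.
    return sum(1 for t in graph if t.p == ":knows")


def triples_named(graph, name):
    """Return the triple-set that shares the given name."""
    return {(t.s, t.p, t.o) for t in graph if t.name == name}


# Step 1: the original contents of graph :kg.
kg_before = [
    Triple(":a", ":knows", ":b"),
    Triple(":b", ":knows", ":c"),
    Triple(":c", ":knows", ":d"),
]

# Step 2: two triples share the name :n, plus one statement about :n.
kg_after = [
    Triple(":a", ":knows", ":b"),
    Triple(":b", ":knows", ":c", name=":n"),
    Triple(":c", ":knows", ":d", name=":n"),
    Triple(":n", ":basedOn", ":webSearch"),
]

print(count_knows(kg_before), count_knows(kg_after))
print(sorted(triples_named(kg_after, ":n")))
```

Under these assumptions the count stays at 3 in both cases, because the extra :basedOn statement does not match the :knows pattern and the names do not introduce additional asserted triples.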

Question: How does the new proposal handle such a scenario?

Thanks,
Souri.
________________________________
From: Niklas Lindström <lindstream@gmail.com>
Sent: Thursday, September 21, 2023 11:50 AM
To: Pierre-Antoine Champin <pierre-antoine@w3.org>
Cc: RDF-star Working Group <public-rdf-star-wg@w3.org>
Subject: [External] : Re: A proposal for basing quotation and annotation upon blank graphs

Pierre-Antoine,

This is great! Your proposal is indeed less radical, which provides a
great balance here. I think we're converging on something.

I did intentionally go the conservative route, staying as close to RDF
1.1 as I could. I care deeply for existing RDF usage, including
JSON-LD (which is more convenient in handling named (including blank)
graphs; something the proposal would bring to TriG). At the same time,
I push forward on the semantics for graphs question, because, as I
also said during TPAC, even if it's beyond the Charter to *define* it,
we must ensure we do not make it *harder* to define it in the future.
And my proposal *attempts* to leave that door open without forcing us
to go through it...

Taking momentum and uptake into consideration, I do think my proposal
also provides the leverage that the "PG-crowd" is after (with
"occurrences"). Admittedly, it does not cater that much for the
"N3-crowd" and their use of graph terms as "types". Herein lies a crux
and the key difference (which you pointed out to me yesterday): my
proposal claims that named graphs are (token) occurrences of graphs.

## Graphs as Types or Tokens

You did also point out that in Pat Hayes' BLogic talk [1] (a seminal
talk for many of us, I believe), he specifically talks about the
missing type/token distinction (slide 20; look at slides 18-23 for
more context)! I too think that illuminates something essential.

Because I do think graphs, in RDF documents, are "depictions". In our
last Semantics TF telecon, where I tried to defend the "graphs are
types" position, Enrico pointed out the open world; and at first I
didn't see it (thinking "a belief is a belief"), but unless I
misinterpret that point, it's about "some claims from two different
bigger sets of claims". (And I think this view is possible to hold
*without* requiring opacity; but I digress.) Or, perhaps: two
identical descriptions from two sources may have different *senses*;
see the Gettier problem [2], which:

> illustrate[s] by means of two counterexamples that there are cases where individuals can have a justified, true belief regarding a claim but still fail to know it because the reasons for the belief, while justified, turn out to be false

Or: two graphs may carry the identical set of triples but have come
about in two different ways. (Or is it that two sets of triples may
represent one graph?)
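
One way to picture this distinction is the following rough Python sketch. It is only an analogy, not part of either proposal: a graph *type* is modelled as a frozenset of triples (compared by value), and a graph *token* as an object compared by identity. GraphToken, its fields and the source labels are hypothetical names.

```python
from dataclasses import dataclass, field
from itertools import count

_labels = count(1)  # generates fresh blank-node-like labels


@dataclass(eq=False)  # eq=False: tokens compare and hash by identity
class GraphToken:
    triples: frozenset  # the graph type this token is an occurrence of
    source: str         # how this occurrence came about (illustrative)
    label: str = field(default_factory=lambda: f"_:g{next(_labels)}")

    @property
    def graph_type(self):
        # The type is just the set of triples, compared by value.
        return self.triples


claim = frozenset({("<p1>", ":birthDate", '"1902"')})

ex1 = GraphToken(claim, source="source A")
ex2 = GraphToken(claim, source="source B")

print(ex1 == ex2)                        # two distinct occurrences
print(ex1.graph_type == ex2.graph_type)  # one shared type
```

Here the two tokens are different objects with different provenance, while their types are equal as sets of triples.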

The point is: when we "describe graphs", do we talk about the
reference ("une pipe") or the picture? With Named Graphs, we talk
about the picture! Even more so with blank graphs (*some* picture).
But the "picture itself" depicts the value/meaning of the graph (which
is why the empty graph is True).

## Graph Terms

I do think perhaps one problem in my proposal is that graphs in the
object position *look* like types, at least for N3 users; just as the
"<< ... >>" form of quoted triples most definitely *looks* like some
kind of compound IRIs! I try to cautiously approach the possibility of
"talking about the type" of compound statements in RDF. I say that
fully standing behind the fact that IRIs are references, *not* senses
(again, Frege [3]). But *use* of them is.

It would be very odd to introduce types for triples but continue to
leave the graphs in type/token limbo! *But* I think we're circling
around something we can agree on.

If I indulge the "type position", starting with the N3 form naming a
"graph type":

    _:ex1 sem:quotedGraph { <p1> :birthDate "1902" } .

I can see how proposing that exact notation in TriG and saying that
the object itself here is a blank graph *token* can be a problem. I
seem to "co-opt" what is a type in N3 as a blank graph *token* in
TriG? Of course, this depends on the relationship as well. ("Type or
token enabling properties" ...)

Instead of sem:quotedGraph we could imagine using skos:broadMatch, or
some ex:broaderInstantive. Or be audaciously bold and (you may want to
put on sunglasses when viewing this one) say:

    _:ex2 rdf:type { <p1> :birthDate "1902" } .

(I need to think hard about that of course, and test it out thoroughly.)

In any case, we should probably not define bnode names and IRI names
for graphs on opposite sides of the type/token distinction. If graph
names denote graph *tokens*, that would be a clear explanation for why
names don't denote the graphs "themselves" (as types).

(While it is another kind of conflation to equate graph tokens with
their serializations, that is a conflation we perhaps do all over RDF
with literals. We "get out" of that by adding D-entailment? I thus
still think that in TriG (and JSON-LD), graphs are indeed tokens, but
that inference may "enter the value space", and thus, possibly, go
from token to type.)

## Triple Terms

I excluded the "<< ... >>" form mainly because it is not needed
in my proposal. It introduces a difference without a (yet defined)
difference. Building on the above, we could bring that back, perhaps
to denote the triple type itself; but not before the above is
resolved. The *biggest* problem with it is its "recursive token"
notion. (As I wrote in the proposal, I think it is adding complexity
in the very substrate of RDF, which is supposed to be simple.)

But if this:

    << <p1> :birthDate "1902" >> :date "2023-09-21" .

simply means this:

    <urn:tdb:2014:urn%3Amd5%3Aabc6d701a21e16840530bd64d23d56ba> :date "2023-09-21" .
    <urn:tdb:2014:urn%3Amd5%3Aabc6d701a21e16840530bd64d23d56ba> owl:sameAs { <p1> :birthDate "1902" } .

where that URI *denotes* the graph type (made from a checksum of the
canonical N-Triples form of that triple), that's another thing. Then we
can relate it to occurrences:

    <urn:tdb:2014:urn%3Amd5%3Aabc6d701a21e16840530bd64d23d56ba> :hasInstance _:ex1, _:ex2 .

This is off the top of my head; needing time to assess whether it
makes sense and is cohesive and usable.
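
As a rough illustration of the checksum idea, a URI of that shape could be derived as in the Python sketch below. Everything specific here is an assumption: the function names ntriples_line and triple_type_uri are hypothetical, the IRIs are absolute examples under http://example.org/ (since <p1> and :birthDate are relative or prefixed in the text), the serialization is a simplified single N-Triples line with a trailing newline, and the digest in the message above is not reproduced.

```python
import hashlib
from urllib.parse import quote


def ntriples_line(s, p, o_literal):
    # Simplified serialization: assumes absolute IRIs and a plain string
    # literal that needs no escaping. Real canonicalization does more.
    return f'<{s}> <{p}> "{o_literal}" .\n'


def triple_type_uri(s, p, o_literal, stamp="2014"):
    # MD5 over the UTF-8 bytes of the (assumed canonical) N-Triples line.
    digest = hashlib.md5(ntriples_line(s, p, o_literal).encode("utf-8")).hexdigest()
    # Percent-encode "urn:md5:<hex>" so it can be nested in the outer URN,
    # mirroring the shape urn:tdb:2014:urn%3Amd5%3A<hex> used above.
    return "urn:tdb:" + stamp + ":" + quote("urn:md5:" + digest, safe="")


uri = triple_type_uri(
    "http://example.org/p1",
    "http://example.org/birthDate",
    "1902",
)
print(uri)
```

Because the URI depends only on the serialized triple, every occurrence of the same triple maps to the same identifier, which is what would let it stand for the type rather than for any one token.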

Cheers,
Niklas

[1]: https://www.slideshare.net/PatHayes/blogic-iswc-2009-invited-talk
[2]: https://en.wikipedia.org/wiki/Gettier_problem
[3]: https://en.wikipedia.org/wiki/Sense_and_reference

On Wed, Sep 20, 2023 at 10:06 PM Pierre-Antoine Champin
<pierre-antoine@w3.org> wrote:
>
> As an amusing coincidence, Tim's rant during the TPAC F2F (and the
> discussions that followed) also got me thinking, and I wrote down a few
> things here:
>
> https://hackmd.io/_kG54skVQle0KT2S85Tj1Q
>
> The approach is slightly less radical than Niklas'. I tried to explore
> various ways to support N3-like graph-terms, by breaking more or less
> (and hopefully less...) what we already have.
>
>
> On 20/09/2023 19:41, Niklas Lindström wrote:
> > Dear all,
> >
> > I have written a proposal for basing quotation and annotation upon
> > "blank graphs" (graphs named by bnodes):
> >
> >      https://gist.github.com/niklasl/4f52c32ef2d888c172c8584e36c24610
> >
> > While I am rather convinced (by the various concerns raised) that this
> > is the right direction, it represents a fairly drastic change (for
> > RDF-star, that is; not for RDF 1.1). I did not approach it lightly,
> > having thought long and hard about where we are, and (carefully, I
> > hope) weighing alternatives, for many months, if not years.
> >
> > If there is interest, I can introduce it during the WG or Semantics TF
> > telecon this week.
> >
> > Best regards,
> > Niklas
> >
Received on Thursday, 28 September 2023 15:16:06 UTC