Re: A question about referential opacity (again) from Thomas Lörtsch on 2023-10-25 (public-rdf-star-wg@w3.org from October 2023)

From: Thomas Lörtsch <tl@rat.io>
Date: Wed, 25 Oct 2023 16:34:38 +0200
To: Doerthe Arndt <doerthe.arndt@tu-dresden.de>
Cc: Niklas Lindström <lindstream@gmail.com>, RDF-star WG <public-rdf-star-wg@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Message-Id: <5D8BF122-7D5A-4F27-A4AF-3996A45E41F5@rat.io>
Hi Dörthe,

> On 24. Oct 2023, at 19:10, Doerthe Arndt <doerthe.arndt@tu-dresden.de> wrote:
> 
> Dear Niklas,
> 
>> 
>> 
>> I assume that your worry is that for graph terms to work, you'd have
>> to match its signature (or arity)?
> 
> It can be, depending on what the graph means. It is interesting to see what you’d expect, especially since I have another expectation (and so far, nothing is fixed, so we are both right ;) ).

There is an argument to be made that the Semantic Web as it exists today, its established practices, and the expectations that can be deduced from those practices should guide our design. Of course that still leaves a lot to disagree on, have different intuitions about, figure out, discuss, etc., but it also makes some approaches rather less plausible. My critique of the CG semantics was always that it puts the needs of logicians front and center and forces any mainstream usage to tediously circumvent that roadblock (e.g. via TEP). That IMO is an untenable value proposition to SemWeb developers in general. Entailment plays a small role on the Semantic Web; the even more specific needs of logicians that Notation3 caters for play an even smaller part. I’m decidedly not suggesting that those needs shouldn’t be met: I’d very much like them to be incorporated in the Semantic Web, but that can’t be achieved by forcing everybody to dance around them like an elephant in a porcelain shop. That way the logic porcelain is sure to be destroyed, and even the elephants will be unhappy.

That said, IMO the assumption that the prevalent intuition on the Semantic Web is to query for closed graphs is misguided. This is a logician's view, and a logician's problem. The prevalent practice on the Semantic Web is that one looks for information, and results will most often not even come in triples, let alone graphs, but in terms. Organizing that information in graphs, nested or not, is not done to build nice graphs; they are very much secondary, a means to an end. So your whole scenario, where graphs are considered terms and as such (as closed entities, I hope I got that right) are reasoned about, is exotic on the Semantic Web. Again, I don’t consider it impractical or even useless, but it is decidedly not mainstream, and not fit to guide the design of a mainstream annotation mechanism.

But: the nested graph proposal (although over-engineered in some ways, so take this with a grain of salt) includes a facility to address fragments (subject, predicate, object, triple and graph) to ensure that it’s possible to precisely target annotations (because the source of a graph is not the source of the individual IRIs, an annotation on a term doesn’t apply to the whole graph, etc). This allows attributions to be disambiguated. Maybe something similar could be defined to describe that a query is meant to target a graph very "closely" (pun intended)?
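
To make the two readings of such a query concrete, here is a minimal Python sketch (not anything proposed in the WG). It assumes a graph term is just a frozenset of string triples, that variables are strings starting with "?", and that "closed" means every pattern must match a distinct triple and no triple may be left over. The names `unify` and `match` and the `closed` flag are purely illustrative.

```python
def is_var(term):
    """Variables are written SPARQL-style, e.g. "?p" (an assumption of this sketch)."""
    return term.startswith("?")

def unify(pattern, triple, binding):
    """Return binding extended so that pattern equals triple, or None if impossible."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def match(patterns, graph, closed=False):
    """Yield variable bindings for a list of triple patterns against one graph term.

    closed=False: the graph may contain more triples than the patterns cover.
    closed=True:  each pattern matches a distinct triple and none is left over.
    """
    def search(remaining, binding, used):
        if not remaining:
            if not closed or used == graph:
                yield binding
            return
        for triple in graph:
            if closed and triple in used:
                continue
            new = unify(remaining[0], triple, binding)
            if new is not None:
                yield from search(remaining[1:], new, used | {triple})
    yield from search(list(patterns), {}, frozenset())

graph = frozenset({
    ("dbr:Linköping", "rdf:type", "ex:City"),
    ("dbr:Linköping", "ex:locatedIn", "dbr:Sweden"),
})
one = [("dbr:Linköping", "?p", "?o")]
two = one + [("?s1", "?p1", "?o1")]
three = two + [("?s2", "?p2", "?o2")]

open_one = list(match(one, graph))
closed_one = list(match(one, graph, closed=True))
closed_two = list(match(two, graph, closed=True))
closed_three = list(match(three, graph, closed=True))
```

Under these assumptions the one-pattern query finds both triples in the open reading but nothing in the closed one, while the two-pattern query matches the closed graph and the three-pattern query does not. A "target closely" marker on a query would in effect switch between the two modes.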

The other proposal - to encode abstract graph types as datatyped RDF graph literals - notwithstanding.

Best,
Thomas

>> I don't think that's an issue. If
>> this:
>> 
>>   << dbr:Linköping ex:locatedIn dbr:Sweden >> ex:statedAt
>> "2023-10-23"^^xsd:date .
>> 
>> Was replaced with, or equivalent to (ignoring that this N3 cannot work
>> in TriG without lookahead parsing dealing with ambiguity, due to
>> default graph blocks):
>> 
>>   { dbr:Linköping ex:locatedIn dbr:Sweden } ex:statedAt
>> "2023-10-23"^^xsd:date.
>> 
>> And that thus, this is also possible:
>> 
>>   {
>>     dbr:Linköping a ex:City ;
>>       ex:locatedIn dbr:Sweden
>>   } ex:statedAt "2023-10-23"^^xsd:date.
>> 
>> Then I'd assume a query like (again ignoring that this syntax probably
>> won't fly in SPARQL):
>> 
>>   SELECT ?p ?o ?date {
>>     { dbr:Linköping ?p ?o } ex:statedAt ?date
>>   }
>> 
>> Would yield:
>> 
>>   | ex:locatedIn | dbr:Sweden | "2023-10-23"^^xsd:date |
>> 
>> In fact, this:
>> 
>>   SELECT ?p ?o ?date {
>>     { dbr:Linköping ?p ?o. ?s1 ?p1 ?o1 } ex:statedAt ?date
>>   }
>> 
>> should match too, just binding ?s1, ?p1 and ?o1 to each of the two
>> triples in turn (so an unperformant query, with unused redundant
>> results).
> 
> Mmm, so basically, you implicitly include my predicate log:includes in the query? Note that the question here is (and I think that was also one of the questions for the different TriG semantics): is the graph we state as a graph term open or closed? I would expect (but like all of us, I am biased) that if my graph has no name at all, it is closed. So, if I state
>   {
>     dbr:Linköping a ex:City ;
>       ex:locatedIn dbr:Sweden
>   } ex:statedAt "2023-10-23"^^xsd:date.
> 
> I am talking about the exact graph   { dbr:Linköping a ex:City ;  ex:locatedIn dbr:Sweden } and not about, for example, a graph containing these two triples (and maybe more). So, in my view the graph above does not yield
> 
> 
>   {
>     dbr:Linköping a ex:City .
>   } ex:statedAt "2023-10-23"^^xsd:date.
> 
> 
> Note, that the predicate here is confusing and that I can find predicates where this would make no sense like:
> 
> {:cat :is :alive, :dead } a :inconsitency.
> 
> Should not yield 
> 
> {:cat :is :alive } a :inconsitency.
> 
> So, in a sense we are back to the point where we need to consider the intended use and I think examples can get far more complex with graphs (which is also why we want them in the first place).
> 
> We do not need to follow N3, but there, graphs are closed and you can state relations between them and have predicates which help you to put them into relation. The reason is that otherwise you could never talk about a concrete graph as they do not have names.
> 
> I think my point here is just: we all have expectations on how the graphs should behave and most likely they differ.
> 
> 
>> 
>> This is based on what I think James also answered, that for named
>> graphs, if you have:
>> 
>>   _:g1 {
>>     dbr:Linköping a ex:City ;
>>       ex:locatedIn dbr:Sweden
>>   }
>>   _:g1 ex:statedAt "2023-10-23"^^xsd:date ;
>>     ex:source wikipedia:Linköping .
>> 
>> then this works:
>> 
>>   SELECT ?p ?o ?date {
>>     graph ?g { dbr:Linköping ?p ?o. ?s1 ?p1 ?o1 }
>>     ?g ex:statedAt ?date .
>>   }
>> 
>> This is because SPARQL BGPs in graph blocks match what's there;
>> they're not excluding graphs containing more triples. (I'm sure e.g.
>> Andy would phrase this much more correctly.)
> 
> You mean 
> 
>   SELECT ?p ?o ?date {
>     graph ?g { dbr:Linköping ?p ?o.  }
>     ?g ex:statedAt ?date .
>   }
> 
> Right?
> 
> I think the example here is easier because we have the graph name (even though it is a blank node) and this determines somehow which graph we mean. Here, you assume, that 
> 
>   _:g1 {
>     dbr:Linköping a ex:City ;
>       ex:locatedIn dbr:Sweden
>   }
> 
> „Means“ (in an informal sense) that there is the graph _:g1 and that this graph contains the triples      
> dbr:Linköping a ex:City ;
>       ex:locatedIn dbr:Sweden.
> 
> But _:g1 can contain more triples; it is open in that sense, and if we want to talk about it, we use its label (the blank node _:g1).
> 
> There are many possible points of view.
> 
> 
> 
>> 
>> This all said, I'm unconvinced of either triple or graph terms, as
>> they make it possible to talk about the abstract type itself, as
>> opposed to a reified occurrence thereof (which when talked about is a
>> token of the type).
> 
> With this comment you just made clear to me what you mean by type vs. token in this context: you would like that in
> 
> <<:a :b :c>> :p :o.
> <<:a :b :c>> :pp :oo.
> 
> The two 
> 
> <<:a :b :c>> would refer to different instances? Right? If not, please correct me, because previously I did not fully get that (it is always easier to know one's own point of view and "being right" than to understand someone else’s ;) ).  That would be important for your use case? I think that this can make things complicated, but before I complain (and construct evil examples), I need to fully understand. Would you want the << >>-notation to only be syntactic sugar for reification?
> 
> 
>> But I'll write more about that in another reply.
> 
> Looking forward to this.
> 
> Kind regards,
> Dörthe
> 
>> 
>> All the best,
>> Niklas
>> 
>> On Mon, Oct 23, 2023 at 6:07 PM Doerthe Arndt
>> <doerthe.arndt@tu-dresden.de> wrote:
>>> 
>>> Dear Thomas, all,
>>> 
>>> In addition to what Peter said about RDF-star semantics and opacity, I’d like to clarify the community group semantics a little bit more: remember that we talk about the meaning of triple terms and not of the constituents (subject, predicate, object) of these terms. What was done in the unstar-mapping was a kind of reification in which we represented the triple with a blank node and then connected the IRIs of the constituents to this blank node (using the correct predicates), and also the lexical representation of these constituents. With this "trick" we allowed the quoted triple interpretation to be aware of the lexical representation of the triple and, if needed, to differentiate between triples having different interpretations, but that was not forced and, as Peter also mentioned, the concrete interpretation was left open.
>>> 
>>> For the working group semantics several possibilities have been discussed and they all rely on an interpretation function for the triple term (for example IT in Enrico’s case). This function maps to a resource (and it can do more, but does not need to). The interpretation function for the triple term can be applied on triples from the domain of discourse (then we can indeed combine it with IS or some alternative IS’), but it would for example also be possible to apply the IT function directly on the graphical representation of the triple (of course we need to be careful with blank nodes here). My point is just: please try to see the triple term  as a whole also as a resource to better understand the opacity.
>>> 
>>> To the rest of the discussion and the added complexity: apart from all the theoretical aspects we discuss here (and where I agree that graphs are more complex than triples), please also note that we would have to decide how to deal with quoted graph terms in practice. In SPARQL queries, it is relatively easy to search for a triple term having dbr:Linköping as subject, like:
>>> 
>>> Select ?p ?o ?date
>>> {
>>> << dbr:Linköping ?p ?o>> ex:statedAt ?date
>>> }
>>> 
>>> But to make a similar query for graphs, we either need to know the exact structure of the graph (that is: how many triples does it contain?) or we need to come up with extra Filter functions for SPARQL.
>>> If we have
>>> 
>>> { dbr:Linköping a ex:City; ex:locatedIn dbr:Sweden}  ex:statedAt "2023-10-23"^^xsd:date.
>>> 
>>> A query
>>> 
>>> Select ?p ?o ?date
>>> {
>>> {dbr:Linköping ?p ?o. ?s1 ?p1 ?o1} ex:statedAt ?date
>>> }
>>> 
>>> Would fire, but
>>> 
>>> Select ?p ?o ?date
>>> {
>>> {dbr:Linköping ?p ?o. ?s1 ?p1 ?o1. ?s2 ?p2 ?o3} ex:statedAt ?date
>>> }
>>> 
>>> would not. I am sure we can solve this problem together, but this adds complexity since we need to have a discussion on how we would like to solve it.
>>> 
>>> Side note: in N3 we would have a predicate log:includes for that and while it makes this case easier, it also adds complexity simply because your graph terms can contain blank nodes and you are back to a problem of simple entailment… (and I will not go further unless you ask :) )- In N3 you would do something like (I try to make it „SPARQL-style“ but I am not sure whether or not this makes it clear, so, feel free to ask):
>>> 
>>> Select ?p ?o ?date
>>> {
>>> ?graph  ex:statedAt ?date.
>>> ?graph log:includes {dbr:Linköping ?p ?o. }.
>>> }
>>> 
>>> 
>>> The log:includes is some kind of function which can give you elements of your graph.
>>> 
>>> I just added this here as one example to illustrate that Peter is right here: things get more complex if we have graph terms. I am sure that we can solve that together and I would like to do that with all of you, but at the same time I am worried that it will take too long…
>>> 
>>> Kind regards,
>>> Dörthe
>>> 
>>> 
>>>> Am 23.10.2023 um 16:47 schrieb Peter F. Patel-Schneider <pfpschneider@gmail.com>:
>>>> 
>>>> There are no restrictions on the sharing of resources between different interpretations in the RDF semantics.  Different interpretations of the same RDF graph can share resources.  Different interpretations of different RDF graphs in an RDF dataset can share resources.  Different interpretations of different RDF graphs that are not in an RDF dataset can share resources.
>>>> 
>>>> But there is nothing in the formal semantics of RDF graphs that depends on sharing or not sharing, except for the denotations of literals.  So no formal consequences would arise from forbidding interpretations sharing resources, again except for value spaces of datatypes.
>>>> 
>>>> On the other hand, much of the "intended" meaning of RDF graphs implies that different interpretations share resources, i.e., that the denotations of many IRIs are intended to be in some sense fixed between some interpretations.  One can even imagine a variation on RDF semantics where the denotation of an IRI is required to be the same in all interpretations.   This semantics would have to be somewhat unusual but it could be made to work.
>>>> 
>>>> Requiring denotation of IRIs to be fixed within an RDF dataset would likely also need an unusual semantics if it was to handle things like beliefs (or indeed any other kind of varying view of the identity of things in the world).
>>>> 
>>>> peter
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 10/23/23 09:00, Niklas Lindström wrote:
>>>>> On Sat, Oct 21, 2023 at 2:22 PM Peter F. Patel-Schneider
>>>>> <pfpschneider@gmail.com> wrote:
>>>>>> 
>>>>>> It's important to be clear as to what is formal and what is informal in
>>>>>> discussions of this sort.
>>>>>> 
>>>>>> Formally in the current semantics for RDF, all IRIs are mapped (via the map
>>>>>> IS) to resources by interpretations in RDF.  One generally says that an IRI E
>>>>>> denotes or refers to IS(E).
>>>>> Is it correct that the interpretation I of a graph G in a dataset D
>>>>> must not by definition share its "domain/universe" (that is its
>>>>> non-empty set IR of resources) with another graph G' in D (or any
>>>>> other dataset)? That is, this is undefined (i.e. this is the lack of
>>>>> semantics for datasets)?
>>>>> And if these universes may be different, is there possibly a different
>>>>> interpretation for each graph within the same dataset (or any
>>>>> combination of shared and isolated interpretations)?
>>>>> And finally, would the definition of such shared or not shared
>>>>> interpretations constitute a semantics for datasets? (Possibly, but
>>>>> perhaps not necessarily, in conjunction with a definition for what the
>>>>> relation is between the name and the graph in a named graph pair?)
>>>>> /Niklas
>>>>>> A formal semantics that provides for referential opacity of IRIs generally
>>>>>> provides a different mapping (let's call it IS') for IRIs that occur in opaque
>>>>>> contexts, i.e., inside triple terms.   The details may differ, but the targets
>>>>>> of this mapping are usually either left unspecified or are to some synthetic
>>>>>> resources (such as copies of IRIs).
>>>>>> 
>>>>>> So if one was to construct an interpretation in this sort of formal semantics
>>>>>> that actually included real cities in the world as resources and whose IS
>>>>>> mapping mapped IRIs that are generally accepted as names of cities to the
>>>>>> actual cities one would say that in transparent contexts, i.e., subjects,
>>>>>> objects, and properties of asserted triples, dbr:Linköping refers to the city
>>>>>> of Linköping but in opaque contexts, i.e., in triple terms, refers to
>>>>>> something else.  (It may be possible that in some interpretations dbr:Linköping
>>>>>> in an opaque context does refer to the city, but the important point is that
>>>>>> there are interpretations where dbr:Linköping in an opaque context does not
>>>>>> refer to the city and that absent special constructs to force transparency on
>>>>>> opaque contexts there is no way to force an RDF graph to be false in all these
>>>>>> interpretations.)
>>>>>> 
>>>>>> peter
>>>>>> 
>>>>>> PS: The community group semantics uses a different mechanism entirely,
>>>>>> instead syntactically transforming graphs that contain triple terms to regular
>>>>>> RDF graphs.  Among other things, the opacity mechanism in this treatment
>>>>>> transforms IRIs in triple terms to literal strings.  So part of this semantics
>>>>>> is a relationship (similar to but not exactly denotes) from IRIs in opaque
>>>>>> contexts to sequences of Unicode code points.
>>>>>> 
>>>>>> PPS:  There are other ways of obtaining opacity.  If the working group
>>>>>> switches to graph terms the kind of semantics described above might not be
>>>>>> adequate and some other treatment might have to be used.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 10/21/23 07:00, Thomas Lörtsch wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> 
>>>>>>> Enrico was kind enough to guide me through the work of the Semantics TF in a one-on-one TelCo a week ago. However, when I now look at my notes, I’m again confused.
>>>>>>> 
>>>>>>> If I understood Enrico correctly then a referentially opaque IRI doesn’t refer to anything. However, it was my understanding of the CG report semantics that IRIs in quoted triples are interpreted, but strictly following the syntactic form. My reading of the unstar-mapping supports that intuition [1].
>>>>>>> To give an example, I understood referential opacity as meaning that "dbr:Linköping" and "DBR:LINKÖPING" both refer to the city of Linköping, and yet are not equal (and cannot be inferred to be equal) because their lexical representation differs.
>>>>>>> But according to how I understood Enrico they don’t refer to anything.
>>>>>>> 
>>>>>>> Was I wrong all along? Am I just not getting it and does there exist a world in which both interpretations are true? Or has the TF diverged from the CG? Or is there no consensus in the TF?
>>>>>>> 
>>>>>>> 
>>>>>>> Best,
>>>>>>> Thomas
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [0] https://github.com/w3c/rdf-star-wg/wiki/Semantics%3A-Behaviour-catalogue
>>>>>>> [1] https://w3c.github.io/rdf-star/cg-spec/2021-12-17.html#mapping
>>>>>> 
>>>> 
>>> 
>
Received on Wednesday, 25 October 2023 14:34:51 UTC