Re: Slides: Talking About Occurrences

Dear Dörthe,

On Fri, Oct 27, 2023 at 3:56 PM Doerthe Arndt
<doerthe.arndt@tu-dresden.de> wrote:
>
> Dear Niklas,
>
> I also had a look at your slides and now wonder where you stand.

I'll attempt to clarify. (I hope my earlier reply to Antoine Z
answered some of your questions too. A conference got the best of me,
so this answer lags behind.)

> If I get you correctly here, you argue that you would like to interpret RDF-star triples in a way that in
>
> :s :p <<:a :b :c>>.
> :k :l <<:a :b :c>>.

I don't propose to interpret RDF-star triples in any other way than
the CG report does; I propose not to introduce them in the first
place. I doubt that there is value in keeping the above syntax if we
do not add triples as terms.

(Any system is allowed to support non-standard extensions with
whatever syntax they want. But it's a disaster for interoperability,
so I'd strongly advise against it.)

> the two <<:a :b :c>> refer to different resources in the domain of discourse. You call that the "interpretation as tokens". If I read the text of Pat Hayes on slide 7, I see that he is using the relation type vs. token as follows: for him the type is the abstract graph and the token is the concrete representation. As he talks about named graphs, he uses the name to distinguish between two tokens, such as in:
>
> :g1 {:a :b :c}
> :g2 {:a :b :c}
>
> So, here, the case is rather easy and even the semantics, which maps tokens to types will not have problems because :g1 and :g2 are easy to distinguish. In my view, if we have
>
> :s :p <<:a :b :c>>.
> :k :l <<:a :b :c>>.
>
> we have the exact same token twice, namely <<:a :b :c>>. Of course you could understand this as syntactic sugar for reification which then would allow you to have two different resources, if you want (they could still be the same though). But what stops you from taking for example the iris as types? Everything would be broken in that case, I know. But still, to me this looks like a random choice to take triple terms as tokens while iri terms are types.

Yes, and that is why I don't think triple terms (being types) are
really any help. The cost of introducing them is high (it changes the
foundations of RDF), and the value, a "better syntax for reification"
with opacity to turn it into "real" quotation, could be realized by
formalizing named graphs instead. Those are the paths I think have not
been thoroughly explored, or at least have not been explicitly
rejected as non-options with clear motivations as to why they are
irredeemable.

> What also puzzles me  is: If you go for types in your sense, how does this play together with your wish for referential transparency? If we have that :a and :a27 refer to the exact same resource, how would you argue that
>
> :s :p <<:a :b :c>>.
>
> Is the same as
>
> :s :p <<:a27 :b :c>>.
>
> while the first <<:a :b :c>> is not even the same as the second? Or how do you see the opacity vs. transparency here? I ask, because I saw all the texts you quoted as a reason why we needed opacity (and then I realized that I should not use quotes to make my point since there is interpretation involved :D, therefore I do not play the „quoting game“ and ask instead what YOU think).

If those were defined as two opaque triple terms, I wouldn't argue
against that. If they were two tokens of transparent triples, those
triples could still become identified as the same triple through
entailment. If I viewed them as just sugar for plain old reification,
they are two tokens of two statements, whose subjects would be
entailed as equal. That is:

    prefix s: <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject>
    prefix p: <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate>
    prefix o: <http://www.w3.org/1999/02/22-rdf-syntax-ns#object>

    :s :p _:s1 .
    _:s1 s: :a ; p: :b ; o: :c .

    :s :p _:s2 .
    _:s2 s: :a27 ; p: :b ; o: :c .

    :a owl:sameAs :a27 .

Would entail:

    _:s1 s: :a27 .
    _:s2 s: :a .
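
As a rough illustration of that entailment, here is a minimal Python
sketch. It is not an OWL reasoner: triples are plain tuples of strings,
and the only rule applied is substitution of owl:sameAs-equal terms in
object position. The function name same_as_closure and the string
spellings of the terms are assumptions made for this example.

```python
# Triples are (subject, predicate, object) tuples of strings.
# One rule only: if x owl:sameAs y, any triple with x as object also
# holds with y as object (and vice versa, since sameAs is symmetric).
# Transitive chains of sameAs and subject/predicate positions are ignored.
SAME_AS = "owl:sameAs"


def same_as_closure(triples):
    """Return the triples plus those entailed by object substitution."""
    result = set(triples)
    pairs = {(s, o) for s, p, o in triples if p == SAME_AS}
    pairs |= {(o, s) for s, o in pairs}  # symmetry of owl:sameAs
    changed = True
    while changed:
        changed = False
        for s, p, o in list(result):
            if p == SAME_AS:
                continue  # leave the sameAs statements themselves alone
            for x, y in pairs:
                if o == x and (s, p, y) not in result:
                    result.add((s, p, y))
                    changed = True
    return result


graph = {
    (":s", ":p", "_:s1"),
    ("_:s1", "rdf:subject", ":a"),
    ("_:s1", "rdf:predicate", ":b"),
    ("_:s1", "rdf:object", ":c"),
    (":s", ":p", "_:s2"),
    ("_:s2", "rdf:subject", ":a27"),
    ("_:s2", "rdf:predicate", ":b"),
    ("_:s2", "rdf:object", ":c"),
    (":a", "owl:sameAs", ":a27"),
}

# Only the newly entailed triples.
entailed = same_as_closure(graph) - graph
```

Under these assumptions, entailed holds exactly the two triples listed
above; the two blank nodes themselves are not merged.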

But as said, I don't think it's necessary to use this syntax; I think
it might be misleading (since it "looks" like some composite IRI-like
construct). Why would we add syntactic sugar for reification that
merely saves a few keystrokes, but supports neither what rdf:ID on
arcs in RDF/XML has supported for two decades, nor the common grouping
under a shared subject that all useful concrete syntaxes provide? Real
data rarely looks like that.

> At the moment, I really think that you want syntactic sugar for reification, but most likely I am wrong (I remember that you said that was not the case?).

I would prefer if the RDF-star annotation syntax (or a variant
thereof) was syntactic sugar for named graphs; ideally with a
semantics for named graphs, and crucially standards for what goes into
the union default graph of a graph store. And I'm interested in the
entailment extension proposed in Named Graphs, 2005, which defines
that named singleton sets ("named triples") entail reified statements,
so that their current disconnect is mended. And if we cannot agree on
semantics for named graphs for RDF 1.2, just sugar for reification
could be a minimal initial step towards that.

To illustrate this, consider the somewhat involved case of
<https://github.com/w3c/rdf-ucr/wiki/RDF-star-for-CIDOC-CRM-events>.

The following is an expression of that information using named graphs:

    ex:Ioannes_68 a crm:E21_Person , ex:Gender_Eunuch ;
      rdfs:label "John the Orphanotrophos" .

    graph <#assignment-1> { ex:Ioannes_68 a ex:Gender_Eunuch }
    <#assignment-1> a crm:E17_Type_Assignment ;
      crm:P14_carried_out_by ex:Paphlagonian_family ;
      rdfs:label "Castration gender assignment" .

    graph _:g2 { ex:Ioannes_68 a ex:Gender_Male }
    _:g2 a crm:E17_Type_Assignment ;
      crm:P14_carried_out_by ex:emperor ;
      crm:P182_inverse_starts_after_or_with_the_end_of <#assignment-1> ;
      rdfs:label "Gender assignment by decree" .

    graph _:g3 { ex:Ioannes_68 a ex:Gender_Male }
    _:g3 a crm:E17_Type_Assignment ;
      crm:P14_carried_out_by ex:Paphlagonian_family ;
      crm:P183_ends_before_the_start_of <#assignment-1> ;
      rdfs:label "Birth gender assignment" .

Here is a condensed form, using my proposed *variant* of annotation
syntax. It supports regular IRIs and multiple annotations (token
occurrences) by "tagging" objects, within curly braces, with either
IRIs or BNodes. Objects prefixed by a "--" quotation dash ensure that
the triples they complete are not asserted in the default graph, only
in the named graphs:

    ex:Ioannes_68 a crm:E21_Person ;
      rdfs:label "John the Orphanotrophos" ;
      a ex:Gender_Eunuch {<#assignment-1>} ,
        -- ex:Gender_Male {[ a crm:E17_Type_Assignment ;
            crm:P14_carried_out_by ex:emperor ;
            crm:P182_inverse_starts_after_or_with_the_end_of <#assignment-1> ;
            rdfs:label "Gender assignment by decree"
          ] [ a crm:E17_Type_Assignment ;
            crm:P14_carried_out_by ex:Paphlagonian_family ;
            crm:P183_ends_before_the_start_of <#assignment-1> ;
            rdfs:label "Birth gender assignment" ]} .

    <#assignment-1> a crm:E17_Type_Assignment ;
      crm:P14_carried_out_by ex:Paphlagonian_family ;
      rdfs:label "Castration gender assignment" .

This could also be mapped to reification, *or* this reification could
be entailed by the singleton sets (as proposed in the Named Graphs
paper from 2005). An excerpt from the above could be:

    <#assignment-1> a crm:E17_Type_Assignment ;
      rdf:subject ex:Ioannes_68 ;
      rdf:predicate rdf:type ;
      rdf:object ex:Gender_Eunuch ;
      crm:P14_carried_out_by ex:Paphlagonian_family ;
      rdfs:label "Castration gender assignment" .

Every submitted use case is representable like this. By using
reification, nothing needs to be changed in RDF apart from adding
syntactic sugar. But there would be no support for opacity (which I
haven't seen as necessary in the collected use cases; but can imagine
the need for, e.g. when managing knowledge from multiple belief
systems, or weird cartoon worlds using conflated identities). Being
able to succinctly group multiple statements into a named graph and
talk about that is also reasonable, which is why I prefer to explore
the named graph route. (If it was left up to reification, it would
require some new relation to group reified statements under a common
subject, and while doable would be rather unwieldy to manage manually.
I wouldn't mind defining an entailment extension relating named graphs
to such reified statement "bags" though, for the sake of academic
comprehension.)
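
The step from named singleton graphs to reified statements could be
sketched roughly as follows. This is only an illustration under my own
assumptions, not the rule as formalized in the Named Graphs paper: the
dataset is a plain dict from graph name to a set of triples, the
function name reify_singletons is made up, the graph <#two-triples>
and its ex:x/ex:p/ex:y/ex:z terms are hypothetical, and no rdf:type
rdf:Statement triple is emitted.

```python
def reify_singletons(named_graphs):
    """Emit rdf:subject/predicate/object triples about each graph name
    whose graph holds exactly one triple; other graphs are skipped."""
    out = set()
    for name, triples in named_graphs.items():
        if len(triples) != 1:
            continue  # only "named triples" (singleton graphs) are reified
        (s, p, o), = triples
        out.add((name, "rdf:subject", s))
        out.add((name, "rdf:predicate", p))
        out.add((name, "rdf:object", o))
    return out


named_graphs = {
    "<#assignment-1>": {("ex:Ioannes_68", "rdf:type", "ex:Gender_Eunuch")},
    # Hypothetical graph with two triples, to show that it is left alone.
    "<#two-triples>": {
        ("ex:x", "ex:p", "ex:y"),
        ("ex:x", "ex:p", "ex:z"),
    },
}

reified = reify_singletons(named_graphs)
```

Here reified contains the three reification triples for
<#assignment-1> and nothing for the two-triple graph, which is where a
"bag" relation would be needed.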

We come to the interesting (and challenging) questions if we want to
manage multiple named graphs in a dataset where some graphs are not
"accepted" in the union default graph, but rather are "owned", as
quoted, by other named graphs in the dataset. Standards for that have
left us wanting for a long time. Only JSON-LD documents have this in
any practically usable form. Many graph stores support some such
advanced management (e.g. graph groups, virtual graphs or multiple
datasets), but not in a standardized way.
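
To make the "accepted" versus quoted distinction concrete, here is a
small Python sketch. No standard defines this behaviour; the quoted set
is a hypothetical marker for graphs that stay out of the union default
graph, the function name union_default_graph is made up, and the data
is a cut-down version of the CIDOC-CRM example above.

```python
def union_default_graph(default, named_graphs, quoted):
    """Merge the default graph with every named graph not marked as quoted."""
    union = set(default)
    for name, triples in named_graphs.items():
        if name not in quoted:
            union |= set(triples)
    return union


default = {
    ("ex:Ioannes_68", "rdf:type", "crm:E21_Person"),
}
named_graphs = {
    "<#assignment-1>": {("ex:Ioannes_68", "rdf:type", "ex:Gender_Eunuch")},
    "_:g2": {("ex:Ioannes_68", "rdf:type", "ex:Gender_Male")},
    "_:g3": {("ex:Ioannes_68", "rdf:type", "ex:Gender_Male")},
}
# Assumption: the two "--" prefixed assignments are quoted, not asserted.
quoted = {"_:g2", "_:g3"}

union = union_default_graph(default, named_graphs, quoted)
```

With these inputs the union holds the E21_Person and Gender_Eunuch
triples, while the Gender_Male triple stays only in its named graphs.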

Best regards,
Niklas





> Kind regards,
> Dörthe
>
>
> > Am 27.10.2023 um 12:06 schrieb Antoine Zimmermann <antoine.zimmermann@emse.fr>:
> >
> > Niklas,
> >
> >
> > A few comments about your slides (not necessarily about your proposal):
> >
> > First, the good things: RDF reification is indeed under-used, but it is used. Especially, it has been used in significant datasets like uniprot when the default syntax for RDF was RDF/XML. RDF/XML has syntactic sugar for reification, which makes it super easy to write. One reason people don't like reification is because it is verbose and cumbersome. But RDF lists are also verbose and cumbersome if written as triples. Yet, with the right syntax, good practices, and dedicated primitives in programming, they are well accepted and well supported. The same could be true with reification. So, yes, "quoted triples" as a way to simplify the use of standard RDF reification is an option that should be on the table. But the big problem is that the semantics is not constraining at all, and people may have completely different practices in the way they use reification. However, as opposed to named graphs, RDF reification has a normative semantics, although it is very weak.
> >
> >
> > Second, the criticism, in details:
> >
> > Slide 6 has the title "RDF 1.1 Concepts" and subtitle "on reification", but the text you put on the right is from RDF Schema. "Concepts" don't say anything about reification. Moreover, this text is in a section that is not normative. Formally, the semantics of reification does not imply that a reified triple is a token or anything. According to the standard, one could interpret a reified triple as the triple itself and it would not violate anything normative.
> >
> > Then in Slide 7, what is written is Pat Hayes's idea of a named graph. But as far as the standards are concerned (SPARQL 1.0, SPARQL 1.1, and RDF 1.1), named graphs are *only* pairs (n,g) and that's all. You may interpret this as a token of a graph with a name if you want, but again, this is not normative and there are other ways to interpret it.
> >
> > In Slide 24, it is written "A triple is identified with the singleton set containing it", and a subtitle says "RDF 1.1 Semantics". Clearly, an element and the singleton that contains it are never the same, but they may be identified in certain contexts. I do not understand to which context you refer here. The mention of RDF 1.1 Semantics is misleading because RDF 1.1 Semantics does not have this identification. In fact, quite the opposite: if they were identified, then:
> >
> > { <me> <wears> _:b . _:b a <Hat> }
> >
> > would be identified with
> >
> > {{<me> <wears> _:b}, {_:b a <Hat>}}
> >
> > But these two sets mean different things. The second one does not imply the first one. First one says "I wear a hat", while second one says "I wear something. There exists a hat."
> >
> > Slide 30: Again, "The <name, graph> pair is a token of its mathematical graph." is one way of interpreting the pair. Imagine I have a pair (iri, n), where iri is an IRI and n is a natural number. Would you interpret this as a token for the mathematical number n? For instance, instead, if iri is a DOI, n could be the number of times the document was printed.
> >
> > Also, "This token, which is denoted by this name" is your interpretation. "Denote" is formally defined in RDF 1.1 Semantics: https://www.w3.org/TR/rdf11-mt/#dfn-denote, so when you use this term in the context of RDF, it suggests that you talk about what RDF Semantics says. But RDF Semantics does not say that the graph name denotes anything in particular.
> >
> > Slide 34: I don't understand or I simply disagree with some of the statements: "...nested graphs? (...) Requires “graph literals” (...)" -> I don't see how this follows from that.
> >
> > "… graph terms? Same problem as for triple terms -
> > these are abstract mathematical objects denoting
> > themselves." -> graph terms are just a syntactic structure. This does not imply anything about what they denote or not.
> >
> >
> >
> > Additionally, there are parts where it is hard to understand what you mean. Your spoken words yesterday explained some things but sometimes even with your verbal presentation, I had trouble figuring out what your proposal(s) consist(s) in exactly.
> >
> >
> >
> > --AZ
> >
> > Le 26/10/2023 à 20:37, Niklas Lindström a écrit :
> >> Dear all,
> >> Here are the slides I presented at today's teleconference.
> >> Best regards,
> >> Niklas
> >> (PS. Escher's Dragon is pixelated to avoid copyright issues.)
> >
> > --
> > Antoine Zimmermann
> > École des Mines de Saint-Étienne
> > 158 cours Fauriel
> > CS 62362
> > 42023 Saint-Étienne Cedex 2
> > France
> > Tél:+33(0)4 77 49 97 02
> > http://www.emse.fr/~zimmermann/
> >
>

Received on Monday, 13 November 2023 14:56:47 UTC