Re: Some questions on RDF 1.1 Reification Semantics

Thanks a lot for the thorough explanation which really cleared up things for me (some further remarks inline below). 

With those questions out of the way I’m now able to get to my main issue, which is about the relation between the token and its reification. Sorry about the verbosity... Let’s take the example from the spec, the triple token
    ex:a ex:b ex:c .
and its reification 
    ex:graph1 rdf:type rdf:Statement .
    ex:graph1 rdf:subject ex:a .
    ex:graph1 rdf:predicate ex:b .
    ex:graph1 rdf:object ex:c .

I understand (and intuitively agree) that 
    ex:graph1
doesn’t entail the same consequences as the triple it reifies,
    ex:a ex:b ex:c .
since ex:graph1, although correctly and completely described above, in itself doesn’t actually state
    ex:a ex:b ex:c .
It just describes that statement, or even more precisely: such a statement.

I would now have hoped that the following graph was a semantically sound way to make an assertion about the provenance of the triple token of interest:
    ex:a ex:b ex:c .
    ex:graph1 rdf:type rdf:Statement .
    ex:graph1 rdf:subject ex:a .
    ex:graph1 rdf:predicate ex:b .
    ex:graph1 rdf:object ex:c .
    ex:graph1 ex:prov ex:rdf11mt .
as the triple token and its reification sit side by side in the same graph and so intuitively the denotation of ex:graph1 seems pretty definitive.

The spec however seems to say that this is not the case as e.g.
    ex:a ex:b ex:c .
    ex:graph1 rdf:type rdf:Statement .
    ex:graph1 rdf:subject ex:a .
    ex:graph1 rdf:predicate ex:b .
    ex:graph1 rdf:object ex:c .
    ex:graph1 ex:prov ex:rdf11mt .
    ex:graph2 rdf:type rdf:Statement .
    ex:graph2 rdf:subject ex:a .
    ex:graph2 rdf:predicate ex:b .
    ex:graph2 rdf:object ex:c .
wouldn’t entail
    ex:graph2 ex:prov ex:rdf11mt .
Or am I jumping to conclusions here?
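
To check my reading, here is a minimal sketch in Python (my own illustration, nothing from the spec). It assumes simple interpretations only: every IRI denotes itself and a property's extension holds exactly the pairs asserted in the premise graph. The RDF/RDFS semantic conditions are not modelled, and the helper names extensions_of and satisfies are made up for this mail. If the premise is true in such an interpretation while the conclusion is false, the premise cannot entail the conclusion.

```python
# Ground triples only; prefixed names are treated as opaque strings.
REIFICATION = [
    ("rdf:type", "rdf:Statement"),
    ("rdf:subject", "ex:a"),
    ("rdf:predicate", "ex:b"),
    ("rdf:object", "ex:c"),
]

# The premise graph: the triple, two reifications of it, one provenance triple.
premise = {("ex:a", "ex:b", "ex:c"), ("ex:graph1", "ex:prov", "ex:rdf11mt")}
for node in ("ex:graph1", "ex:graph2"):
    premise |= {(node, p, o) for p, o in REIFICATION}

conclusion = ("ex:graph2", "ex:prov", "ex:rdf11mt")


def extensions_of(triples):
    """Build the interpretation in which every IRI denotes itself and each
    property's extension contains exactly the asserted (subject, object) pairs."""
    extensions = {}
    for s, p, o in triples:
        extensions.setdefault(p, set()).add((s, o))
    return extensions


def satisfies(extensions, triples):
    """A ground graph is true iff every triple's pair is in its property's extension."""
    return all((s, o) in extensions.get(p, set()) for s, p, o in triples)


counter_model = extensions_of(premise)
premise_holds = satisfies(counter_model, premise)
conclusion_holds = satisfies(counter_model, {conclusion})
print("premise true:", premise_holds)
print("conclusion true:", conclusion_holds)
```

In this interpretation ex:graph1 and ex:graph2 denote different things that merely share the same subject, predicate and object, so the ex:prov pair recorded for one is not present for the other.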

To rephrase the question: the spec says "The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object.", adding some use cases like provenance of triples etc. However it also says: "suppose that IRI ex:graph1 is used to identify this graph. Exactly how this identification is achieved is external to the RDF model". That seems to leave every use of reification semantically unspecified, even very clear cases like "ex:graph1 ex:prov ex:rdf11mt ." above?!
If — as I fear — every use of reification is semantically unspecified then why is the spec so reluctant, leaving the very use case of reification in a semantic limbo? After all in the case of triple tokens we’re not faced with the problem of graph labeling/naming that plagued the work on named graphs.


> On 13. Jul 2018, at 23:42, Pat Hayes <phayes@ihmc.us> wrote:
> 
> On 7/13/18 12:49 PM, thomas lörtsch wrote:
>> I’m trying to understand what the RDF 1.1 Semantics Recommendation says about reification (*) but I’m having particular difficulties keeping up with the different kinds of triples it describes.
> 
> I will do my best to explain, but I should perhaps say up front that very few uses of reification have paid close attention to what the specs say about it. So this is more about what the WG intended, than about any actual reality.
>> At one point quite early in Appendix D.1 the Recommendation says:
>> "Reification is not a form of quotation. Rather, the reification describes the relationship between a token of a triple and the resources that the triple refers to."
>> I’m not a native speaker so some subtleties are lost on me.
> 
> Rest assured that your grasp of the subtleties is better than that of most native speakers.
> 
>> My best guess is that "token" here is meant as in type-token-distinction as at a later point the spec refers to "a particular instance or token of a triple".
>> 
> Correct.
> 
>> However if the spec refers to token as in type-token then why is the reification not describing the type but the token?
> 
> Because the IRI which identifies the reified triple has to be interpreted in this way, in most of the (actual and potential) uses of reification that were being contemplated when the spec was being written. For example, the referent of this IRI was intended to be something that could be stored in a file and transmitted from place to place, was asserted by someone, had a provenance, etc. In other words, it must be some piece of an actual concrete ('surface' or 'interchange') syntax, such as RDF-XML or TURTLE or N-triples.

Okay. I think what threw me off most was when the spec says it can’t define something but then speaks about it all the same. The spec says the identification of a token by a reification is not defined (so, in my words, the relation is brittle at best). My attempts to interpret the text then assumed that it wouldn’t speak about that use case any further. But it does. Because it’s an important use case etc. 

>> Or would the spec, because it is (I guess) referring to unstated triples here,
> 
> The intention was to be neutral as to their statedness or otherwise.

Aha! And I operated the whole time under the impression that honouring this distinction was the reason for all those contortions.

>> rather speak about (non-existing) instances than about their type? And if some triple with a specific subject/predicate/object is foremost a type how does that fit with the set semantics that there can be only one instance of that type?
> 
> ? The semantics isn't relevant here. This is really purely an issue in syntax. (Or are you referring to the 'abstract' syntax, in which an RDF graph is a set? 

yes

> If so, see below.)

> There can of course be several instances of a triple (in different graphs, in any concrete syntax for RDF).
> 
>> In my intuition it doesn’t. The dictionary also offers "symbol" and "representation" which can mean type or instance, so that doesn’t help either.
>> Shortly thereafter:
>> "Reifications can be written with a blank node as subject, or with an IRI subject which does not identify any concrete realization of a triple, in both of which cases they simply assert the existence of the described triple."
>> What is a "concrete realization"?
> 
> Yes, that is awkward. I meant simply a token, in some surface syntax.
> 
>> The next sentence more specifically speaks of "a concrete realization of an RDF triple, such as a document in a surface syntax" but does that exclude triples in databases, or only unstated triples?
> 
> No, it does not exclude them. They are after all represented in some syntactic form.

Just to disclose my intention: I’m pounding on this issue so much because my real interest is not provenance of some graph "realization" but meta modelling, attributed graphs and the like. That does of course happen mostly within some database and isn't overly concerned with serializations to documents. But that will be the topic of another mail.

>> In the next sentence:
>> "The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object."
>> What is a triple as an "abstract object": a triple that merely exists (in the sense that it has never actually been stated)? Or a triple that sits in a database as bits and bytes but not "concretely realized"? Or both? Or anything but a triple that has been "concretely realized"?
> 
> OK, I will confess that there is some conceptual confusion in the very heart of the RDF spec, and you have located it. The basic issue is the "abstract syntax" in which an RDF graph is defined to be a SET of triples. We took this path (which is highly unusual when defining languages, as you may know) 
 
Lamentably I don’t know much more than what I learned through reading the specs and going through the RDF 1.1 WG archives. I’m taking reading recommendations.

> in an attempt to have a cake and eat it. We wanted to describe RDF 'abstractly' so that there could be many different surface forms, including (when this was written, in 2004 – this text is copied directly from the original RDF 1.0 specification) surface forms that had not yet been invented, but we wanted to keep the specification as simple as we could, and in particular wanted to avoid an elaborate algebraic terminology of distinguishing 'abstract syntax' from 'surface syntax' and having to define a category-theoretic apparatus of mappings between them. For most of the development this has been reasonably effective, although it did cause us some grief regarding how to handle blank nodes; but the fact is, it is something of a conceptual muddle. What we SHOULD have done is something like what is described in my invited lecture here
> https://www.slideshare.net/PatHayes/rdf-redux
> but this only occurred to me later, when it was too late to adjust the spec to conform to it. (The 2014 WG that created RDF 1.1 was prohibited by its very charter – and over my strenuous objections – from making such far-reaching changes to the underlying RDF structure.)

Yes, I read your mails about the process in the WG archives, and at the time I experienced first hand those strong worries that making more than the most modest changes to RDF would derail uptake of RDF and the LOD effort. I hope that with the advent of Knowledge Graphs, Property Graphs, Attributed Graphs etc., the increasing demand for and use of meta modelling techniques will bring another chance to standardize this properly in RDF.

> So, to return to actual RDF reification: the intention is that the subject node of an RDF reification always refers to a token – a piece of concrete syntax in some surface form of RDF, whether actually physically realized or not – and not to a set-theoretic or mathematical abstraction.
> 
>> To sum up, there seem to exist:
>> - abstract triples
>> - concretely realized triples
>> - simply existing triples
>> - reified/described (but not quoted) tokens (of triples)
> 
> I would make a simpler distinction:
> 
> 1. abstract triples
> 2. tokens (of triples) in some surface syntax for RDF, for example a line of three IRIs followed by a dot, in N-Triples. I would make this category as inclusive as possible, including such things as rows of a table if that table is understood to encode RDF.
> 
> The second category can, if one wishes, be further subdivided into tokens which are physically instantiated at some time, versus those which are not: but a similar distinction can be made within any class of tokens, so this is nothing special to RDF.
> 
> The RDF spec is almost entirely concerned with 1, and deliberately avoids the topic of 2, but reification has to be understood to refer to 2, hence the awkwardness you have noticed.
> 
>> Which of them has actually been asserted somewhere, somehow. Does it make a difference how "concretely" it has been "realized" - in a database, serialized to a turtle document, etc?
> 
> No.
> 
>> Why does reification refer to a token, not the type?
> 
> Because nobody felt any need to be able to talk about triple types, but there was a strongly felt need to be able to talk about triple tokens. And we had to make a call one way or the other, as it would have been totally confusing to have left this open.

Okay, and I always assumed that _this_ can’t be the answer to my question as the connection between token and reification is so brittle (leading to my questions above on that topic).

>> What is that token exactly?
>> 
>> It might be more useful if the spec differentiated just between things that can be asserted and things that actually have been asserted. 
> 
> That might be interesting, but it is orthogonal to the distinction we were trying to draw. 

>> Then some triple with a specific subject/predicate/object exists only once as something assertable, but it can be asserted many times (and each time may have an identifier, provenance etc).
> 
> Yes, exactly. If we presume that assertion must involve an actual piece of surface syntax being asserted, then this is exactly the distinction between abstract and concrete that we were trying to explicate.

Cool!
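
Just to pin this down for myself, a small Python sketch of the distinction, under my own assumptions: the class names Triple and TripleToken and the source labels "doc1.ttl" and "doc2.nt" are hypothetical and appear in no spec. An abstract triple is a value, so a set holds it only once, while each token is a separate object that can carry its own identity and provenance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """Abstract triple: compared by value, so a set contains it at most once."""
    subject: str
    predicate: str
    obj: str


@dataclass(eq=False)
class TripleToken:
    """Token of a triple: compared by identity, one object per occurrence."""
    triple: Triple
    source: str  # hypothetical label for where this occurrence lives


# Abstract syntax: a graph is a set, so stating the triple twice changes nothing.
abstract_graph = {Triple("ex:a", "ex:b", "ex:c"), Triple("ex:a", "ex:b", "ex:c")}

# Surface syntax: the same triple occurring in two places yields two tokens.
tokens = [
    TripleToken(Triple("ex:a", "ex:b", "ex:c"), "doc1.ttl"),
    TripleToken(Triple("ex:a", "ex:b", "ex:c"), "doc2.nt"),
]

print(len(abstract_graph))                   # one assertable triple
print(tokens[0] == tokens[1])                # the two tokens are distinct
print(tokens[0].triple == tokens[1].triple)  # but they are tokens of the same triple
```

On this picture a reification node would be expected to pick out one of the TripleToken objects, not the single Triple value.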

>> But back to one last question:
>> "[…] asserting a triple does not automatically imply that any triple tokens exist in the universe being described by the triple. For example, the triple might be part of an ontology describing animals, which could be satisfied by an interpretation in which the universe contained only animals, and in which a reification of it was therefore false."
>> That doesn’t look like anything to me…
>> Does this suggest that a triple token is the real-world realization of whatever the triple is referring to? I hope not, but I’m lost here anyway.
> 
> Perhaps this example wasn't very helpful.
> 
> Let us agree that a reification of a triple, when asserted, says that a token of that triple exists, but it does not assert that the triple being described is true: it does not assert the triple that it describes. What this paragraph is trying to say is that the dual is also the case: that if you were to assert a triple, that does not in itself also assert that a potential reification of that triple is true. This might seem counter-intuitive: after all, the asserted triple does exist, or you couldn't have asserted it. But the point is that the RDF graph containing the triple might be describing an ontologically limited 'world' which need not itself contain triples as entities. Put another way: the universe described by an RDF graph is not required, by the RDF specification, to contain the triples of that graph itself.

I see - luckily that seems to be quite a corner case. 
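
For the record, this is how I would sketch the animals example in Python. It is only an illustration under loud assumptions: a simple interpretation with an explicit universe, made-up resource names ("tom", "jerry", "chases"), a hypothetical helper holds, and none of the RDF semantic conditions on the rdf: vocabulary modelled. The blank node in the reification is read existentially, so the reification is true only if some member of the universe has all four described properties.

```python
from itertools import product


def holds(universe, denotes, extensions, triples):
    """True if some assignment of universe members to the blank nodes
    (terms starting with "_:") makes every triple of the graph true."""
    bnodes = sorted({t for s, _, o in triples for t in (s, o) if t.startswith("_:")})
    for values in product(sorted(universe), repeat=len(bnodes)):
        assignment = dict(zip(bnodes, values))

        def value(term):
            # Blank nodes take the assigned member, IRIs their fixed denotation.
            return assignment[term] if term in assignment else denotes[term]

        if all((value(s), value(o)) in extensions.get(denotes[p], set())
               for s, p, o in triples):
            return True
    return False


# A world that contains only animals (all names here are hypothetical).
universe = {"tom", "jerry"}
denotes = {
    "ex:a": "tom", "ex:b": "chases", "ex:c": "jerry",
    "rdf:type": "type", "rdf:subject": "subject",
    "rdf:predicate": "predicate", "rdf:object": "object",
    "rdf:Statement": "Statement",
}
extensions = {"chases": {("tom", "jerry")}}  # nothing is said about statements

triple = {("ex:a", "ex:b", "ex:c")}
reification = {
    ("_:t", "rdf:type", "rdf:Statement"),
    ("_:t", "rdf:subject", "ex:a"),
    ("_:t", "rdf:predicate", "ex:b"),
    ("_:t", "rdf:object", "ex:c"),
}

triple_true = holds(universe, denotes, extensions, triple)
reification_true = holds(universe, denotes, extensions, reification)
print(triple_true, reification_true)
```

The asserted triple comes out true while its reification comes out false, because neither animal in the universe is a statement with that subject, predicate and object.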

Thomas



> I hope that makes sense; but if it doesn't, I don't think much will turn on whether one follows it or not :-)
> 
> Best wishes
> 
> Pat Hayes
> 
> 
>> Best,
>> Thomas Lörtsch
>> (*) not because I want to use it but because I want to precisely understand its semantics (or lack thereof)
> 
> -- 
> -----------------------------------
> call or text to 850 291 0667
> www.ihmc.us/groups/phayes/
> www.facebook.com/the.pat.hayes
> 
> 

Received on Sunday, 15 July 2018 21:55:28 UTC