Re: Some questions on RDF 1.1 Reification Semantics from Pat Hayes on 2018-07-13 (semantic-web@w3.org from July 2018)

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 13 Jul 2018 14:42:37 -0700
To: thomas lörtsch <tl@rat.io>, semantic-web@w3.org
Message-ID: <72f707ef-9594-42ed-35ec-d86234812d2f@ihmc.us>
On 7/13/18 12:49 PM, thomas lörtsch wrote:
> I’m trying to understand what the RDF 1.1 Semantics Recommendation says about reification (*) but I’m having particular difficulties keeping up with the different kinds of triples it describes.

I will do my best to explain, but I should perhaps say up front 
that very few uses of reification have paid close attention to 
what the specs say about it. So this is more about what the WG 
intended, than about any actual reality.
> 
> At one point quite early in Appendix D.1 the Recommendation says:
> "Reification is not a form of quotation. Rather, the reification describes the relationship between a token of a triple and the resources that the triple refers to."
> I’m not a native speaker so some subtleties are lost on me.

Rest assured that your grasp of the subtleties is better than 
that of most native speakers.

> My best guess is that "token" here is meant as in the type-token distinction, as at a later point the spec refers to "a particular instance or token of a triple".

Correct.

> However, if the spec refers to token as in type-token then why is the reification not describing the type but the token?

Because the IRI which identifies the reified triple has to be 
interpreted in this way, in most of the (actual and potential) 
uses of reification that were being contemplated when the spec 
was being written. For example, the referent of this IRI was 
intended to be something that could be stored in a file and 
transmitted from place to place, was asserted by someone, had a 
provenance, etc. In other words, it must be some piece of an 
actual concrete ('surface' or 'interchange') syntax, such as 
RDF/XML or Turtle or N-Triples.

> Or would the spec, because it is (I guess) referring to unstated triples here,

The intention was to be neutral as to their statedness or otherwise.

> rather speak about (non-existing) instances than about their type? And if some triple with a specific subject/predicate/object is foremost a type, how does that fit with the set semantics that there can be only one instance of that type?

? The semantics isn't relevant here. This is really purely an 
issue in syntax. (Or are you referring to the 'abstract' syntax, 
in which an RDF graph is a set? If so, see below.) There can of 
course be several instances of a triple (in different graphs, in 
any concrete syntax for RDF).

> In my intuition it doesn’t. The dictionary also offers "symbol" and "representation" which can mean type or instance, so that doesn’t help either.
> 
> Shortly thereafter:
> "Reifications can be written with a blank node as subject, or with an IRI subject which does not identify any concrete realization of a triple, in both of which cases they simply assert the existence of the described triple."
> What is a "concrete realization"?

Yes, that is awkward. I meant simply a token, in some surface 
syntax.

> The next sentence more specifically speaks of "a concrete realization of an RDF triple, such as a document in a surface syntax" but does that exclude triples in databases, or only unstated triples?

No, it does not exclude them. They are after all represented in 
some syntactic form.

> 
> In the next sentence:
> "The subject of a reification is intended to refer to a concrete realization of an RDF triple, such as a document in a surface syntax, rather than a triple considered as an abstract object."
> What is a triple as an "abstract object": a triple that merely exists (in the sense that it has never actually been stated)? Or a triple that sits in a database as bits and bytes but not "concretely realized"? Or both? Or anything but a triple that has been "concretely realized"?

OK, I will confess that there is some conceptual confusion in the 
very heart of the RDF spec, and you have located it. The basic 
issue is the "abstract syntax" in which an RDF graph is defined 
to be a SET of triples. We took this path (which is highly 
unusual when defining languages, as you may know) in an attempt 
to have a cake and eat it. We wanted to describe RDF 'abstractly' 
so that there could be many different surface forms, including 
(when this was written, in 2004 – this text is copied directly 
from the original RDF 1.0 specification) surface forms that had 
not yet been invented, but we wanted to keep the specification as 
simple as we could, and in particular wanted to avoid an 
elaborate algebraic terminology of distinguishing 'abstract 
syntax' from 'surface syntax' and having to define a 
category-theoretic apparatus of mappings between them. For most 
of the development this has been reasonably effective, although 
it did cause us some grief regarding how to handle blank nodes; 
but the fact is, it is something of a conceptual muddle. What we 
SHOULD have done is something like what is described in my 
invited lecture here
https://www.slideshare.net/PatHayes/rdf-redux
but this only occurred to me later, when it was too late to 
adjust the spec to conform to it. (The 2014 WG that created RDF 
1.1 was prohibited by its very charter – and over my strenuous 
objections – from making such far-reaching changes to the 
underlying RDF structure.)

So, to return to actual RDF reification: the intention is that 
the subject node of an RDF reification always refers to a token – 
a piece of concrete syntax in some surface form of RDF, whether 
actually physically realized or not – and not to a set-theoretic 
or mathematical abstraction.
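
As a rough illustration of the shape of a reification, the following Python sketch builds the four describing triples from the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The function name `reify`, the example IRIs and the blank node label `_:t1` are assumptions made up for this example; triples are modelled simply as tuples of strings.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(token_id, triple):
    """Return the four triples that describe `triple`, with `token_id`
    (an IRI or blank node label standing for a token) as their subject."""
    s, p, o = triple
    return {
        (token_id, RDF + "type", RDF + "Statement"),
        (token_id, RDF + "subject", s),
        (token_id, RDF + "predicate", p),
        (token_id, RDF + "object", o),
    }

# Hypothetical example data, not taken from any specification.
triple = (
    "http://example.org/a",
    "http://example.org/likes",
    "http://example.org/b",
)
description = reify("_:t1", triple)

for described in sorted(description):
    print(described)
```

Note that `description` talks about the triple without containing it: the described triple is not one of the four triples, which matches the point that a reification is a description and not an assertion of the triple.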

> 
> To sum up, there seem to exist:
> - abstract triples
> - concretely realized triples
> - simply existing triples
> - reified/described (but not quoted) tokens (of triples)

I would make a simpler distinction:

1. abstract triples
2. tokens (of triples) in some surface syntax for RDF, for 
example a line of three IRIs followed by a dot, in N-Triples. I 
would make this category as inclusive as possible, including such 
things as rows of a table if that table is understood to encode RDF.

The second category can, if one wishes, be further subdivided 
into tokens which are physically instantiated at some time, 
versus those which are not: but a similar distinction can be made 
within any class of tokens, so this is nothing special to RDF.

The RDF spec is almost entirely concerned with 1, and 
deliberately avoids the topic of 2, but reification has to be 
understood to refer to 2, hence the awkwardness you have noticed.
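
The difference between 1 and 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `TripleToken` class, the `tokens_in` helper and the document names are hypothetical, and the "parser" handles only a toy N-Triples-like format (three IRIs separated by spaces, then a dot).

```python
from typing import NamedTuple

class TripleToken(NamedTuple):
    """One occurrence of a triple in some concrete document."""
    document: str   # where the token occurs
    line: int       # position within that document
    triple: tuple   # the abstract (subject, predicate, object) it is a token of

def tokens_in(document_name, text):
    """Collect one token per non-empty line of the toy format."""
    tokens = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        s, p, o = line.rstrip(".").split()
        tokens.append(TripleToken(document_name, number, (s, p, o)))
    return tokens

line = "<http://example.org/a> <http://example.org/likes> <http://example.org/b> ."
doc_a = line + "\n"
doc_b = line + "\n" + line + "\n"

tokens = tokens_in("a.nt", doc_a) + tokens_in("b.nt", doc_b)

# An abstract RDF graph is a set, so the three tokens collapse to one triple.
abstract_graph = {token.triple for token in tokens}

print(len(tokens), "tokens,", len(abstract_graph), "abstract triple")
```

Each token keeps its own document and position, so it can carry its own identifier or provenance, while the set of abstract triples cannot tell the three occurrences apart.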

> Which of them has actually been asserted somewhere, somehow? Does it make a difference how "concretely" it has been "realized" - in a database, serialized to a Turtle document, etc.?

No.

> Why does reification refer to a token, not the type?

Because nobody felt any need to be able to talk about triple 
types, but there was a strongly felt need to be able to talk 
about triple tokens. And we had to make a call one way or the 
other, as it would have been totally confusing to have left this 
open.

> What is that token exactl
> 
> It might be more useful if the spec differentiated just between things that can be asserted and things that actually have been asserted.

That might be interesting, but it is orthogonal to the 
distinction we were trying to draw.

> Then some triple with a specific subject/predicate/object exists only once as something assertable, but it can be asserted many times (and each time may have an identifier, provenance etc).

Yes, exactly. If we presume that assertion must involve an actual 
piece of surface syntax being asserted, then this is exactly the 
distinction between abstract and concrete that we were trying to 
explicate.

> 
> But back to one last question:
> "[…] asserting a triple does not automatically imply that any triple tokens exist in the universe being described by the triple. For example, the triple might be part of an ontology describing animals, which could be satisfied by an interpretation in which the universe contained only animals, and in which a reification of it was therefore false."
> That doesn’t look like anything to me…
> Does this suggest that a triple token is the real world realization of whatever the triple is referring to? I hope not, but I’m lost here anyway.

Perhaps this example wasn't very helpful.

Let us agree that a reification of a triple, when asserted, says 
that a token of that triple exists, but it does not assert that 
the triple being described is true: it does not assert the triple 
that it describes. What this paragraph is trying to say is that 
the dual is also the case: that if you were to assert a triple, 
that does not in itself also assert that a potential reification 
of that triple is true. This might seem counter-intuitive: after 
all, the asserted triple does exist, or you couldn't have 
asserted it. But the point is that the RDF graph containing the 
triple might be describing an ontologically limited 'world' which 
need not itself contain triples as entities. Put another way: the 
universe described by an RDF graph is not required, by the RDF 
specification, to contain the triples of that graph itself.
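
The two non-entailments can be checked mechanically in a small Python sketch. It assumes ground graphs only (no blank nodes), where simple entailment reduces to the subset test, and it checks only simple entailment; the function name `simply_entails_ground` and all example IRIs are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def simply_entails_ground(graph, other):
    """For ground graphs, `graph` simply entails `other` iff other is a subset."""
    return other <= graph

# A one-triple "ontology about animals" (made-up example data).
triple = ("http://example.org/fido", RDF + "type", "http://example.org/Dog")

# Hypothetical IRI standing for some token of `triple`.
token = "http://example.org/tokens/1"

ontology = {triple}
reification = {
    (token, RDF + "type", RDF + "Statement"),
    (token, RDF + "subject", triple[0]),
    (token, RDF + "predicate", triple[1]),
    (token, RDF + "object", triple[2]),
}

# Asserting the triple does not assert that a token of it exists ...
asserting_entails_reification = simply_entails_ground(ontology, reification)
# ... and describing a token does not assert the described triple.
reification_entails_triple = simply_entails_ground(reification, ontology)

print(asserting_entails_reification, reification_entails_triple)
```

Both checks come out false, which is the pair of claims made in the paragraph above: neither graph commits its author to the other.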

I hope that makes sense; but if it doesn't, I don't think much 
will turn on whether one follows it or not :-)

Best wishes

Pat Hayes


> 
> 
> Best,
> Thomas Lörtsch
> 
> 
> (*) not because I want to use it but because I want to precisely understand its semantics (or lack thereof)

Received on Friday, 13 July 2018 21:43:23 UTC