RE: Why reification doesn't make sense (was: Re: PRIMER: PrimerReificationSection) from Pat Hayes on 2001-10-24 (w3c-rdfcore-wg@w3.org from October 2001)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Wed, 24 Oct 2001 12:11:14 -0500
To: "dehora" <dehora@eircom.net>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p0510105fb7fc98ecd710@[205.160.76.193]>
>Hi Pat,
>
>Pat :
>>>
>The example it gives introduces a 'statement'
>
>  &lt;rdfprimer&gt;   &lt;editor&gt;   "eric".
>
>That looks like a triple to me. (My criterion is the trailing dot. :-)
>>>
>
><rdfprimer>   <editor>   "eric".
>
>Yes. It's a triple.
>
>
>Pat :
>>>
>What exactly IS a 'statement' ? Is it a particular  syntactic token, or
>a data structure, or something more like an expression? For example, if
>two different documents contain identical triples, how many statements
>is that? Two (that are isomorphic in some sense) , or one (that has two
>tokens) ?
>>>
>
>From the M&S:
>
>[[[
>A specific resource together with a named property plus the value of
>that property for that resource is an RDF statement. These three
>individual parts of a statement are called, respectively, the subject,
>the predicate, and the object.
>]]]

Look, there is no use in quoting the M&S at me when Im asking a 
question about what the M&S is supposed to mean, and in an area where 
I and about a thousand other people all over the planet know the M&S 
to be hopelessly confused. The M&S (in fact the entire W3C) does not 
define what it means by 'resource', so the quote you give is useless 
as a piece of explanatory prose if only because of the first three 
words in it. What I wanted to know is what YOU meant.

In particular, I was asking for clarification on whether 'statement' 
was to be understood in the token or type senses. Most language-words 
('word', 'sentence', 'verb', etc.) can be understood in either sense, 
and it often doesn't matter, or can be easily understood from the 
context. But in this case it matters a lot, so I wanted some 
clarification. The M&S provides no clarification on this question, 
and in the above cited passage seems to make nonsensical category 
errors; the predicate of a statement cannot BE a property, for 
example; at best, it can denote or mean a property. This passage 
seems to confuse elephant with 'elephant'.(Unless, possibly, it is 
using 'property' in some nonstandard sense. As with much of the 
existing M&S prose, it is genuinely impossible to know what the 
authors intended. If one takes them at face value they seem to be 
saying such outrageously silly things that one is forced into 
guessing what they intended, and trying to rationally reconstruct the 
ways they are misusing terminology.)

>There is also a set called 'Statements'. The M&S was never so clear on
>whether the set of statements contained abstract things (like integers)
>or the set of sets of inscriptions of statements, or a set of all the
>statement inscriptions in the world, or... A few people in rdf-ig,
>myself included,  agreed to agree that there was a set called
>statements, that this set contained abstract things, and that anything
>that might be inscribed in a text or a hard disk was a representation of
>a statement (probably what we call a triple here), if only because that
>let us each have a representation of the same statement in our inboxes.

Ah. OK, that is  the 'type' reading, rather than the 'token' reading. 
The statement is an abstraction, and any particular inscription (eg 
in a file or document) is a representation of (I'd prefer to say 
token of) it.

>Possibly this is where the term 'triple' comes from.
>
>I guess calling "<rdfprimer>   <editor>   "eric"." a statement is
>consistent with the M&S and a fair amount of article level literature;
>I've got no problem calling it a triple instead of a statement if that's
>more accurate.

Well, by 'triple' I mean a token, not a  type. I can store tokens in 
files and send them along wires, but types live in a Platonic heaven 
of nonphysical abstractions. (If I can give something a URL, then I 
know that whatever you call it, its a token, not a type.)

>
>
>Pat :
>>>
>You go on to explain how to say something about that statement:
>
>
>How would we go about doing this?  Well we know that the subject of a
>RDF
>statement is a resource denoted by a URI or an anonymous resource, and
>we know
>that these can also be the object of an RDF statement. So the thing to
>do would
>be associate the statement with a URI or an anonymous resource and use
>that in
>place of the statement directly. Here we'll use an anonymous resource,
>called
>'stmt'.
>
>
>Wait a minute; a miracle occurred in there somewhere. We have a triple
>(above) and we have a bNode which is 'associated' with it. How?? What is
>the nature of this 'association', and where and how is it recorded in
>RDF?
>>>
>
>Well, more like three card trick. I'm not sure what you're looking for
>Pat, but if you're asking for 'reification logically follows from' then
>you won't find it in the M&S. You ask how. There isn't any operational
>specification of how any more than there is a specification for writing
>down triples.

That's not good enough. You talk about an association. Where, and 
how, is that association encoded inside RDF, or under the RDF hood, 
or whatever? In what does it reside, operationally? This process has 
to be *implementable*; we aren't just talking semiotics here.

>You want to a statement denoted (and by denotation make it
>a resource available to RDF), you add four triples and you add them
>atomically.

OK, you generate the triples. What have they got to do with the first triple?

>
>Having re-learned recently that one can't get soundly from symbols to
>things in the world, then I would say that the miracle which is
>occurring is no different from the miracle of denotation that allows us
>to associate URIs with things.

We *don't* associate URIs with things (unless they are URLs, of 
course, or have some other kind of computable connection to their 
referents). If you think you do, you are living in a delusion. 
Delusions are fine as long as they are just a kind of social fiction, 
but reification is supposed to be something that a machine can do.

>The odd bit about reification is that we
>start with a triple and go from generating a symbol for that triple to
>generating a resource for that symbol, as if we magicked a resource to
>justify the existence of an anonymous resource symbol. Unless we say
>that the presence of a URI or an anonymous resource is an index of a
>resource and that it's always reasonable in RDF to abduct the existence
>of resource when we come across these indices.

The EXISTENCE, sure. I have no problem with that. But consider: I 
have a triple, as in your example. Then I reify it and I get four 
more triples. What connection or relationship do those four new 
triples have to my first triple? As far as I can see, the answer is 
NONE. So there is no 'association' between them. So don't pretend 
that there is, when writing the primer; as that constitutes lying to 
the customers.

>
>    <rdfprimer>   <editor>       "eric".
>    <w3.org>      <published>    _:stmt.
>    _:stmt        <gathered>     <w3.org>. 
>    _:stmt        <type>         <Statement>.
>    _:stmt        <subject>      <rdfprimer>.
>    _:stmt        <predicate>    <editor>.
>    _:stmt        <object>       "eric".
>
>
>It's true to say that nothing binds the triple with its denotation.

Ah, good. Then in what sense is the reification OF a triple actually 
'of' that particular triple as opposed to any other triple ( or 
indeed anything under the sun?) Reification is supposed to be used to 
say that triples are in files, are associated with one another, are 
in particular orderings, etc. They are supposed to be USED to REFER 
TO triples. How do they do that little semantic miracle?

>But
>that's true of any resource and its denotation in RDF. Maybe I shouldn't
>say 'associate'.

Resources don't denote in RDF, urirefs do. And a particular uriref 
doesn't have a unique denotation; it only has a denotation in a 
particular interpretation. There is no provision in RDF for naming, 
only for description. I'm sure that we all wish it were otherwise, 
and that many users of RDF will pretend that it is otherwise and may 
even get away with it for a while or in a limited context, but they 
are taking a risk by doing so. In general, in an open-ended Web 
environment, there is no way to justify any claim that any uriref has 
a unique denotation (other than by the use of something like a 
file-addressing protocol like http: or ftp:.) IF RDF had unversal 
quantifiers and equality then we could say that description named a 
single entity, but we don't have those luxuries.

>
>
>Pat :
>>>
>and we get the example of the reification
>
>
>    _:stmt    <type>       <Statement>.
>    _:stmt    <subject>    <rdfprimer>.
>    _:stmt    <predicate>  <editor>
>    _:stmt    <object>     "eric"  
>
>of the triple given first. Now, that is a *description* of a 'generic'
>such triple, right? It says that a thing exists which is a Statement and
>has these three property values. (Incidentally, it says that the subject
>is rdf:primer, not "rdf:primer", so it actually doesn't make sense; but
>let's leave that for now.)
>>>
>
>Cautious yes, not sure what 'generic' means.

I mean, anything that satisfies the description would count as 
something that satisfies the description. Like most descriptions, it 
doesn't uniquely pick out something in the world.

>(I don't understand the
>aside. Why would I need to put quotes around rdfprimer?

Because you aren't saying that the actual primer is the subject, 
right? The subject of the triple is the uriref, not what it denotes.

>). Like I said,
>what Statement is a set of is somewhat vague in my mind.
>
>
>Pat :
>>>
>But that isn't enough to pin down any particular statement with this
>form (the one in this particular document, say, or this particular
>file). Obviously, another statement with exactly the same form might be
>at this very moment being transmitted from somewhere in Iraq to
>somewhere in Turkey, and this description could be referring to that one
>instead of the one in my document. So, to repeat, HOW does RDF
>'associate' this description, or possibly the _:stmt name used in it
>(which, by the way, isn't even part of RDF syntax, so hardly seems
>suitable for the rather central role it ought to have), with the *actual
>statement* that it is supposed to be a reification of? Is there some
>kind of secret labelling process going on under the hood, which somehow
>attaches this anonymous node to the statement, thereby making it into a
>reified statement?
>>>
>
>No secret labelling. Any URI or anon resource that has a type property
>with the value Statement, denotes a reified statement.
>
>Just so I understand you: the reified statement (anon resource) denotes
>the abstract thing, what might be called the 'statement', not the triple
>that's been written down?

IN the above, I was assuming that 'statement' meant 'token' OK, now I 
follow your usage (I think), that was wrong.

>
>(And: why isn't '_:stmt' part of RDF syntax? It's an anonymous
>resource.)

No, its not. (That phrase 'anonymous resource' doesn't make sense in any case.
How can a resource BE anonymous? It might have a name I know nothing 
about; or it might get given one in the future by some alien being in 
a distant galaxy. ) The '_:' notation is local to the Ntriples 
syntax; it isn't part of RDF. There are no such labels in an RDF 
graph. The blank nodes do not have labels.

>Pat :
>>>
>OR have I got the wrong idea about statements altogether? If the answer
>to my original question is that two identical triples in two different
>documents are really only one statement, so that a statement is a class
>of tokens all with the same syntactic form, then the reification of a
>statement in four triples does indeed uniquely specify a single
>statement. But with THIS sense of 'statement', it doesn't make sense to
>say things like 'eric is the author of' a statement, since statements in
>this sense don't have authors (or perhaps, can have infinitely many
>authors), and similarly it doesn't make sense to say that document
>contains a statement (as opposed to a token of the statement, which does
>make sense but cannot be said in RDF).
>>>
>
>Well, the examples aren't saying that eric is the author of a statement;
>what's being said is that a statement was gathered from a resource and
>that the same statement was published by the resource.

That only makes sense for the 'token' sense of statement, the one 
that you seem to have rejected. How can one gather or publish a 
Platonic abstraction? That would be like claiming to publish 17 ( the 
integer, not the numeral.)

>That aside,
>suppose there _is_ a statement; here's its representation as a triple:
>
>   <rdfprimer>   <editor>   "eric".
>
>here is the reification according to the M&S:
>
>    _:stmt    <type>       <Statement>.
>    _:stmt    <subject>    <rdfprimer>.
>    _:stmt    <predicate>  <editor>
>    _:stmt    <object>     "eric"  
>
>Are you saying that a reification in RDF, is actually a reification of
>the statement and not of the triple, and what was intended is the
>reification of the triple?

I have, literally and with the best will in the world, no idea what 
you mean by 'reification'. It simply does not seem to make sense as 
described. Until you tell me whether you are using words like 
'triple', 'statement' and so on in type or token sense, there is no 
way to tell what you are talking about.

The best, perhaps the only, way I can make sense of it is this. A 
reification (the four triples above, eg) is a description in RDF of 
the grammatical form of an RDF triple. Such a reification 
(description) is true of an RDF triple-token T just in case T has the 
form described by the reification.  That makes sense; but it does not 
provide any way to associate a reification with any *particular* 
token that it can be said to be the reification *of*; so it cannot be 
used to, for example, assert any properties of any particular token, 
to count tokens, to say that a token is or is not in a particular 
document, or has been processed in any way or at any time, or indeed 
to say ANYTHING about ANY token, other than to say it has a certain 
grammatical form. So, other than perhaps enabling RDF to describe its 
own parsing rules, it seems singularly useless.

>That is, our denotation is denoting the wrong
>thing?

I don't know how to answer that question.

>
>Pat :
>>>
>There is a basic mismatch, either way: RDF reifications describe
>syntactic forms, but the things people want to use it for require it to
>refer to tokens of forms, not the forms themselves; and RDF reification
>provides no way to refer to a statement token.
>>>
>
>Assuming this is correct (I don't fully grasp the distinction you're
>making) and there is a fundamental mismatch to the extent we are not
>able to say what we intended to say using reification. Do we have a new
>issue?

Well, it's not new for me. I've been saying this ever since I first 
heard about RDF reification, loudly and at every opportunity. (And so 
have quite a few other people. :-)

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 24 October 2001 13:11:26 UTC