Re: summary of reification semantics issues (material for discussion). from pat hayes on 2003-03-13 (w3c-rdfcore-wg@w3.org from March 2003)

From: pat hayes <phayes@ai.uwf.edu>
Date: Thu, 13 Mar 2003 16:48:23 -0600
To: Tim Berners-Lee <timbl@w3.org>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05111b00ba96a38db5f1@[10.0.100.86]>
>Thank you Pat for this summary.  I find in it
>a conclusion that de-re semantics for reification are useful,
>which is not the conclusion to which I come. As this affects running
>code, I will go through the message looking for differences.
>
>On Tuesday, Mar 11, 2003, at 18:17 US/Eastern, pat hayes wrote:
>
>>Summary of the semantic alternatives for RDF reification.
>>---------------------------
>>A reification assertion such as
>>
>>aaa rdf:subject bbb .
>>
>>is saying that something is the subject of something else which is 
>>an RDF triple. There are at least two orthogonal dimensions on 
>>which one can choose alternatives about what exactly these 
>>'somethings' are.
>>
>>aaa  might refer to a triple as an abstract grammatical form, or it 
>>might refer to a particular occurrence of a triple in some actual 
>>document somewhere. This has been referred to as the 
>>'statement/stating' difference.
>>
>>bbb might refer to a piece of syntax - the actual subject 'node', 
>>or subject token in a document - which occurs in the RDf graph, or 
>>it might refer to the thing that is considered to be the subject of 
>>the proposition expressed by that triple. This has been referred to 
>>as the 'de-dicto/de-re' difference.  This can be dramatized by 
>>asking: is the rdf:subject of the triple
>>
>>ex:Mary ex:had ex:littleLamb .
>>
>>a girl or a uriref  (respectively de dicto or de re)  ?
>
>I think you mean the other way around, no?  The de re interpretation is that
>the subject is the thing, Mary, and the de dicto that it is that 
>which was said, the
>URI?

Yes, whoops. Thanks for catching that.

>
>>These two choices are independent, but each influences the 
>>significance of the other.
>>
>>Currently, the semantics document recommends that one should 
>>interpret this vocabulary so that aaa is interpreted as a stating, 
>>ie referring to a concrete occurrence of a triple rather than an 
>>abstract form; and that bbb should be interpreted de re, ie as 
>>referring to Mary rather than to Mary's URIref.
>>
>>The reasoning behind these choices are as follows.
>>
>>Statings allows one to associate provenance information with 
>>triples in a document a document by using reification to talk about 
>>the triples. This would be impossible (meaningless) if we imposed 
>>the 'statement' interpretation.
>
>I think this is an argument which doesn't make sense if yo try it.
>You can't really distinguish between a stating and a statement until you
>give a test case, which involves more relationships.

The difference is kind of invisible in RDF, but it shows up in OWL.

>While a triple  {ex:Mary ex:had ex:littleLamb} is represented merely by
>its three parts, there is no association with a particular source.
>There is nothing to indicate that it is a stating.

True, and that's one reason why we didn't try to formalize this 
difference but only asserted it in English text. But since this issue 
had come up several times and seemed to generate a lot of heat, we 
thought it would be good to state the intention as clearly as we 
could.

>Suppose that the triple indicates just the abstract statement.
>We can reify this by giving the strings which make up its parts.
>
>ex:s1  rdf:subjectSymbol  "http://www.example.com/pat12#Mary";
>            rdf:predicateSymbol "http://www.example.com/pat12#had";
>            rdf:objectSymbol "http://www.example.com/pat12#littleLamb".
>
>(I've used "objectSymbol" as you can have "objectLiteral" as well)
>
>We can then state that for example passing a given file gives you
>among other things that:
>
><mgoose.rdf>   ex:includes ex:s1.
>
>So s1 is just an abstract statement,and we have established
>a relationship with a source document.  This latter triple, if you like,
>is an expression of the stating.

Hmmm. But where exactly do you attach the provenance information? 
Attaching it to the document might be rather coarse, eg suppose I 
wanted to say that a particular triple was modified at a different 
time than the original document was composed.

>
>This is practical.  In cwm, statements are like literals - they can 
>be compared
>and interned.

Interesting. In the reification usages, statings are denoted by 
bnodes, typically. So there seems to be a nice picture emerging here 
where the *form* is indicated by (things like literals) and the 
*actual identity* of a stating is indicated by a bnode, and then the 
reification vocabulary is a way to connect these together. And then 
ex:includes is a way to connect a stating with its containing 
document; and your point is that we might as well think of it as 
relating a stateMENT to the document, and then there is no need to 
introduce the statING at all. (But there has to be some subject for 
the reification triples, right? Whether or not we say it denotes a 
statement or a stating is only marginally relevant to RDF entailment, 
but it doesnt seem to make sense to attach provenance information to 
it if it is supposed to indicate a statement.)

This picture actually indicates an ambiguity which has come up 
elsewhere, eg in datatyping, where we considered one way to interpret 
typed literals as always referring to strings, but allowed the 
possibilty of interpreting

ex:mary ex:age "10" .

as meaning 'Mary's age is *a number which can be represented as* the 
string "10" ', which in effect introduces a kind of ghost bnode in 
between ex:age and "10" to stand for the number. (We didnt do this, 
in the end, but the idea is out there.) The statement way of 
interpreting the reification seems to me to be a bit like this, in 
that you see a direct relationship between a statement and a 
document, where I see a two-step relationship between the statement 
and a stating (expressed by the reification) and between the stating 
and its document (expressed by ex:includes), and the 'missing' bnode 
is the subject of the reification triples.

>When the same statement (or formula of >1 statemement,
>for that matter) occures in many places, it only has to be stored once.

Well, it only has to be described once in either way of doing it, 
since you could always have a ex:sameFormAs property to connect the 
bnodes indicating different tokens of the same form, and just give a 
reification description for one of them.

>It also avoids giving rise to things which one doesn't need to use.

I still don't see what is going to be the subject of the triples 
encoding the provenance information, if reifications denote 
statements.

>
>>Such uses of reification form a large class of actual use cases.
>
>Could you give me pointers?

Er...no, sorry. That was based on memory of the discussions in the 
WG. I think Dan C might be able to give you the pointers (?)

>
>>On the other hand, if someone wants to describe the abstract 
>>grammatical form of a statement, this can be done by using a 
>>particular instance of it as an exemplar for the abstraction, eg by 
>>associating a bnode with the reification by a property like 
>><ex:abstractformOf>.
>
>If there really is a non-abstract form of the statement, then 
>presumably there is
>some modelling of it which we are missing which makes it distinct 
>from the abstract form.
>
>I personally find it quite satisfactory to consider that the 
>statement is the abstract form,
>that, say, a given formula contains a given statement, or a given 
>source supports a given statement,
>but I find no use in the concept of that statement as specified in a 
>given file as opposed
>to the same statement as specified in another file.

Well, OK, I guess you can do it either way. In your way of doing it, 
an OWL engine should be able to conclude that if ex:a and ex:b have 
the same reification description, then they are the same thing. I 
guess it seemed easier to me to allow them to be distinct and then 
they could have other properties, such as a time they were written, 
an author, stuff like that.

>>The de re interpretation allows reifications to use the same 
>>conventions for  interpreting URIrefs as other triples, so that a 
>>URIref used as an actual subject node, and the same URIref used as 
>>the object of an rdf:subject assertion, have the same meaning. In 
>>contrast, the de dicto interpretation would require a reasoner to 
>>distinguish these two uses, using some extra-RDF convention such as 
>>quotation,
>
>The use of a string literal to indicate the URI of a symbol as above 
>is not extra-RDF.

Well, knowing that the literal is a piece of RDF is extra-RDF. In 
fact, if I read the XML Schema specs aright, a string *can't * be a 
URIref. Mind you, I have very little confidence that I am reading 
them aright.

>>  or else require that subjects and objects of reified triples be 
>>given distinct URIrefs which are understood in a 'meta' sense to 
>>refer to the actual URIrefs in the subject and object position of 
>>the reified triple. Neither of these formal techniques have emerged 
>>in practice, suggesting that the de re interpretation in fact 
>>codifies existing intentions of users.
>>
>
>I actually tried the de-re method in generating proofs from cwm, and 
>found that I was generating garbage.  I was generating things which 
>looked nice on paper but which were logical nonsense.
>(The proof-checker would be working through nested formulae and then 
>come across a small sheep!)

:-) Ive often wondered what a LISP interpreter would do if it found 
that (cadr X) was a small sheep.

>
>>One objection to the de re interpretation is that it does not allow 
>>for the adequate representation of propositional attitudes such as 
>>belief. This is controversial (see the discussion of the Russellian 
>>theory in http://plato.stanford.edu/entries/prop-attitude-reports/) 
>>, but in any case there is ample experience which suggests that the 
>>de dicto interpretation would produce other problems with the 
>>representation of such ideas, and that an fully adequate 
>>representation of propositional attitudes is unobtainable using 
>>reification alone.
>
>Well, its certainly unattainable using the de-re interpretation, and 
>without some quoting
>of the symbols used.
>
>But surely, the whole reason for the introduction of reification was to quote.

I will have to defer on that one. I have no idea why reification was 
introduced, and would prefer to not touch reification with a barge 
pole, myself. One of the first things that the CL group did was to 
decide to take quotation *out* of KIF.

But if that was the point of reification, why not just put a form of 
quotation into RDF directly? Its SO much easier to follow than having 
a clunky meta-description, and it avoids a host of problems (eg what 
sense can be made of a partial meta-description? Its syntactically 
impossible to have a partial quote.)

>The phrase "propositional attitudes" is a scary one suggesting one models
>degrees of human belief, and modeling of intelligence and rather more
>philosophy than one would want to get into.

The phrase is used in linguistics, not meaning to get impossibly 
ambitious. Just getting the basic logic of things like the Superman 
examples right is already enough of a problem.

>In fact, we are using quoting (ie de-dicto) to make quite mechanical 
>and simple statements
>about what a file parses to, what a rule implies, and so on.  From 
>the programming
>point of view it looks straightforward and I haven't found the logical
>inconsistencies which the de-re interpretation produces.

Well, Im having trouble already. How can you use reification to 
simultaneously mean quotation *and* to talk about what a rule 
implies? The conclusion of a rule (= Horn clause) isn't (usually) 
quoted. Are you sure that there is a single interpretation of 
reification being used for both cases? Maybe the inconsistencies come 
from there being more than one sense and them getting muddled inside 
proofs (? I don't mean to toss mud here, just that this is 
notoriously hard to keep straight, particularly when things get 
complicated.)

>To be able to talk about the formula { :Ora :wrote :MobyDick} abstractly,
>without talking about Ora.

That is OK, but the example in the M&S is where Ora said {some RDF}, 
and its supposed to be interpreted in way in which you ARE talking 
about Ora, but the thing that Ora is talking about is reified. Its 
this kind of mixing that seems to need at least some de re style 
interpretation. And  often you want to say that Ora said something 
*about Frank*, and then its surely the de re sense of 'Frank' that 
you have in mind, and if you have to fully quote what Ora said then 
there's no way to link it with Frank (as opposed to 'Frank'; you 
can't even assume that 'Frank' means Frank, because Ora may have a 
Lois Lane thing going with Frank for all we know.)

>We have found that serious use of semantic web data often involves 
>an awareness of the provenance of data.  The formulae in N3 are a 
>form of quoting. A way of
>giving a formula in RDF by reifying it (describing it in RDF) would 
>be an interesting and
>useful alternative to the use of the {}  or parseType="Quote" syntax 
>extensions.

I vastly prefer { } to reification, myself. Adding { } to RDF would 
be a huge improvement which would have solved a host of the 
'layering' problems with OWL. Reification only made them worse. If we 
had provided a genuine model theory for RDF reification there would 
probably have been a riot within Webont, because God alone knows what 
would happen if you tried to use OWL operations on the reification 
vocabulary (eg imagine a lower-bound cardinality restriction on 
rdf:Subject...) .

>The query languages involve RDF which is not asserted, and can be 
>implemented as quoted,
>and the return of bindings from a query involves mentioning the 
>variable names.
>There are many practical ways in which quoting is needed.

No, really, one doesnt need quotation *in the logic* in order to 
implement bindings and such stuff. Of course from the implementation 
point of view all of RDF looks as though it is quoted, because the 
code is treating all of the language as objects to manipulate. That 
is normal. But that's not the same as putting those quotes into the 
assertions themselves.

>I had the impression that
>that was the raison d'être of the reification in the spec, but one 
>can only guess at intent.

Indeed. But the examples in the M&S seem to me to have more to do 
with provenance and indirect attribution (Ora said that...) than 
implementation of reasoners.

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Thursday, 13 March 2003 17:48:24 UTC