RE: A modest proposal for reforming RDF

I misread the distribution list when replying to Pat.
Here is a copy:

Pat,

I notice that you didn't copy your reply to the rdf-logic
list.  If this was an omission I'm happy for you to forward
this reply to this list in retrospect.

> -----Original Message-----
> From: pat hayes [mailto:phayes@ai.uwf.edu]
> Sent: 22 December 2000 05:54
> To: McBride, Brian
> Subject: RE: A modest proposal for reforming RDF
> 
> 
> >Drew,
> >
> >Your document asserts:
> >
> >  The disadvantage is that we can't assert a complex
> >  expression without asserting its parts.
> >
> >I'm not convinced this is true.  This can be done in
> >RDF, but requires reification, whose syntax is, shall
> >we say, verbose.
> 
> And whose semantics is, shall we say, opaque; and whose pragmatics 
> is, shall we say, utterly confused.

One of the things I am arguing for in the W3C Semantic Web proposed
activity is that the current RDF specs must be cleaned up.  Far
too much energy is being wasted trying to figure out what they
mean.

> Reification (as interpreted by 
> Ora Lassila) allows one to describe an expression. It doesnt allow 
> one to do anything with the *content* of that expression, however. So 
> for example, to say that A implies B, reification by itself is no use 
> whatever. The nearest you could come to it would be to assert that 
> "A" imples"B", which is almost always logically false.
> 
> >It is, as you have done, important to separate the RDF
> >XML syntax and the underlying data model.  It would be
> >helpful to me to understand whether your proposal to
> >reform RDF is motivated by dislike of the syntax, or
> >by the data model being insufficiently expressive.
> 
> There is a third possibility, which is the core problem with RDF. The 
> syntax is ugly, but that is not really important;

The unimportance of the syntax was what I was suggesting.

> the data model is 
> limited but sensible.  But the relationship between the syntax and 
> the model is not clearly described,

In which case, the problem could be resolved by defining a
new syntax and properly defining its relationship to the data
model (better a tidied up data model).

Actually, I think your problem is with the data model itself,
at least as I understand the term data model.  It's the fact
that reification is the only mechanism that might allow
the construction of compound logical expressions that
causes the problem, and that is part of the data model.

> and seems to be based on a 
> fundamental confusion between use and mention.
> 
> >If the problem is the syntax, then might a new
> >syntax for the underlying data model solve the problem.
> >
> >If the problem is the expressiveness of the data model,
> >I'd really appreciate an example to help me understand
> >its limitations.
> 
> There is a basic distinction (common throughout logic and almost all 
> analyses of language) between  expressions (which refer to something) 
> and their meanings (what they refer to).  Languages conventionally 
> use expressions to refer to things (in some broad sense of 'thing' 
> that includes truthvalues, functions, etc.) . This applies just as 
> well to subexpressions of more complex expressions, so that if for 
> example one says  (A implies B), one is saying something about what A 
> and B  refer to, not about the expressions A and B themselves.  Now, 
> reification is a way of using the language to refer to expressions, 
> which is fine. But notice it is the *expressions* which are referred 
> to, not the meanings of the expressions. So one does not capture the 
> meaning of (A implies B) (or any other complex expression) by 
> reifying A and reifying B and then saying something about the 
> reifications.
> 
> Now, the RDF literature and specs are so sloppily worded that cannot 
> tell quite what the intended meaning of reification actually is.

It does take careful reading, but M&S does say:

  o a statement is a triple (p, s, o) i.e. an abstract thing

  o a reified statement is a resource which 'represents' a statement

  o the reification of a statement is a collection of four
    statements ...

So a reified statement represents a statement, not its quoting.
We are going to need something more formal here, to be clear.
I'm not a logician, so please forgive the naivety.

Let's define a semantics for RDF in terms of a translation to FOL.

Begin with an RDF graph and define its meaning in terms of an FOL
expression.  Let the graph be represented by a set of statements
G = {S1, S2, ..., Sn} [finite] where Si = (Pi, SUi, OBi).

Define 3 functions:

  Translation(G)     - a translation of G to a FOL expression
  Interpretation(G)  - an interpretation of G 
  Interpretation'(E) - an interpretation of an FOL expression

Define:

  Interpretation(G) = Interpretation'(Translation(G))

Then define Translation(G) to be:

  P'1(SU'1, OB'1) AND P'2(SU'2, OB'2) AND ... AND P'n(SU'n, OB'n)

and   Interpretation(Pi)  = Interpretation'(P'i)
      Interpretation(SUi) = Interpretation'(SU'i)
      Interpretation(OBi) = Interpretation'(OB'i)

This is just about enough machinery.  A statement gets translated
to a predicate.  A reified statement represents a statement (not
its quoting) so a reified statement also represents a predicate.
QED?
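
A minimal Python sketch of the Translation(G) definition above, under
some assumptions of mine: triples are written (predicate, subject,
object) to match Si = (Pi, SUi, OBi), FOL expressions are plain
strings, and a trailing prime marks the FOL counterpart of an RDF
name.  The function names and the sample graph are hypothetical.

```python
# Sketch only: triples are (predicate, subject, object), matching the
# ordering Si = (Pi, SUi, OBi) used in the text.

def prime(name):
    """Map an RDF name to its assumed FOL counterpart, e.g. A -> A'."""
    return name + "'"

def translate_statement(statement):
    """Translate one triple (P, SU, OB) to the string P'(SU', OB')."""
    p, su, ob = statement
    return f"{prime(p)}({prime(su)}, {prime(ob)})"

def translate_graph(graph):
    """Translation(G): the conjunction of the translated statements."""
    return " AND ".join(translate_statement(s) for s in graph)

# Hypothetical two-statement graph, for illustration only.
G = [("creator", "page", "Ora"), ("title", "page", "RDF")]
print(translate_graph(G))
```

This covers only the plain conjunction; it does nothing special for
reified statements.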

Filling in the detail a bit, take IMPLIES as a predicate:

define 

   Translation((IMPLIES, RSi, RSj)) = Translation(RSi) => Translation(RSj)

If G contains exactly one statement each of the form:

   (RDF:type, RSi, RDF:Statement)
   (RDF:subject, RSi, RSUi)
   (RDF:predicate, RSi, RPi)
   (RDF:object, RSi, ROBi)

then

  Translation(RSi) = RP'i(RSU'i, ROB'i)

otherwise

  Translation(RSi) = RSi' where RSi' is a variable.

And it's done.  There is no special casing here for the
translation of the RSi when it is the subject or object
of a statement.  RSi is *always* translated this way.

I claim these semantics define what M&S describes about
reification.
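
The rule above could be sketched in Python roughly as follows.  This
is an illustration under assumptions, not a definitive reading: FOL
expressions are plain strings, the subject and object of a reified
statement are translated recursively (which assumes the reifications
are not cyclic), and the graph with John and Mary is made up for the
purpose.

```python
# Sketch only. Triples are (predicate, subject, object), as in the text.

def reified_parts(graph, node):
    """Return the (predicate, subject, object) that `node` reifies, else None.

    Mirrors the "exactly one statement each" condition: one RDF:type
    RDF:Statement triple plus one each of RDF:predicate, RDF:subject
    and RDF:object, all with `node` as their subject.
    """
    def values(prop):
        return [ob for (p, su, ob) in graph if p == prop and su == node]

    if values("RDF:type").count("RDF:Statement") != 1:
        return None
    parts = [values(prop) for prop in ("RDF:predicate", "RDF:subject", "RDF:object")]
    if any(len(v) != 1 for v in parts):
        return None
    return tuple(v[0] for v in parts)

def translate_term(graph, node):
    """Translation(RSi): an atom if fully reified, otherwise a variable RSi'."""
    parts = reified_parts(graph, node)
    if parts is None:
        return node + "'"
    p, su, ob = parts
    return f"{p}'({translate_term(graph, su)}, {translate_term(graph, ob)})"

def translate_implies(graph, statement):
    """Translation((IMPLIES, RSi, RSj)) = Translation(RSi) => Translation(RSj)."""
    _, rs_i, rs_j = statement
    return f"{translate_term(graph, rs_i)} => {translate_term(graph, rs_j)}"

# Hypothetical graph: RS1 is fully reified, RS2 lacks an RDF:object triple.
G = [
    ("RDF:type", "RS1", "RDF:Statement"),
    ("RDF:subject", "RS1", "John"),
    ("RDF:predicate", "RS1", "likes"),
    ("RDF:object", "RS1", "Mary"),
    ("RDF:type", "RS2", "RDF:Statement"),
    ("RDF:subject", "RS2", "Mary"),
    ("RDF:predicate", "RS2", "likes"),
    ("IMPLIES", "RS1", "RS2"),
]
print(translate_term(G, "RS1"))
print(translate_term(G, "RS2"))
print(translate_implies(G, G[-1]))
```

RS2 falls into the "otherwise" case because its quad is incomplete, so
it comes out as the variable RS2'.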

For an example, consider IMPLIES(AND(OR(A,B),C),OR(A,B))

This could be represented by the graph G = 

{
  (RDF:type,      RSOR,  RDF:Statement)
  (RDF:subject,   RSOR,  A)
  (RDF:predicate, RSOR,  OR)
  (RDF:object,    RSOR,  B)
  (RDF:type,      RSAND, RDF:Statement)
  (RDF:subject,   RSAND, RSOR)
  (RDF:predicate, RSAND, AND)
  (RDF:object,    RSAND, C)
  (IMPLIES,       RSAND, RSOR)
}

[Actually, I can't make my mind up between two different
 representation/semantics - so I'm trying this one for now]


Translation(RSOR) = OR'(Translation(A), Translation(B))
                  = OR'(A', B')

Translation(RSAND) = AND'(Translation(RSOR), Translation(C))
                   = AND'(OR'(A',B'), C')

Translation((IMPLIES, RSAND, RSOR)) 
  = Translation(RSAND) => Translation(RSOR)
  = AND'(OR'(A',B'), C') => OR'(A', B')

Seems to work, and no multiple reification as Drew 
was concerned about.  So what am I missing?

The translation of the full graph does contain a load of junk,
but some mechanism to filter it down must be possible.
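
To make the junk concrete, here is a rough Python sketch that
translates every statement of the example graph and then filters the
result.  The filter, which simply drops statements whose predicate
belongs to the reification vocabulary, is only one possible mechanism
and is my assumption, as are the string representation and the
function names.

```python
# Sketch only. Triples are (predicate, subject, object), as in the text.
REIFICATION_VOCABULARY = {"RDF:type", "RDF:subject", "RDF:predicate", "RDF:object"}

G = [
    ("RDF:type", "RSOR", "RDF:Statement"),
    ("RDF:subject", "RSOR", "A"),
    ("RDF:predicate", "RSOR", "OR"),
    ("RDF:object", "RSOR", "B"),
    ("RDF:type", "RSAND", "RDF:Statement"),
    ("RDF:subject", "RSAND", "RSOR"),
    ("RDF:predicate", "RSAND", "AND"),
    ("RDF:object", "RSAND", "C"),
    ("IMPLIES", "RSAND", "RSOR"),
]

def lookup(node, prop):
    """All values of `prop` recorded for `node` in G."""
    return [ob for (p, su, ob) in G if p == prop and su == node]

def translate_term(node):
    """Atom for a completely reified node, otherwise a primed variable."""
    quad = [lookup(node, prop) for prop in ("RDF:predicate", "RDF:subject", "RDF:object")]
    complete = (lookup(node, "RDF:type").count("RDF:Statement") == 1
                and all(len(v) == 1 for v in quad))
    if not complete:
        return node + "'"
    (p,), (su,), (ob,) = quad
    return f"{p}'({translate_term(su)}, {translate_term(ob)})"

def translate_statement(statement):
    """IMPLIES becomes =>; every other triple becomes P'(SU', OB')."""
    p, su, ob = statement
    if p == "IMPLIES":
        return f"{translate_term(su)} => {translate_term(ob)}"
    return f"{p}'({translate_term(su)}, {translate_term(ob)})"

# Full translation: nine conjuncts, eight of them reification bookkeeping.
full = [translate_statement(s) for s in G]
# Assumed filter: keep only statements outside the reification vocabulary.
filtered = [translate_statement(s) for s in G if s[0] not in REIFICATION_VOCABULARY]

print(" AND ".join(full))
print(" AND ".join(filtered))
```

With this filter only the IMPLIES conjunct survives.  Whether dropping
the quad statements is legitimate is exactly the kind of question a
formal semantics would have to settle.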

> I am 
> basing my understanding on what Ora Lassila tells me.

When I say that the RDF specs define a data model, but do
not formally define a semantics, (they kinda hint at one)
what I mean is "PLEASE, PLEASE, will the logicians define
something sensible, which is consistent with the current
specs".

> However, so 
> many people seem to think that one CAN use reification to capture the 
> meaning of complex expressions, that I am left wondering quite what 
> the intended semantics of RDF actually should be taken to be.

I suggest the semantics are those I have tried to express above.

> It may 
> be that there is a rather subtle use/mention distinction involved 
> here, where an expression is described by a reification-quad, but 
> when referred to as a component of a larger assertion it is then 
> (implicitly) de-reified. If so, that would indeed give RDF the 
> expressive power that its users think it has,

The statement/quoting distinction is subtle.  I don't think
it can be discussed usefully in natural language - it just
gets confused.  Formality is needed.  I'm not sure I have
seen or understood all the exchanges between you and Ora
on this.  Personally, I'd like to see an open discussion in
formal language (that neophytes have a hope of following)
on this issue.

> but at a much larger 
> cost than merely an awkward syntax.

I'm not sure what you mean by cost.  All those extra triples look
expensive to implement, don't they?  But the implementors are
coming up with ways of representing reification efficiently.
Or is that not the sort of cost you had in mind?
> 
> Pat Hayes
> 
> ---------------------------------------------------------------------
> IHMC					(850)434 8903   home
> 40 South Alcaniz St.			(850)202 4416   office
> Pensacola,  FL 32501			(850)202 4440   fax
> phayes@ai.uwf.edu 
> http://www.coginst.uwf.edu/~phayes
> 

Brian McBride
HPLabs

Received on Friday, 29 December 2000 11:01:45 UTC