SEM: Layering from Smith, Michael K on 2002-04-19 (www-webont-wg@w3.org from April 2002)

From: Smith, Michael K <michael.smith@eds.com>
Date: Fri, 19 Apr 2002 14:20:34 -0500
To: www-webont-wg@w3.org
Message-ID: <B8E84F4D9F65D411803500508BE322140D3F8AA6@USPLM207>

First, I think the approach that the WG has been following lately, that of
trying to describe the semantic elements of OWL in an abstract syntax, is
exactly right.

I argue here for two properties of the OWL/RDF layering.  The first is not
particularly startling.  The second is at least phrased differently than
other discussions. 

Consider two mapping functions between the abstract syntax layers. T maps
OWL to RDF and TINV maps RDF to OWL.

I.  Mapping from OWL to RDF and back can be done without loss of
    information, but not vice-versa.  (Presuming some form of dark
    triples in RDF.  In the worst case we can fall back on the 
    subject-predicate-object circumlocution.) 

II. The layering relation we want to strive for, tying abstract syntax
    and semantics together, is that

      if RDF-ENTAILS(T(a),d) then OWL-ENTAILS(a,TINV(d)) 

    That is, we map down from OWL to RDF via T(a), which entails d, 
    and then map d back up to OWL via TINV(d).  
    In OWL, a should entail TINV(d). 

-------------------------------------------------------------------
I. MAPPING THE SYNTAX

First, assume that both RDF and OWL have their own abstract syntax. We don't
at the moment care what they are or whether they look anything alike.

type RS = set of RDFStmt
type OS = set of OWLStmt

Let T    be a function that maps elements of OS to RS
Let TINV be a function that maps elements of RS to OS 

Despite the name, TINV is not necessarily the inverse of T.  And neither of
these functions needs to be information preserving.

Define '==' to mean equivalent in the sense that there is no loss of
entailment:

      a == b
  iff (all y : entails(a,y) iff entails(b,y))

Consider '==' and 'entails' overloaded, to cover both 'RDF-entails' and
'OWL-entails', which of course will not be the same.
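To make the definition of '==' concrete, here is a minimal Python sketch.
Everything in it is an assumption made for illustration: statements are
modelled as (x, y) pairs, 'entails' is a toy rule based on transitive
closure, and the quantifier "all y" is cut down to a finite list of
candidates.  It is not a proposal for the real RDF or OWL entailment
relations.

```python
def closure(pairs):
    """Transitive closure of (x, y) pairs; a stand-in for a real deductive closure."""
    result = set(pairs)
    while True:
        extra = {(p, s) for p, q in result for r, s in result if q == r} - result
        if not extra:
            return result
        result |= extra

def entails(a, y):
    """Toy entailment: a entails y iff every statement of y is in the closure of a."""
    return set(y) <= closure(a)

def equivalent(a, b, candidates):
    """a == b iff a and b entail exactly the same candidate consequences."""
    return all(entails(a, y) == entails(b, y) for y in candidates)

names = ["A", "B", "C"]
# The empty set plus every singleton is enough here, because this toy
# entailment is decided one statement at a time.
candidates = [set()] + [{(p, q)} for p in names for q in names]

a = {("A", "B"), ("B", "C")}
b = a | {("A", "C")}   # adds only a statement that a already entails
c = {("A", "B")}       # drops a statement that a's consequences depend on

print(equivalent(a, b, candidates))   # True
print(equivalent(a, c, candidates))   # False
```

Here a and b differ syntactically but are '==' because neither entails
anything the other does not; a and c are not, since only a entails
{("B", "C")}.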

The key issue here is whether the various mappings between OWL and RDF lose
information.  And, if so, how?


SYNTAX PROPERTY 1. Does TINV invert T?  That is, is it the case that

   TINV(T(os)) == os?

where os is of type OS.  With RDF dark triples or quoting we should be able
to do this.

In some of our earlier discussions it seemed to have been taken for granted
that TINV and T are identity functions.  They cannot be (see below).


SYNTAX PROPERTY 2. Does T invert TINV?  That is, is it the case that

   T(TINV(rs)) == rs?

We can be pretty sure this relation will not hold.  Consider for example Pat
Hayes' OWL:NIL element to terminate OWL lists.  If OWL uses OWL:NIL to
terminate a list, then RDF with elements beyond an occurrence of OWL:NIL
would have no interpretation in OWL and we would not expect even a simple
syntactic round-trip to work.

Going from the RDF

 x rdf:_1 a
 x rdf:_2 OWL:NIL
 x rdf:_3 b

to OWL and back will yield

 x rdf:_1 a
 x rdf:_2 OWL:NIL
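
The round trip above can be sketched in Python.  The representations are
assumptions made for the example: RDF is a set of (subject, predicate,
object) tuples, the OWL side is a dict from subject to a list of members,
and the functions t and tinv are hypothetical stand-ins for T and TINV
that handle only rdf:_n membership.

```python
OWL_NIL = "OWL:NIL"

def member_index(predicate):
    """Turn a membership predicate such as 'rdf:_3' into the integer 3."""
    return int(predicate.split("_", 1)[1])

def tinv(rdf_triples):
    """Toy TINV: one OWL list per subject, read in order and cut at OWL:NIL."""
    by_subject = {}
    for s, p, o in rdf_triples:
        by_subject.setdefault(s, []).append((member_index(p), o))
    owl_lists = {}
    for s, members in by_subject.items():
        items = []
        for _, o in sorted(members):
            if o == OWL_NIL:
                break              # anything after the terminator is dropped
            items.append(o)
        owl_lists[s] = items
    return owl_lists

def t(owl_lists):
    """Toy T: emit rdf:_n triples for each list, ending with OWL:NIL."""
    triples = set()
    for s, items in owl_lists.items():
        for i, o in enumerate(items + [OWL_NIL], start=1):
            triples.add((s, f"rdf:_{i}", o))
    return triples

rs = {("x", "rdf:_1", "a"), ("x", "rdf:_2", OWL_NIL), ("x", "rdf:_3", "b")}
print(sorted(t(tinv(rs))))
# [('x', 'rdf:_1', 'a'), ('x', 'rdf:_2', 'OWL:NIL')]

os_ = {"x": ["a", "b"]}
print(tinv(t(os_)) == os_)   # True
```

In this toy model T(TINV(rs)) loses the triple for b, while TINV(T(os))
returns the original OWL list, which is the asymmetry between Syntax
Properties 1 and 2.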


SUMMARY:  The best we can hope to achieve is Syntax Property 1 above.  


----------------------------------------------------------------
II. LAYERING THE SEMANTICS

Note that according to the discussion above there are RDF statements
(triples) that will have no corresponding interpretation in OWL. Based on
this, I don't see any way around the "superset of a subset" semantic
relation.  By itself this is pretty vacuous.  But a clear description of
what is covered and not covered by the translation process should mitigate
this.

Let the predicate RDF-ENTAILS(x:RS,y:RS) assert that x entails y according
to the RDF semantics.  

Let the predicate OWL-ENTAILS(x:OS,y:OS) assert that x entails y according
to the OWL semantics.

RDF-ENTAILS(x,null) is true.
OWL-ENTAILS(x,null) is true.

Most of the questions we are asking can be visualized in terms of the
following diagram.  (At least it helps me.)

      a:OS  == OWL-ENTAILS ==>  b:OS
      | ^                       | ^
      | |                       | |
    T | | TINV                T | | TINV
      | |                       | |
      v |                       v |
      c:RS  == RDF-ENTAILS ==>  d:RS

We are interested in those cases where if something is entailed in RDF, then
some related thing is entailed in OWL.

In general we have not recognized a need to ensure that if something is
entailed in OWL, then some related thing is entailed in RDF.  We know that
this is not generally possible.

Given this, there are four cases that we can consider.


SEMANTIC PROPERTY 1. 

  If RDF-ENTAILS(c,d) then OWL-ENTAILS(TINV(c),TINV(d))

This would seem to say that any entailment in RDF is preserved in OWL. But
this is where the syntactic transformation, TINV, has a critical effect.
One obvious difficulty here is collections.  If Li is later in collection C
than an OWL:NIL element, then the RDF fact that Li is a member of C will be
lost in OWL.

More importantly, it is fairly easy to imagine a TINV that loses information
critical to the inference. E.g.

     RDF-ENTAILS({a,b,c,d},{e})
     TINV({a,b,c,d}) = {a',d'} and TINV({e}) = {e'}
     not OWL-ENTAILS({a',d'},{e'})

because b and c were the critical links in the entailment of e.
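
A small Python sketch of this counterexample follows.  All of it is
assumed for illustration: statements are (x, y) links, both entailment
relations are the same toy transitive-closure rule, and tinv is a
hypothetical translation that simply drops the statements in LOST.

```python
def closure(pairs):
    """Transitive closure of (x, y) pairs; a stand-in for a real deductive closure."""
    result = set(pairs)
    while True:
        extra = {(p, s) for p, q in result for r, s in result if q == r} - result
        if not extra:
            return result
        result |= extra

def rdf_entails(x, y):
    return set(y) <= closure(x)

def owl_entails(x, y):
    # Assumption: the OWL side uses the same toy rule, so any failure
    # below is caused by the translation and not by weaker semantics.
    return set(y) <= closure(x)

# A chain of four links; e follows only if all four are present.
a, b, c, d = ("n0", "n1"), ("n1", "n2"), ("n2", "n3"), ("n3", "n4")
e = ("n0", "n4")

LOST = {b, c}   # statements this toy TINV cannot interpret

def tinv(rdf):
    """Toy TINV: keep only the statements that have an OWL reading."""
    return {stmt for stmt in rdf if stmt not in LOST}

premises = {a, b, c, d}
print(rdf_entails(premises, {e}))               # True
print(owl_entails(tinv(premises), tinv({e})))   # False
```

The entailment holds in RDF but not after translation, because the links b
and c never reach the OWL side.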


SEMANTIC PROPERTY 2. 

  If RDF-ENTAILS(T(a),d) then OWL-ENTAILS(a,TINV(d))

Any RDF entailment drawn from translated OWL is preserved in OWL.  This is a property
I think we would like to ensure if we can.  The reason we might imagine this
working is that we are restricting the precondition to things that translate
from OWL to RDF, e.g. T(a).  So we don't need to account for every possible
RDF inference, just those based on something that could be derived from OWL.

But here again we have a problem that mixes syntax and semantics.  If
TINV(d) = null, this is not very interesting.  What we want is for SYNTAX
PROPERTY 1 to hold, that is, TINV(T(a)) == a.

       a   == OWL-ENTAILS ==> TINV(d)
       |                        ^
       |                        |
     T |                        | TINV
       |                        |
       v                        |
      T(a) == RDF-ENTAILS ==>   d

This model is very much like the standard relationship between two
hierarchically related programming languages.  Consider Pascal vs. machine
code. There is a lot that can be said about the state of memory at the
machine code level that has no counterpart at the Pascal level. What is
important is that we can map down to the hardware and then back up to the
data structures of the higher level programming language. 
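
Semantic Property 2 can be checked by brute force on a toy model, as in
the Python sketch below.  The model is an assumption, not a proposal: OWL
statements are (x, y) subclass pairs, T writes them as rdfs:subClassOf
triples, RDF entailment is approximated by the subset test on ground
triples, and OWL entailment adds transitivity.  The function tinv_null is
a deliberately degenerate TINV that always returns null.

```python
from itertools import combinations

SUB = "rdfs:subClassOf"

def closure(pairs):
    """Transitive closure of (x, y) pairs."""
    result = set(pairs)
    while True:
        extra = {(p, s) for p, q in result for r, s in result if q == r} - result
        if not extra:
            return result
        result |= extra

def t(owl):
    return {(x, SUB, y) for x, y in owl}

def tinv(rdf):
    return {(s, o) for s, p, o in rdf if p == SUB}

def tinv_null(rdf):
    return set()

def rdf_entails(x, y):
    return y <= x              # toy rule: ground triples, subset test only

def owl_entails(x, y):
    return y <= closure(x)     # toy rule: subclass is transitive

def powerset(items):
    items = list(items)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def property_2_holds(tinv_fn, owl_universe, rdf_universe):
    """If RDF-ENTAILS(T(a), d) then OWL-ENTAILS(a, TINV(d)), for all sampled a, d."""
    return all(owl_entails(a, tinv_fn(d))
               for a in powerset(owl_universe)
               for d in powerset(rdf_universe)
               if rdf_entails(t(a), d))

def round_trips(tinv_fn, owl_universe):
    """Syntax Property 1 on the same samples: TINV(T(a)) gives back a."""
    return all(tinv_fn(t(a)) == a for a in powerset(owl_universe))

owl_universe = {("A", "B"), ("B", "C"), ("A", "C")}
rdf_universe = t(owl_universe) | {("A", "rdf:type", "B")}

print(property_2_holds(tinv, owl_universe, rdf_universe))        # True
print(property_2_holds(tinv_null, owl_universe, rdf_universe))   # True
print(round_trips(tinv, owl_universe))                           # True
print(round_trips(tinv_null, owl_universe))                      # False
```

The degenerate tinv_null satisfies Semantic Property 2 just as well as
tinv does, which is why the property is only interesting together with
TINV(T(a)) == a.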


SEMANTIC PROPERTY 3. 

  If RDF-ENTAILS(c,T(b)) then OWL-ENTAILS(TINV(c),b)

SEMANTIC PROPERTY 4.

  If RDF-ENTAILS(T(a),T(b)) then OWL-ENTAILS(a,b)

Neither 3 nor 4 is going to be possible, given that OWL will entail things
that cannot be entailed in RDF, but that can still be translated to RDF.
(Assuming some quoting/dark mechanism.)

Put another way, 

 forall x : 
  if   T(x) is dark 
  then RDF-ENTAILS(c,T(b)) => RDF-ENTAILS(c,T(b union x))

Many of these dark terms will have an interpretation in OWL, yet will
not be entailed by TINV(c) (case 3) or by a (case 4).
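
The failure of case 4 can be sketched in Python under some loud
assumptions: a triple counts as dark when its predicate starts with
"owl:", RDF entailment ignores dark triples on both sides, T is the
identity on triple-shaped OWL statements, and the toy OWL entailment does
no inference beyond the subset test.  None of this reflects an agreed
dark-triple mechanism.

```python
def is_dark(triple):
    """Assumed convention: triples with an 'owl:' predicate are dark to RDF."""
    return triple[1].startswith("owl:")

def rdf_entails(x, y):
    """Toy RDF entailment that neither asserts nor requires dark triples."""
    asserted = {tr for tr in x if not is_dark(tr)}
    return all(tr in asserted for tr in y if not is_dark(tr))

def t(owl):
    """Toy T: OWL statements are already triple-shaped, so copy them."""
    return set(owl)

def owl_entails(a, b):
    """Toy OWL entailment: subset test only, but dark terms do count."""
    return b <= a

a = {("A", "rdfs:subClassOf", "B")}
b = set(a)
x = {("A", "owl:disjointWith", "C")}   # T(x) is dark

print(rdf_entails(t(a), t(b)))       # True
print(rdf_entails(t(a), t(b | x)))   # True
print(owl_entails(a, b | x))         # False
```

RDF entailment of T(b) carries over to T(b union x) for free, but OWL
does interpret x and a says nothing about it, so the conclusion of case 4
fails for the pair (a, b union x).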


- Mike

Michael K. Smith
EDS Austin Innovation Centre
98 San Jacinto, #500
Austin, TX 78701
512 404-6683
Received on Friday, 19 April 2002 15:20:41 UTC