Defining n3's formula literals from Sandro Hawke on 2002-09-06 (www-rdf-rules@w3.org from September 2002)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 06 Sep 2002 07:33:28 -0400
To: www-rdf-rules@w3.org, timbl@w3.org, connolly@w3.org
Message-Id: <200209061133.g86BXSV22505@wadimousa.hawke.org>
I've been trying to understand the meaning of n3's formula literals.
These use to be called "contexts"; they are bits if n3 wrapped in
curly braces.  They appear syntactically as object identifers, like
URIRefs, string literals, and lists.

I believe that curly braces function as an operator which reifies a
de-labeled version of the formula inside the braces, returning an
identifier for that (delabeled and reified) formula.  Both
delabeling (defined below) and reification are unusual and tedious
procedures; this e-mail gets a little unusual and tedious, too, I'm
afraid.

But the message is this: a curly-brace expression in n3 denotes a
logical formula.  By denoting a formula (which is a syntactic
construct), we're puting the syntax of the language into the domain of
discourse.  This make the semantics of n3 quite a bit more
complicated, but it seems necessary for many Semantic Web
applications.

** The Simple Case

The most common use for formula literals is as operands for
log:implies.  Log:implies, in this view, operates on reified formula
arguments, so the reification process and vocabulary is hidden.  I
will borrow terms from my LX reification vocabulary [1] for the
examples below; one doesn't normally need such a vocabulary and none
has previously been defined for n3, because of the way log:implies
matches inside the reification.

The common case of using log:implies can be handled without thinking
about reification at all:
 
   { :a :b :c } log:implies { :d :e :f }

can be translated to first-order logic (using Otter's syntax) as
 
   rdf(a,b,c) -> rdf(d,e,f).

or to a rules language (using prolog syntax) as

   rdf(d,e,f) :- rdf(a,b,c).

(Various other issues raised in these translations, mostly about
URIRefs, string literals, and whether log:implies is a rule
constructor or the material conditional are left to other messages.
Here, we're focusing on formula literals.)

So using formula literals directly with log:implies is not big deal,
but there are other uses. 

For instance, people often talk about Semantic Web operations
involving reasoning about trust and beliefs.  There is an interest in
rules like "If I trust John, then everything John says is true."  N3
formula literals are expected to be used like this (and several uses
of cwm do this kind of meta-reasoning).  If we're expressing knowledge
about the ways we're expressing knowledge, we need to use reification.

** EXAMPLE 1: SINGLE TRIPLE

Let's say: There exists something we'll call _:x and some naming
property we'll call _:name.  John says that _:x has the name "Spot".

   # Example 1a (N3 with formula literals)
   :John :says { _:x  _:name "Spot" }

is equivalent to

   # Example 1b (N3 without formula literals)
   :John :says _:s.
   _:s rdf:type lx:Triple.
   _:s lx:subject _:x.
   _:s lx:predicate _:name.
   _:s lx:object "Spot".

[[ First-Order Logic Digression 
-
-  If we imagine extending otter's FOL syntax to include this kind of
-  formula literal (and string literals for this example), we could say
-    
-     # Example 1c (Otter plus formula literals and string literals)
-     exists x name ( 
-       trueTriple(John, says, { trueTriple(x, name, "Spot") } ) 
-     )
-     
-  which has an obvious translation, using FOL's function terms
-  
-     # Example 1d (Otter plus string literals)
-     exists x name ( 
-       trueTriple(John, says, triple(x, name, "Spot") )
-     )
]]

** EXAMPLE 2: TWO TRIPLES

Let's say:  John says that _:x has the name
"Spot" and x is a dog. 

   # Example 2a (N3 with formula literals)
   :John :says { _:x  _:name "Spot".  _:x a _:Dog}

is equivalent to

   # Example 2b (N3 without formula literals)
   :John :says _:s.
   _:s rdf:type lx:Conjunction
   _:s lx:conjLeft :s1.
   _:s lx:conjRight :s2.
   _:s1 rdf:type lx:Triple.
   _:s1 lx:subject _:x.
   _:s1 lx:predicate _:name.
   _:s1 lx:object "Spot".
   _:s2 rdf:type lx:Triple.
   _:s2 lx:subject _:x.
   _:s2 lx:predicate rdf:type.
   _:s2 lx:object _:Dog.

[[ First-Order Logic Digression 
-    
-     # Example 2c (Otter plus formula literals and string literals)
-     exists x ( 
-       trueTriple(John, says, { 
-         triple(x, name, "Spot") &
-         triple(x, rdf_type, Dog)
-       }
-     )
-     
-  which has an obvious translation, using FOL's function terms
-  
-     # Example 2d (Otter plus string literals)
-     exists x ( 
-       trueTriple(John, says, 
-         and(triple(x, name, "Spot"), triple(x, rdf_type, Dog))
-       )
-     )
-  
-  "trueTriple(a,b,c)" is an atomic sentence which is true if and only if
-  the RDF triple relationship is held between a, b, and c.
-  "triple(a,b,c)" is a function mapping a, b, and c to something
-  representing the ordered combination (tuple) of a, b, and c.  We can
-  imagine a predicate "true", defined so that true(triple(a,b,c)) would
-  be true if and only if trueTriple(a,b,c), but we'll have to do that in
-  a rather complex way to avoid problems with self-reference.
]]

** EXAMPLE 3: Superman

Let's say "Lois believes 'Superman can fly'".  

This is a classic example illustrating the need for opacity [2].  The
issue here is whether knowing that Superman is Clark Kent will lead us
to the (unfounded) conclusion that "Lois believes 'Clark Kent can
fly'".  

The n3 looks like this:

   dc:lois :believes { dc:Superman :can dc:fly }.
   dc:Superman :equal dc:ClarkKent.

and we define :equal like this:

   this log:forAll :S, :P, :O, :X, :Y.
   { :S :P :O.  :S :equal :X } log:implies { :X :P :O }.
   { :S :P :O.  :P :equal :X } log:implies { :S :X :O }.
   { :S :P :O.  :O :equal :X } log:implies { :S :P :X }.
   { :X :equal :Y } log:implies { :Y :equal :X }.

cwm (TimBL's n3 processor) correctly does not infer

   dc:lois :believes { dc:ClarkKent :can dc:fly }

because the definition of :equals does not reach into formula
literals.

To shield the reification of Lois's belief from equality substitution,
we need to move the symbols into string literals.  We can do this by
"de-labeling" the RDF graph inside the braces.

De-labling is the process of turning RDF nodes with URIRef labels into
unlabelled nodes (bNodes), while keeping the label information as a
new arc to a string literal containing the old label text.  (There is
no consensus on what to label the arc.  I have argued that it ought to
be rdf:about, but for now let's use lx:uriref.)

De-labeling the graph

   _:x <http://example.com/#name> "Spot".

we get
 
   _:x _:y "Spot".
   _:y lx:uriref "http://example.com/#name".

De-labeling an RDF graph gives us a graph with up to four times as
many triples using only bNodes, string literals, and the lx:uriref
URIRef.   Re-labeling is of course possible.  

If we delabel the contents of a formula literal before we reify it, we
provide the appropriate opacity.

   @prefix dc: <http://dccomics.example.com/terms#> .
   @prefix : <http://example.com/#>
   dc:lois :believes { dc:Superman :can dc:fly }.

is delabeled to

   dc:lois :believes { 
     _:superman _:can _:fly.
     _:superman lx:uriref "http://dccomics.example.com/terms#Superman".
     _:can lx:uriref "http://example.com/#can".
     _:fly lx:uriref "http://dccomics.example.com/terms#fly" 
   }.

Notice how this differs from 

   dc:lois :believes {_:superman _:can _:fly. }.
   _:superman lx:uriref "http://dccomics.example.com/terms#Superman".
   _:can lx:uriref "http://example.com/#can".
   _:fly lx:uriref "http://dccomics.example.com/terms#fly" 

where we're asserting the URIRef for the thing Luis thinks can fly, as
opposed to letting Lois do it.  It's the difference between "Lois
thinks that something she calls 'Superman' can fly" and "Lois thinks
that thing called 'Superman' can fly."

The reification in triples of Lois's belief is daunting.  It starts:

   dc:load :believes _:s.
   _:s rdf:type lx:Conjunction.
   _:s lx:conjLeft :s1.
   _:s lx:conjRight :s2.
   _:s1 rdf:type lx:Triple.
   _:s1 lx:subject _:superman.
   _:s1 lx:predicate _:can.
   _:s1 lx:object _:fly.
   _:s2 rdf:type lx:Conjunction.
   _:s2 lx:conjLeft :s3.
   _:s2 lx:conjLeft :s4.
   _:s3 rdf:type lx:Triple.
   _:s3 lx:subject _:superman.
   _:s3 lx:predicate lx:uriref.
   _:s3 lx:object "http://dccomics.example.com/terms#Superman".
   ...

This form relies on the knowledge that the bNodes like _:superman do
no occur in the overall RDF graph.  If we could not safely choose such
bNodes (or Skolem constants), we could instead reify an existential
quanitification (which would be about 8 triples larger).
   
[[ First-Order Logic Digression 
-  
-  The FOL/otter form of this is slightly more readable:
-  
-      trueTriple(lois, believes, { trueTriple(Superman, can, fly) } )
-  
-  turns into 
-  
-      exists s c f (
-        trueTriple(lois, believes, 
-           and(triple(s, c, f),
-  	     and(triple(s, lx_uriref, "Superman").
-  	         and ... 
-                  )
-              )
-         )
-      )
-  
-  ....
]]

To be continued?   This is long enough for now.   It's been a very
complex message because I explained and demonstrated both de-labeling
and reification.  Formula literals themselves are pretty trivial if
you're comfortable with those underlying operations.  

I should say that cwm doesn't do this kind of reification; the code
just keeps the formula literals off to the side, tagged as being in a
separate space from the formula which uses them.  This stuff isn't
that hard to implement.  With LISP-style structures, it's even
easier, the only real complication being in delabeling.

I thought I would go into dealing with the self-reference problems
here, but this has gone on too long.  As I've discussed a bit lately
[3], I think we can chart a course around the problems by having
reification not capable of describing self-referencing sentences.
N3's "this" keyword should be used only for variable scoping, as pure
syntax, and not to create self-referencing formulas.  Document-level
self-referencing, like "[ is log:semantics of <> ]" should be seen as
outside the core logic (which is my way of saying I'd rather not think
about it yet).

     -- sandro

[1] http://www.w3.org/2002/08/LX/RDF/layering
[2] http://lists.w3.org/Archives/Public/www-rdf-logic/2001Apr/0074.html
[3] http://lists.w3.org/Archives/Public/www-rdf-interest/2002Aug/0185.html
Received on Friday, 6 September 2002 07:35:55 UTC