RDF datatypes and "magic" properties in CWM

Hi,

<background>
I've been thinking about how to implement an (extensible) datatype 
inferencing framework in my Swish software [1].  This belongs to a project 
I started partly because, for example, I couldn't figure out how to do 
things like IP address testing in CWM [2] (e.g. is address a.b.c.d part of 
the IP subnet m.n.o.p/24?).
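
To make the subnet question concrete, here is a minimal sketch of the test 
as a plain Haskell function.  It is only an illustration: the names 
parseIPv4, octet, splitOn and inSubnet are hypothetical, not part of Swish 
or CWM, and only IPv4 dotted-quad notation is assumed.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Char (isDigit)
import Data.Word (Word32)

-- Split a string on a separator character (hypothetical helper).
splitOn :: Char -> String -> [String]
splitOn c str = case break (== c) str of
  (h, [])       -> [h]
  (h, _ : rest) -> h : splitOn c rest

-- Parse one decimal octet in the range 0..255.
octet :: String -> Maybe Word32
octet t
  | not (null t) && length t <= 3 && all isDigit t && n <= 255 = Just (fromIntegral n)
  | otherwise = Nothing
  where
    n = read t :: Int   -- only forced once t is known to be all digits

-- Parse dotted-quad text such as "192.168.1.7" into a 32-bit value.
parseIPv4 :: String -> Maybe Word32
parseIPv4 s = case splitOn '.' s of
  [a, b, c, d] -> do
    os <- mapM octet [a, b, c, d]
    return (foldl (\acc o -> (acc `shiftL` 8) .|. o) 0 os)
  _ -> Nothing

-- Is the address inside the subnet net/prefixLen?
-- Nothing means one of the inputs was malformed.
inSubnet :: String -> String -> Int -> Maybe Bool
inSubnet addr net prefixLen
  | prefixLen < 0 || prefixLen > 32 = Nothing
  | otherwise = do
      a <- parseIPv4 addr
      n <- parseIPv4 net
      let hostBits = 32 - prefixLen
      -- Compare only the network bits; a /0 prefix matches everything.
      return (prefixLen == 0 || (a `shiftR` hostBits) == (n `shiftR` hostBits))
```

For example, inSubnet "192.168.1.7" "192.168.1.0" 24 asks whether the 
address shares its top 24 bits with the subnet.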

Unlike CWM, I've separated the scheme for defining inference rules from the 
language over which inference is performed -- this led me to using Haskell 
as my implementation language, as it is sufficiently close to formal 
specification (purely functional and strongly typed) yet provides me with 
the full power of a programming language in exploring the patterns of 
inference rule definition.  This gives me a lot of flexibility, and raises 
questions of how [not] to use it.

My basic pattern for inference is a query (the rule antecedent, used as a 
query against a given graph) which yields a set of sets of variable 
bindings.  These are then back-substituted into the rule consequent to 
yield deductions that can be merged with the original graph.  I guess this 
is a common enough pattern.  There's a similar pattern for 
backward-chaining a rule.  The 
"clever stuff" is handled by filtering and manipulation of the variable 
bindings that result from a query.  So, for example, to implement rule 
'rdfr2' from the RDF semantics specification [3], I can use an antecedent 
of the form:
     ?a ?p ?l .
and impose an additional condition on the variable binding that ?l must be 
bound to an XML literal, which yields the consequent:
     ?a ?p ?b .
     ?b rdf:type rdf:XMLLiteral .
The "allocated to" aspect of this rule (cf. [3]) is provided by a variable 
binding modifier that creates a new binding for ?b for each distinct 
binding of ?l.  This is all handled by specific Haskell code.

(For the curious, the rule specification in Haskell looks like this:
[[
rdfr2 = makeRDFClosureInstanceRule scopeRDF "r2"
             "?x  ?a ?l . "
             "?x  ?a ?b . ?b rdf:type rdf:XMLLiteral ."
             (isXMLLit "?l")
             (allocateTo "?b" "?l")
]]
)
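
The query / filter / modify / back-substitute pattern described above can 
be sketched in a few lines of self-contained Haskell.  This is a toy model 
under stated assumptions, not the Swish code: the Term, Triple and Binding 
types, and the functions query, forward and subst, are hypothetical, and 
the bodies of isXMLLit and allocateTo are guesses at behaviour based only 
on the description above.

```haskell
import Data.List (nub)
import Data.Maybe (fromMaybe, mapMaybe)

-- Hypothetical, simplified term model (not the Swish types).
data Term = Res String | Lit String | XmlLit String | Blank String | Var String
  deriving (Eq, Show)

type Triple  = (Term, Term, Term)
type Binding = [(String, Term)]   -- variable name -> bound term

-- Match a pattern term against a graph term, extending the binding.
matchTerm :: Binding -> Term -> Term -> Maybe Binding
matchTerm b (Var v) t = case lookup v b of
  Nothing -> Just ((v, t) : b)
  Just t'
    | t' == t   -> Just b
    | otherwise -> Nothing
matchTerm b p t
  | p == t    = Just b
  | otherwise = Nothing

matchTriple :: Binding -> Triple -> Triple -> Maybe Binding
matchTriple b (ps, pp, po) (s, p, o) =
  matchTerm b ps s >>= \b1 -> matchTerm b1 pp p >>= \b2 -> matchTerm b2 po o

-- Query: every binding under which all antecedent patterns occur in the graph.
query :: [Triple] -> [Triple] -> [Binding]
query pats graph = foldl step [[]] pats
  where
    step bs pat = [ b' | b <- bs, t <- graph, Just b' <- [matchTriple b pat t] ]

-- Binding filter: keep bindings where the variable is bound to an XML literal.
isXMLLit :: String -> Binding -> Bool
isXMLLit v b = case lookup v b of
  Just (XmlLit _) -> True
  _               -> False

-- Binding modifier: bind 'new' to one fresh blank node per distinct value of 'old'.
allocateTo :: String -> String -> [Binding] -> [Binding]
allocateTo new old bs = mapMaybe extend bs
  where
    distinct = nub (mapMaybe (lookup old) bs)
    table    = zip distinct [ Blank ("b" ++ show n) | n <- [(1 :: Int) ..] ]
    extend b = do
      t  <- lookup old b
      bn <- lookup t table
      return ((new, bn) : b)

-- Back-substitute a binding into a consequent term.
subst :: Binding -> Term -> Term
subst b (Var v) = fromMaybe (Var v) (lookup v b)
subst _ t       = t

-- Forward chaining: query, filter, modify, then instantiate the consequent.
forward :: [Triple] -> [Triple] -> (Binding -> Bool)
        -> ([Binding] -> [Binding]) -> [Triple] -> [Triple]
forward ante cons keep modify graph =
  nub [ (subst b s, subst b p, subst b o)
      | b <- modify (filter keep (query ante graph)), (s, p, o) <- cons ]

-- The rule from the specification above, restated in this toy model.
rdfr2 :: [Triple] -> [Triple]
rdfr2 = forward
  [(Var "x", Var "a", Var "l")]
  [(Var "x", Var "a", Var "b"), (Var "b", Res "rdf:type", Res "rdf:XMLLiteral")]
  (isXMLLit "l")
  (allocateTo "b" "l")
```

Applied to a graph, rdfr2 returns only the new deductions; merging them 
with the original graph is left to the caller.  Keeping the filter and the 
modifier as ordinary function arguments is what lets datatype-specific 
behaviour be added without touching the query machinery.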
</background>

...

It seems to me that under datatyped interpretations additional entailments 
are valid, hence additional inference rules can be used.  I imagine that 
for each datatype recognized, there are some additional inference rules one 
might use.  And yet more for cross-datatype inferences when combinations of 
datatypes are recognized.  I'm trying to understand implementation 
strategies for datatypes in these terms.

My understanding of CWM is that datatype-related inferences are accessed by 
"magic" built-in properties that are used only in the antecedent of a 
rule.  It seems that you thereby have a notation for defining some kinds of 
inference rule schema;  e.g.

  { :vehicle :standingCapacity ?x ; :seatedCapacity ?y .
    (?x ?y) math:plus ?z . }
=>
  { :vehicle :totalCapacity ?z . }

which may be viewed as describing a family of inference rules:

{ :vehicle :standingCapacity 1 ; :seatedCapacity 1 . }
=> { :vehicle :totalCapacity 2 . }

{ :vehicle :standingCapacity 1 ; :seatedCapacity 2 . }
=> { :vehicle :totalCapacity 3 . }

  :
etc.
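
One way to read the schema is as a function on variable bindings: each 
binding of ?x and ?y for which the sum is defined picks out one member of 
the family.  A minimal Haskell sketch, assuming bindings map variable names 
to integer lexical forms; the names Binding, intVar, plusRule and 
totalCapacity are hypothetical and do not describe how CWM implements its 
built-ins.

```haskell
import Text.Read (readMaybe)

-- Hypothetical simplification: variable name -> lexical form.
type Binding = [(String, String)]

-- Read a variable's binding as an integer, if it has one.
intVar :: String -> Binding -> Maybe Integer
intVar v b = lookup v b >>= readMaybe

-- Stand-in for "(?x ?y) math:plus ?z": extend the binding with ?z = ?x + ?y.
-- Returns Nothing when ?x or ?y is unbound or not an integer, so the
-- corresponding rule instance simply does not exist.
plusRule :: String -> String -> String -> Binding -> Maybe Binding
plusRule x y z b = do
  vx <- intVar x b
  vy <- intVar y b
  return ((z, show (vx + vy)) : b)

-- The value substituted for ?z in the consequent
-- { :vehicle :totalCapacity ?z . }
totalCapacity :: Binding -> Maybe String
totalCapacity b = plusRule "x" "y" "z" b >>= lookup "z"
```

Each binding such as [("x","1"),("y","2")] then corresponds to one concrete 
rule in the family listed above.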

...

In thinking about this, the questions I ask include:

On the basis of experience with implementing CWM, does this seem a 
reasonable viewpoint?

What are alternative viewpoints concerning the role of datatypes in RDF?

Does the above characterization have any bearing on the view sometimes 
expressed that properties have a somehow more fundamental role in the 
meaning of RDF?  (I ask this because special properties seem to be the key 
mechanism whereby new "understanding" is added to CWM.)

Is CWM able to detect datatype clashes in any circumstances?
E.g. the graph:
     <ex:a> <ex:b> "25"^^xsd:decimal .
     <ex:b> rdfs:range xsd:string .
is unsatisfiable in an {xsd:decimal,xsd:string}-interpretation, because 
xsd:decimal and xsd:string have disjoint value spaces.  And if so, what 
does CWM do about it, if anything?
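
To show the kind of check meant here, the following is a minimal Haskell 
sketch that flags a typed literal whose datatype is disjoint from a 
declared range.  It is a toy model only: the Term type, the disjoint table 
and the clashes function are hypothetical, and it says nothing about what 
CWM actually does.

```haskell
-- Hypothetical term model: a URI, or a typed literal (lexical form, datatype).
data Term = Uri String | Typed String String
  deriving (Eq, Show)

type Triple = (Term, Term, Term)

-- Datatype pairs assumed to have disjoint value spaces.  Only the pair
-- from the example is listed; a real table would cover more datatypes.
disjoint :: String -> String -> Bool
disjoint a b = (a, b) `elem` pairs || (b, a) `elem` pairs
  where
    pairs = [("xsd:decimal", "xsd:string")]

-- Each result pairs a typed-literal triple with the rdfs:range triple it
-- contradicts.  An empty result means no clash was found by this check,
-- not that the graph is satisfiable.
clashes :: [Triple] -> [(Triple, Triple)]
clashes g =
  [ (t, r)
  | t@(_, p, Typed _ dt) <- g
  , r@(p', Uri "rdfs:range", Uri rng) <- g
  , p == p'
  , disjoint dt rng ]
```

On the two-triple graph above, clashes reports one pair: the literal triple 
together with the range triple.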

...

My current thinking is that the extension of inference rules might be 
achieved by functions applied to the variable bindings rather than 
additional antecedent terms.  But I'm still trying to figure out how to 
abstract out the common idea of (say) '+' applied to numbers.
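
One possible shape for that abstraction, sketched in Haskell: describe a 
datatype by its lexical-to-value and value-to-lexical mappings, and lift an 
ordinary binary function through those mappings to act on variable 
bindings.  Everything here is an assumption made for illustration: the 
Datatype record, liftBinary, and the xsdInteger and xsdDouble values are 
hypothetical, and Haskell's read and show only approximate the XML Schema 
lexical forms.

```haskell
import Text.Read (readMaybe)

-- Hypothetical simplification: variable name -> lexical form.
type Binding = [(String, String)]

-- A datatype as a pair of mappings between lexical forms and values.
data Datatype v = Datatype
  { toValue   :: String -> Maybe v   -- lexical-to-value mapping
  , toLexical :: v -> String         -- a lexical form for a value
  }

xsdInteger :: Datatype Integer
xsdInteger = Datatype readMaybe show

xsdDouble :: Datatype Double
xsdDouble = Datatype readMaybe show

-- Lift a binary function on values to a binding modifier: read ?x and ?y
-- through the datatype, apply the function, and bind ?z to the result.
liftBinary :: Datatype v -> (v -> v -> v)
           -> String -> String -> String -> Binding -> Maybe Binding
liftBinary dt op x y z b = do
  vx <- lookup x b >>= toValue dt
  vy <- lookup y b >>= toValue dt
  return ((z, toLexical dt (vx `op` vy)) : b)

-- The same '+' instantiated for two datatypes.
plusInteger, plusDouble :: String -> String -> String -> Binding -> Maybe Binding
plusInteger = liftBinary xsdInteger (+)
plusDouble  = liftBinary xsdDouble (+)
```

The common idea of '+' then lives in one place (the function passed to 
liftBinary), while each datatype contributes only its two mappings.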

#g
--

[1] http://www.ninebynine.org/Software/Intro.html#Swish

(This is an old and very basic version of the software.  I've added much 
more but it is, metaphorically speaking, still in bits all over the garage 
floor, hence in no useful state to publish.)

[2] http://www.w3.org/2000/10/swap/doc/cwm.html

[3] http://www.w3.org/TR/rdf-mt/#RDFRules


------------
Graham Klyne
GK@NineByNine.org

Received on Friday, 3 October 2003 14:11:12 UTC