Re: range, domain: Conjunctive AND disjunctive semantics both supportable from Wolfram Conen on 2000-10-01 (www-rdf-interest@w3.org from October 2000)

From: Wolfram Conen <conen@wi-inf.uni-essen.de>
Date: Sun, 01 Oct 2000 19:29:58 +0200
To: Jan Grant <Jan.Grant@bristol.ac.uk>
CC: www-rdf-interest@w3.org
Message-ID: <39D77496.9C736544@wi-inf.uni-essen.de>
Dear Jan and dear RDFS folks interested in range constraint discussion,

We'll try to give some general semantics for range constraints
below. Read on only if you like precise semantics and are still
interested in further thoughts on range constraints.

To start, let's briefly recapitulate some facts from RDFS:

(1) At most one range constraint is allowed. 

(2) There are only two "distinguished" sets of entities, LITERALS and
    RESOURCES.

(3) There is a SUBCLASSING "operation" with the semantics:
    instanceOf(x,A) AND subClassOf (A,B) => instanceOf(x,B)

We will make two claims in the following:

(C1) Disjunctive/conjunctive range constraints are not all that would be
needed! General set-based range constraints should be able to constrain
the range to an arbitrary set-algebraic expression.

(C2) (Exactly) one range constraint is sufficient to express any
set-related range constraint, if sets can be constructed with
set-algebraic operations.

and draw something like a conclusion, outlining a solution that conforms
to the RDFS spec and a "semantically" spiced-up solution (that can still
be expressed in 3-ary triples) at the end of the email (Jan, is this
logic stuff really a burden? Isn't it simply a nice vehicle to "say what
one means" - with fast implementations and standardized languages
available?)

Best regards,
     Wolfram    and Reinhold   
     conen@gmx.de,  klapsing@wi-inf.uni-essen.de

-- Claims:

Claim 1:
--------
Consider the following situation: There are two classes, C1 and C2, and a
property P. What set-related range constraints are possible? (n stands
for intersection, u for union, \ for difference, ! for not)

for (x,p,y), range(p,Exp) may constrain y to be from
    Exp := C1 u C2   (y in C1 OR y in C2)
    Exp := C1 n C2   (y in C1 AND y in C2)
    Exp := C1 \ C2   (y in C1 AND y NOT in C2)
    Exp := C2 \ C1   (y in C2 AND y NOT in C1)
    Exp := (C1 \ C2) u (C2 \ C1)   (y in C1 XOR y in C2)

Additionally, the "environment" (complement) of a set C1 can be considered.
    Exp := !(C1)   (y not in C1)
This can be extended to two classes, C1, C2, for example:
    Exp := !(C1 u C2) etc. (not necessary to mention
    it specifically, see below)
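
As a small illustration (not part of RDFS), the candidate range
expressions above can be written with Python's built-in set operators.
The member names and the universe U are invented sample data; taking the
complement relative to U is an assumption, since the "environment" is
not pinned down in the text.

```python
# Hypothetical class extensions; the member names are invented for illustration.
C1 = {"a", "b", "c"}
C2 = {"c", "d"}
# Assumed universe of all known entities, needed to give "!" a meaning.
U = C1 | C2 | {"e"}

# Each candidate range expression, keyed by the notation used in the text.
candidate_ranges = {
    "C1 u C2": C1 | C2,
    "C1 n C2": C1 & C2,
    "C1 \\ C2": C1 - C2,
    "C2 \\ C1": C2 - C1,
    "(C1 \\ C2) u (C2 \\ C1)": C1 ^ C2,  # symmetric difference (XOR)
    "!(C1)": U - C1,
    "!(C1 u C2)": U - (C1 | C2),
}


def satisfies(y, expression):
    """Return True if y lies in the set denoted by the named expression."""
    return y in candidate_ranges[expression]
```

Every case is just a different set expression over the same classes,
which is the point of Claim 1: fixing one interpretation for "multiple"
range constraints covers only one row of this table.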


I guess that useful examples can be found for all cases. Accepting this
means that allowing "multiple" range constraints AND fixing a certain
interpretation of them (let's say C1 n C2) makes it difficult to handle
the other cases (which I would not like).

A solution could be to introduce specific range constraints or range
constraint types for all of the above cases. This is, however,
problematic, because it does not scale very well to range dependencies
among 3,4,...,n classes. The next claim may offer a solution:

Claim 2: 
--------
Only one range constraint is sufficient if it is possible to "construct"
classes (or "class expressions") from other classes. In this case, each
range constraint can point to exactly one class and the "construction"
of the class gives the constraint. To do this, the "Exp" above should be
the constructed class and the construction expression is given on the
right hand side of the above expressions. [By the way: "no range
constraint" in RDFS now has the meaning of "Exp := Literal u Resource"
(y from Literal OR Resource)] The problem here is that, right now, in
the RDFS basic semantics, it is only possible to express the class
relations (2) and (3) (from the facts above) - and
this is not very SHARP:

If we say that range(p,C1), and C1 and C2 are subclasses of a common
parent C, and we know that y is from C2, we cannot conclude that the
range constraint is violated, because y can easily be a member of C1 too
(declared somewhere else). We are not able to specify that C1 n C2
should be an empty set (i.e. that C1 and C2 are a partial partition of
C). If we were able to specify this, we would have much more reason to
believe that y not in C1 follows from y in C2. (Note that y in C1 could
still be stated "somewhere" else, even in our own model; in this case,
we would have a violation of the "empty intersection constraint", i.e.
the model would be inconsistent.) The only case where the assumption
that two classes have an empty intersection is justified is with
LITERALS and RESOURCES (that is, if x is in LITERALS then it "should
not" be in RESOURCES; otherwise, the basic consistency assumption would
be violated). In this case, a range constraint could be "checked"
meaningfully. In all other cases, it works only with a "complete
knowledge assumption" (closed world).
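
The distinction drawn above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function
name check_range, the three result strings, and the representation of
"empty intersection" declarations as frozenset pairs are all made up
here and are not part of RDFS.

```python
def check_range(y_types, range_class, disjoint_pairs, closed_world=False):
    """Classify a range check as 'satisfied', 'violated', or 'unknown'.

    y_types        -- set of classes y is known to belong to
    range_class    -- the class named by the range constraint
    disjoint_pairs -- set of frozensets {A, B} declared to have an
                      empty intersection (hypothetical declaration)
    closed_world   -- if True, treat missing type statements as false
    """
    if range_class in y_types:
        # Consistency against the disjointness declarations is not
        # checked here; membership alone satisfies the constraint.
        return "satisfied"
    for known_type in y_types:
        if frozenset((known_type, range_class)) in disjoint_pairs:
            # A "sharp" fact: y cannot also be in range_class.
            return "violated"
    # Without disjointness, only a closed-world reading yields a violation.
    return "violated" if closed_world else "unknown"
```

With no disjointness declared, check_range({"C2"}, "C1", set()) gives
"unknown"; declaring C1 and C2 disjoint, or switching to closed_world,
turns the same input into "violated".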

-- End of Claims.

-- Conclusion:

A "nice-to-have" would be (for example)
  (C1,rdf:type,rdfs:Class)
  (C2,rdf:type,rdfs:Class)
  (A,rdf:type,ConstructedClass)
  (A,isConstructedFrom,"C2 \ C1")

  (P, rdfs:range, A)

  Then, for an object X of P, (X, rdf:type, C1) would clearly violate
  the range constraint.
  Other cases are not so sharp - substitute:

  (A,isConstructedFrom,"C2 n C1")
  (X, rdf:type, B)
  
Here, again, the problem is that we do not know anything about the
relation between A and B and thus can only infer a range violation if
we assume that the model is "complete" in the sense that if X were
in A, this would have been said (first assumption) and brought to our
attention (second assumption). However, if we knew something
"sharp" (i.e., empty intersection) about the relation between B and A
(resp. C1, C2), it might be possible to infer something reasonable.

The above solution would CONFORM to the RDF/RDFS spec if the object of
'isConstructedFrom' (here "C2 \ C1") is a literal; that is to say, it
could be used in a vocabulary or in models. However, it would require an
application-level check of range/class construction semantics. This is
probably not really nice, because range constraints seem to be too
important to leave their semantics to "proprietary" vocabularies and
interpretations, but this might be a matter of taste.

However, requiring the object of 'isConstructedFrom' to be a literal
does not seem to be a very nice solution with respect to the quality of
models. In fact, the property 'isConstructedFrom' denotes a multi-ary
relation between classes. This could (generally) be transformed into a
sequence of (3-ary) "atomic" set-algebraic operations (leading
to 4-tuples):

  Example: A = (C1 n C2) \ C3
  
  ( A1, intersection, [C1,C2] )
  ( A,  difference, [A1, C3] )
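
The transformation into atomic steps can be sketched as follows. This
is a minimal illustration, not a proposed implementation: the function
name decompose, the nested-tuple input format, and the naming scheme
for intermediate classes (target name plus a counter) are assumptions
made for this example.

```python
import itertools


def decompose(expr, target):
    """Flatten a nested (op, left, right) expression into atomic steps.

    expr   -- a class name (str) or a tuple (op, left, right)
    target -- name to give the class denoted by the whole expression
    Returns a list of (result, op, [left, right]) steps, innermost first.
    """
    counter = itertools.count(1)
    steps = []

    def visit(node, name=None):
        if isinstance(node, str):
            return node  # a plain class needs no construction step
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        # Intermediate classes get generated names such as "A1", "A2".
        result = name if name is not None else f"{target}{next(counter)}"
        steps.append((result, op, [left_name, right_name]))
        return result

    visit(expr, target)
    return steps
```

Calling decompose(("difference", ("intersection", "C1", "C2"), "C3"), "A")
yields the two steps of the example above, A1 first and then A.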

In RDF, this would be expressible using reification and a suitable
interpretation of the reified statements:

  ( A1, rdf:type, rdf:Statement )
  ( A1, rdf:subject, C1 )
  ( A1, rdf:predicate, rdfsets:intersection)
  ( A1, rdf:object, C2 )

  ( A, rdf:type, rdf:Statement )
  ( A, rdf:subject, A1 )
  ( A, rdf:predicate, rdfsets:difference)
  ( A, rdf:object, C3 )

  and (for convenience in the definition of semantics)
  ( A1, rdf:type, rdfs:Class )
  ( A,  rdf:type, rdfs:Class )

Leading to the possibility to express set_algebraic_range(p,A) as:

  ( p, rdfsets:range, A)

The semantics, building upon the basic rules given in 
 http://nestroy.wi-inf.uni-essen.de/rdf/logical_interpretation/
could then be:

  in_range(X,P) :- set_algebraic_range(P,A), instanceOfSet(X,A).

  with

  set_algebraic_range(P,A) :- statement(P, rdfsets:range, A).

  /* If it is a 'reifying' class */
  instanceOfSet(X,A) :- Class(A), reifies(A,S,P,O), in(X,S,P,O).

  /* If it is a 'normal' class */
  instanceOfSet(X,A) :- instanceOf(X,A).

  in(X,S,P,O) :- 
        P = rdfsets:difference, instanceOfSet(X,S), NOT instanceOfSet(X,O).

  in(X,S,P,O) :- 
        P = rdfsets:union, instanceOfSet(X,S).

  in(X,S,P,O) :- 
        P = rdfsets:union, instanceOfSet(X,O).

  in(X,S,P,O) :- 
        P = rdfsets:intersection, instanceOfSet(X,S), instanceOfSet(X,O).
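
For readers who prefer something executable, the rules above can be
mirrored in Python over a plain set of (s, p, o) triples. This is a
rough sketch under several assumptions: make_checker and its helper
names are invented here, the "rdfsets:" prefix follows the reified
triples earlier in this message, NOT is read as closed-world negation,
subclass closure is omitted, and cyclic class constructions are not
guarded against.

```python
RDF_TYPE = "rdf:type"


def make_checker(triples):
    """Build in_range(x, p) over a collection of (s, p, o) triples."""
    facts = set(triples)

    def objects(s, p):
        return [o for (s2, p2, o) in facts if s2 == s and p2 == p]

    def reifies(a):
        """Return (S, P, O) if a reifies a statement, else None."""
        if (a, RDF_TYPE, "rdf:Statement") not in facts:
            return None
        s = objects(a, "rdf:subject")
        p = objects(a, "rdf:predicate")
        o = objects(a, "rdf:object")
        return (s[0], p[0], o[0]) if s and p and o else None

    def instance_of_set(x, a):
        # 'Normal' class: a plain rdf:type statement.
        if (x, RDF_TYPE, a) in facts:
            return True
        # 'Reifying' class: evaluate the operation it describes.
        parts = reifies(a)
        if parts is None or (a, RDF_TYPE, "rdfs:Class") not in facts:
            return False
        s, p, o = parts
        if p == "rdfsets:difference":
            return instance_of_set(x, s) and not instance_of_set(x, o)
        if p == "rdfsets:union":
            return instance_of_set(x, s) or instance_of_set(x, o)
        if p == "rdfsets:intersection":
            return instance_of_set(x, s) and instance_of_set(x, o)
        return False

    def in_range(x, p):
        return any(instance_of_set(x, a) for a in objects(p, "rdfsets:range"))

    return in_range
```

Fed with the reified triples for A = (C1 n C2) \ C3 and
( p, rdfsets:range, A), in_range(x, "p") holds for an x typed as C1
and C2 but not C3, and fails once x is also typed as C3.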


Hm, this looks pretty much like what I want to have - what do you think
about it, dear reader?  (Well, it would be nice to be able to
"interchangeably" specify the semantics of "new" predicates on top of the
basic rule set - probably with a new "basic" semantic predicate, such as
('new_property', rdf:isInterpretedAs, "rule set") which simply takes the
S and O of (S,'new_property',O)-triples and "moves" them into the "rule
set" specification, for example take the new rules from above and write 

        <ruleset>
         <rule>
          <head>
           <goal>in_range
            <term>$Object</term>
            <term>P</term> 
           </goal>
          </head>
          <body>
           <goal>set_algebraic_range
            <term>P</term>
            <term>A</term>
           </goal>
           <goal>instanceOfSet
            <term>$Object</term>
            <term>A</term>
           </goal>
          </body>
         </rule>
         ...
        </ruleset>

instead of "rule set".) Would someone like this?
Received on Sunday, 1 October 2000 13:18:15 UTC