Re: Cardinality in the open world

Hi,

Thanks for the comments.  I'll try to clarify, and perhaps change the 
way I presented the question.

On 06/04/2005, at 3:41 PM, Bijan Parsia wrote:
>> I've been having some difficulty understanding the use of OWL 
>> cardinality with the open world assumption, and I'd like some advice 
>> please.
>>
>> While I know that the open world assumption means that any 
>> unspecified statements are "unknown".
>
> What do you mean by "unspecified"?

I'm thinking in terms of a set of RDF statements, such as those found 
in an RDF database.  An "unspecified" statement is one that does not 
appear in the database.

>>  I interpret this to mean that it is possible for any unwritten 
>> statement
>
> But there are known unwritten statements. Most entailments, for 
> example.

My ultimate purpose here is to try to work out which are valid 
entailments.  I am also trying to recognise when a set of statements 
contains a contradiction.

>>   (If I'm wrong here, then let me know as the rest of this message is 
>> based on this assumption).
>
> This doesn't seem to be the most helpful way to conceive of it. Better 
> to think in terms of model theory (though, it's possible that your 
> version will line up with the model theoretic one...it's just nicer to 
> use standard terminology).

Unfortunately I don't have a background in model theory.  Can you 
suggest a primer for me?

I'll try and rephrase in terms of "unknown" conditions:

minCardinality of 0:
There can be no contradictions with this statement (maxCardinality 
cannot be < 0)

minCardinality of 1:
Describes existence.  Any statements including this predicate are 
consistent (contain no contradictions).

>> minCardinality of 1:
>> This describes existence.  Any statements with this predicate make 
>> the model valid.
>
> Not if there were a maxCardinality of 0.

True.  However, I would distinguish that as an inconsistency in the 
TBox data.  While these inconsistencies are important to find, I want 
to assume that the TBox information is correct, and I am looking for 
inconsistencies in the ABox data in relation to the TBox data.

In a practical context, I'd like to check an ontology for 
contradictions upon entry to the database.  I don't think it would make 
sense to check for consistency and perform entailments with an 
inconsistent ontology.  I take your point though.  Please excuse my 
assumptions.  :-)

Re: minCardinality 1
>>   However, if there are no statements with the predicate then the 
>> model is still consistent, as those statements could exist.

Reword this to: If there are no statements with the predicate then the 
consistency of this is "unknown".  This is because the database does 
not contain any statements to make this restriction consistent, so the 
existence of these statements is "unknown".

Consequently, this cannot be called an error (due to inconsistency) 
because the consistency cannot be determined.


>> minCardinality > 1:
>> If there are not enough statements using the predicate, then the 
>> model will still be consistent because those statements could exist.

Reword to: If there are not enough known statements using the 
predicate, then consistency is "unknown".  Therefore no error is found.

Incidentally, you can probably tell from my working here, but I'm 
making the assumption that the above statement is in relation to a 
single subject.  All of the tests I describe here need to be performed 
over each subject-predicate in turn.

> Not necessarily. For example, and still trivial, say you had a 
> minCard = 4 and a maxCard = 2. Oops.

Yes, though this is the TBox error I mentioned above.

>> The only case where this could not be consistent would be if it is 
>> not legal to create the required statements.  The only instance of 
>> this that I can think of is if the range of the predicate is 
>> restricted in some way, for instance it could be a oneOf without 
>> enough members.
>
> See above.

Yes, this is another TBox error.  Since I was going to pick up on this 
one, I should have also mentioned the minCardinality > maxCardinality 
error as well, so I see why you are pointing it out.

>>   However, that case would be a fault in the ontology, not in the 
>> data.
>
> I don't see what this means and the degree that I do, I don't see that 
> it matters.

There is a difference between data contradicting its ontology and the 
ontology contradicting itself.  When the ontology contradicts itself 
then it is impossible to insert data into a database that will work 
with the ontology.  When an ontology is consistent, then it becomes 
useful, and it can be used on data which is inserted into the database. 
  In this case "used" means to check for consistency, and to perform 
entailments.

This has a practical application, as ontology data *can* be considered 
separately inside a database (though this is not required, since OWL is 
expressed in RDF).  Often an ontology will be written once, and used 
many times on different sets of data.  In this case, the ontology 
should be recognised to be inconsistent as soon as it is presented to 
the database, rather than when it is applied to a set of data.
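
As a rough illustration of checking the ontology when it is loaded, the 
sketch below flags only the trivial conflict mentioned earlier 
(minCardinality greater than maxCardinality on the same class and 
property).  The dictionary layout and the function name are hypothetical, 
not part of any RDF database API, and a real reasoner would have to 
catch many more kinds of TBox conflict (someValuesFrom, oneOf ranges, 
and so on).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory form of the cardinality restrictions:
# (class name, property name) -> (minCardinality or None, maxCardinality or None)
Restrictions = Dict[Tuple[str, str], Tuple[Optional[int], Optional[int]]]


def find_tbox_conflicts(restrictions: Restrictions) -> List[Tuple[str, str]]:
    """Return the (class, property) pairs whose min exceeds their max."""
    conflicts = []
    for key, (min_card, max_card) in restrictions.items():
        # Only a pair with both bounds present can clash in this simple check.
        if min_card is not None and max_card is not None and min_card > max_card:
            conflicts.append(key)
    return sorted(conflicts)
```

Run once when the ontology arrives, a check like this reports the 
problem before any instance data is involved.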

>> For validity, it may seem easy to conform if there are enough 
>> statements with the predicate.
>
> What? Standard terminology will not only help your communication (with 
> me at least :)) but I suspect your understanding.

Going back to my other email... validity means that a system evaluates 
to "true" under every interpretation.

To reword this: It may appear that there are no unknown statements 
which could make the system inconsistent.  The system therefore 
*appears* consistent.

>> However, if any objects from these statements use owl:sameAs to 
>> declare that they are the same, the real usage of this predicate will 
>> be reduced, making the model invalid.
>
> Still lost.

Reword: It is usually unknown if there are statements which include the 
owl:sameAs predicate.  This means that any of the objects from declared 
statements could in fact be the same object (this is unknown).  The 
effective number of times that the predicate is in use for the subject 
would be reduced in this case.  Consequently, what appeared to be 
consistent is instead unknown.

As an example:
<owl:Class rdf:ID="MyClass">
   <rdfs:subClassOf>
     <owl:Restriction>
       <owl:onProperty rdf:resource="#myPredicate" />
       <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:minCardinality>
     </owl:Restriction>
   </rdfs:subClassOf>
</owl:Class>

<ns:MyClass rdf:ID="subject">
   <ns:myPredicate rdf:resource="#object_1"/>
   <ns:myPredicate rdf:resource="#object_2"/>
</ns:MyClass>

This looks consistent at face value.  However, it is unknown if the 
following holds:
<owl:Thing rdf:ID="object_1">
   <owl:sameAs rdf:resource="#object_2"/>
</owl:Thing>

This then means that the consistency of the usage of "myPredicate" on 
"subject" is unknown.

>> The only way validity can be guaranteed is if enough of the objects 
>> are declared to be different from the others, via owl:differentFrom 
>> or owl:AllDifferent.
>
> There could be unknown distinct individuals. Part of what a reasoner 
> does is consider such models where the individuals that are members of 
> the restriction are related by the property to distinct individuals.

What I meant is that the only way to guarantee consistency is for all 
of the objects to be guaranteed to be unique.  The only way to do that 
is if they are described to be different from each other.  That way we 
know that the owl:sameAs statement above does not exist.
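
To make the counting concrete, here is a minimal sketch that works out 
how many distinct objects a subject-predicate pair is *known* to have 
and how many it *might* have, given whatever owl:sameAs and 
owl:differentFrom statements are present.  The function name and the 
plain string identifiers are assumptions for illustration.  It uses a 
brute-force search, so it only suits small sets of objects, and it 
assumes every name in the pairs also appears in the object list.

```python
from itertools import combinations


def distinct_bounds(objects, same_as=(), different_from=()):
    """Return (lower, upper) bounds on the number of distinct objects."""
    parent = {o: o for o in objects}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge objects that are stated to be the same.
    for a, b in same_as:
        parent[find(a)] = find(b)

    classes = sorted({find(o) for o in objects})
    different = {frozenset((find(a), find(b))) for a, b in different_from}

    # Upper bound: nothing else has been merged, so each class may be distinct.
    upper = len(classes)

    # Lower bound: the largest group whose members are all pairwise
    # declared different (a single object counts as a group of one).
    lower = 0
    for size in range(len(classes), 0, -1):
        if any(
            all(frozenset(pair) in different for pair in combinations(group, 2))
            for group in combinations(classes, size)
        ):
            lower = size
            break
    return lower, upper
```

For the example above, the two objects alone give bounds of (1, 2), so 
a minCardinality of 2 is not known to be met.  Adding a differentFrom 
statement between them gives (2, 2).  A lower bound that exceeds a 
maxCardinality is the one case where the data can be shown to conflict.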

>> So for all 3 cases, the model is always consistent.

So now I'm saying that for all 3 cases the consistency is always true 
or unknown.  Given a consistent ontology, it is impossible to detect 
inconsistency.  Therefore, what is owl:minCardinality giving us?  We 
cannot use it to detect a problem in the data (I'm referring to ABox 
data only).

> If *all* you say is that minCard = something, then sure. But so? if 
> *all* I say is that something is B or Q, so too. An underconstrained 
> ontology is easy to make consistent.

Can you give examples of a set of constraints that will make 
minCardinality useful?

(I'm not saying that it is useless, but I can't see the use at all).

>> Validity is guaranteed for cardinality of 0, possible with 
>> cardinality of 1, and difficult for cardinality of more than 1.
>
> I don't know what you mean by "validity".

A guarantee of no conflicts, no matter what statements are introduced.  
But I've backed away from that terminology.  :-)

>> owl:maxCardinality is similar:
>>
>> maxCardinality of 0:
>> If the predicate is not used, then there is an interpretation
>
> Ah...interpretation. That's sounding nicer to me :)
>
>> where the model is consistent.  However, since there may exist 
>> statements which use the predicate, then the model can't be valid.

OK, this is wrong.  I should be able to assume in the open world that 
there will be no other statements to conflict with the maximum 
cardinality constraint.  So if the database contains statements with 
this predicate then the system is inconsistent, otherwise it is 
consistent.  This is useful.

> Or I might have a minCardinality of 45. Or a someValuesFrom.

Again, this sort of thing is an inconsistent ontology.  It's something 
that needs to be detected, but is not what I'm looking for here.

>> maxCardinality >= 1:
>> If the predicate is used fewer times than the maxCardinality, then 
>> this is consistent.
>
> Not necessarily. Consider a min conflict. Or consider the case where 
> you have maxCard = 1 and someValuesFrom A, and someValuesFrom 
> complementOf(A)

Well the ABox data is definitely consistent here.  Of course, there are 
several constructs where the TBox data can be inconsistent.

I should have recognised that owl:maxCardinality = 1 has a different 
usage than consistency.  In this case it can be used to entail that 
objects are the same (unless they are already specified as different, 
in which case the inconsistency can be found).
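
A small sketch of that behaviour for maxCardinality = 1, under the 
assumption that objects are plain string names and that 
owl:differentFrom statements are given as pairs (the function name is 
made up for this example):

```python
from itertools import combinations


def max_cardinality_one(objects, different_from=()):
    """For one subject-predicate pair restricted to maxCardinality 1,
    return (entailed_same_as, clashes).

    entailed_same_as: pairs that must denote the same individual.
    clashes: pairs declared different, which makes the data inconsistent.
    """
    different = {frozenset(pair) for pair in different_from}
    entailed, clashes = [], []
    for a, b in combinations(sorted(set(objects)), 2):
        if frozenset((a, b)) in different:
            clashes.append((a, b))
        else:
            entailed.append((a, b))
    return entailed, clashes
```

If the clash list is non-empty the data is inconsistent, so the 
entailed pairs are of no further interest in that case.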

>>   However, there may be more statements, except when the range is 
>> restricted (eg. with owl:oneOf), which means that validity can rarely 
>> be proven.
>
> Validity? I still don't know what you mean.

What I really meant is that the system is guaranteed consistent, 
regardless of open world unknowns.

The rest of my maxCardinality questions were based on the erroneous 
supposition that "unknown" statements from the open world could cause 
an inconsistency (this was when I was talking about validity).  I've 
now realised that this can't happen, by definition.  So I withdraw my 
concern that maxCardinality could be made inconsistent when it starts 
out looking consistent.

However, it still appears that the only way to detect an ABox 
inconsistency with respect to maxCardinality >= 1 is when each object 
is different from every other object (declared with owl:differentFrom 
or owl:AllDifferent).  This seems a little onerous.  I say that because 
most systems will not go to this trouble (at least, not any that I've 
seen).  In the case where the objects have not been differentiated then 
it is not possible to check for inconsistency.

>> As for consistency, minCardinality is *always* consistent.
>
> For most readings, false.

I can accept that.  Let me say instead: As for consistency, 
minCardinality is *never* inconsistent.

(Remember, I'm presuming a consistent ontology.  I'm not talking about 
TBox conflicts, such as maxCardinality < minCardinality, or a range of 
a class with fewer members than the minCardinality.)

>>  maxCardinality is almost always consistent as well (the model needs 
>> to go to a lot of trouble with owl:differentFrom to be inconsistent).
>
> Not true.

OK, instead let me say: maxCardinality is almost never inconsistent.  
Inconsistency will only occur after a lot of owl:differentFrom 
statements make enough objects unique.

>> If my interpretation here is correct, then these cardinality 
>> constraints would not appear to be as useful as they seem.
>
> Well, cardinality in OWL isn't like cardinality in XML Schema or in a 
> database or in OOP. The open world assumption can be surprising.

This brings me back to the real world, and helps explain why I treat 
ABox consistency with respect to the TBox as a separate function from 
TBox consistency.  ABox consistency is all about having the correct 
number of statements.  When users include these restrictions they are 
interested in making sure that the system has "at least" a certain 
number of statements, or "at most" some number.  Users would like to 
make sure that these numbers are being met.

What I've been trying to show, is that minCardinality statements cannot 
be used for this function.

maxCardinality is different, as it CAN be used for this function... but 
ONLY if the user has specified that each of the objects being used with 
a given subject-predicate are distinct.  It also has the useful 
property of providing entailment when the max cardinality is 1.

>>   It looks very much like these constraints were designed for a 
>> closed world assumption,
>
> Not really. It's just more common for cardinality constraints to be 
> used for "data validation", hence the surprise when encountering OWL.

So what you're saying is that OWL does not allow cardinality to be used 
for data validation?

Since minCardinality does not provide entailment, nor does it provide 
consistency checking, then what function does it serve?

>> not the open world.
>
> I fail to see how you couldn't run this in general. If you want or 
> expect closed world like reasoning from OWL, you will be disappointed 
> or surprised or both.

I've been told that other systems have gone with a "limited closed 
world reasoning".  The reason for this is to do the intuitive operation 
of counting the predicate usage and comparing to the ontology.
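
For comparison, the intuitive closed-world count might look like the 
sketch below.  It is only an illustration of the idea, not a 
description of how any particular system does it: the triple layout 
and function name are assumptions, and it treats different names as 
different things, which OWL itself does not.

```python
def closed_world_violations(triples, subject, predicate,
                            min_card=None, max_card=None):
    """Count the stated objects for a subject-predicate pair and compare
    the count with the cardinality bounds, treating the data as complete."""
    # Each distinct object name counts once (closed world, unique names).
    count = len({o for s, p, o in triples if s == subject and p == predicate})
    problems = []
    if min_card is not None and count < min_card:
        problems.append("below minCardinality")
    if max_card is not None and count > max_card:
        problems.append("above maxCardinality")
    return problems
```

Under this reading a subject with one stated object fails a 
minCardinality of 2 immediately, which is the data-validation result 
most users expect and exactly what the open world reading declines to 
conclude.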

> Number restrictions have a long and illustrious history in description 
> logics. I think it's fair to conclude that at least some people like 
> them the way they are (i.e., open world). They are definitely 
> surprising to a lot of people (of course, so is not having the unique 
> name assumption!!!), and they are just wrong in many cases (e.g., data 
> validation of various sorts). But nothing is particular to 
> cardinality here.

Well this is certainly consistent with what I've seen.  I'm just 
wondering if things really are what I perceive them to be.  I'm 
particularly concerned about owl:minCardinality, as it does not seem to 
have a practical application.

>> Can someone enlighten me here please?  TIA.
>
> I shall wait upon your reply to determine if I am he who can so 
> enlighten :)

Your emails have been useful so far.  I notice that I have another one, 
so I'll go and read it now...

Regards,
Paul Gearon

Received on Thursday, 7 April 2005 06:28:21 UTC