Re: Cardinality in the open world

Hi again,

I seem to be gradually converging on what you've been explaining.  You 
may not have picked up on it, but some of my past emails have started 
out with one point of view and ended on another as I've understood the 
flaws in what I'm trying to explain.

Anyway, on to the adversarial part of the email....   ;-)

On 10/04/2005, at 2:02 PM, Bijan Parsia wrote:
>> Actually, I'm not considering OWL DL at all.  At least, I'm trying 
>> not to be.  In places where there is a difference I'll be trying to 
>> discuss things in terms of OWL Full.
>
> But OWL Full only introduces irrelevant complications, for the 
> purposes of current discussion. For example, it really doesn't seem to 
> affect anything that OWL Full class expression syntax also has 
> assertional force. But it seems to have confused you.

This is frustrating.

I always understood that the usage of assertions in the syntax was not 
relevant to that part of the discussion.  I wasn't going to mention it 
all.  However, at the time I was trying to say that assertions did not 
appear in the expression.  The problem with that statement was that the 
syntax *did* use an assertion.  Since I didn't consider this assertion 
relevant (being for syntax) I decided to mention that I was aware of 
it, but that it was irrelevant to my argument.

I thought I'd mention that I was aware of it, so that I couldn't be 
accused of claiming that there was no assertion, when the syntax used 
one.  The fact that you think that I believed this assertion to be 
relevant in any way demonstrates my lack of clarity, and I apologise.

> [snip]
>>>> Since restrictions are used to describe the class,
>>>
>>> What class?
>>
>> The class that the restriction applies to
>
> The restriciton doesn't *apply to* a class. It *is* a class.

Yes, you're right, and I've even written magic-set code that works like 
this.

I guess I got temporarily confused, possibly I check that the 
restriction "applies to" the individual, and if it does then the 
individual meets the OWL DL "description".  (Is there a more correct 
word than "meets"?)

> More precisely, it's a formula with one free variable. To make an Abox 
> assertion, you replace that variable with a logical constant (i.e., a 
> name). To make a Tbox axiom, you put it on one side of a conditional, 
> where the other side has a formula with the same free variable, and 
> universally quantify the whole conditional on that variable.

Ah, this helps to explain for me how cardinality can be ABox, when I 
thought it was TBox only.  I think I see now.

>> .  As in: http://www.w3.org/TR/owl-ref/#Restriction
>> "It describes an anonymous class, namely a class of all individuals 
>> that satisfy the restriction"
>
> In general, I avoid expository text and stick to normative 
> specifications.

Yes, but for someone like me (who has been learning logic, kicking an 
screaming the whole way), the expository text is a godsend.  
Unfortunately, it is full of pitfalls, as interpretations like this are 
often incomplete in the details, and occasionally even incorrect.  But 
they still help.

>  But, still, clearly this defines the restriction *as* a class. It 
> applies *to individuals*.

Yup.  Mea culpa.  I knew better.

> [snip]
>>> So an individual can be of type restirction without an intervening 
>>> class definition.
>>
>> Sorry, I presumed that individuals could not have an anonymous class 
>> type.
>
> Even if they didn't, if you set up a definition (i.e., a single 
> definition with an atomic class on one side of a biconditional), the 
> effect would be the same.
>
> There's not a big logically difference. There's a practical difference 
> in that you can separate out the sentences (thereby changing the 
> meaning of the type assertion).

This makes me think of something else I've been doing...

I already mentioned that I've used intrinsic statements to track down 
descriptions for individuals.  To do this I used an predicate in my own 
namespace so it wouldn't look like rdf:type.  I didn't want to use 
rdf:type because descriptions are not quite classes, and I didn't want 
erroneous statements showing up when an individual's type was queried.

Looking at this discussion I'm starting to think that I should just use 
rdf:type for restrictions, since they *are* classes.

I also think that I should still use the internal predicate for union, 
intersection, complementOf and oneOf, since these are not classes.  
(This is based heavily on the expository text, with a little reference 
to the spec.  Since you can read the spec better than I, I thought it 
worth asking)

>> <snip/>
>>> Thus class expressions can be part of abox assertions.
>>>
>>> Class expressions are also part of tbox expressions. The tbox 
>>> consist of definitions and constraints. These are all universally 
>>> quantified conditionals.
>>
>> OK, I'm going to have to take some time to read up and learn exactly 
>> where the boundaries lie, so I won't go into it now.
>
> Well, for what it's worth, I've just told you exactly where the 
> boundaries lie :)

My concern was how you've said that class expressions can be part of 
ABox assertions, and that they are also a part of TBox expressions.  So 
when are they one, and when are they the other?

I understand that saying that something is an instance of a class is 
ABox.  Unfortunately, I thought that the definition of that class was 
TBox... though now I suppose I see that this just defines a structure, 
with no meaning to any of it (for instance, there's no way to know how 
any parts of that structure interacts with each other).  In that 
context I think I can see that cardinality simply becomes a part of the 
structural definition, and so in that context it would be ABox.

Is this right?

(This is one of those times when I start trying to respond with one 
idea, and end up elsewhere by virtue of trying to put my thoughts down 
in words.)   :-)

> More importantly, the boundaries are insignificant. It's not like a 
> C++ class definition or a database schema.

I didn't think of the boundaries as being terribly significant.  
Particularly since I could see that some of OWL was ABox, and yet I 
thought that the ontology was supposed to be purely TBox, so I figured 
the line between the two was getting a little blurred.  I brought them 
into the conversation to try and differentiate between things I wanted 
to look at and things that I didn't.  We ended up discussing what ABox 
and TBox meant, which was really a digression.  (as were several other 
things we've talked about)

However, I'm happy that we digressed like this, as it's pulled me up on 
several things I had wrong, or just hadn't thought about properly.  I'm 
still new to all of this (I'd never looked at logic until about 6 
months ago, and I hadn't accomplished much until the last 2 months, 
which might explain my apparent ignorance).

>>>>  but I was under the impression that the assertion of this property 
>>>> was just the RDF way of expressing a higher order concept.
>>>
>>> Er....No. OWL is first order all the way. Even owl full.
>>
>> Oh?  I thought it was 1.5?
>
> I don't know what you are talking about. It is certainly the case that 
> all variants of OWL are strictly first order. OWL Full has some second 
> order syntax, but the semantics are first order.

1.5 order logic was described to me as containing some, but not all, 
second order logic constructs.  So if the logic contains elements that 
are not first order, then it's 1.5.  I thought OWL Full met this 
definition.

I'm surprised that you say that OWL Full's semantics are all first 
order.  I thought that the ability to treat predicates (properties) as 
full classes moved it out of being purely first order?  Are you saying 
that there are limits to what this allows which prevent non-first order 
statements from being written?

>>>> Am I wrong in thinking that it is possible to express TBox 
>>>> information in RDF?
>>>
>>> Well, in OWL DL, the tbox *and* class expressions are *encoded* in 
>>> RDF (I wouldn't say they are *expressed*).
>>
>> This show up some of the problems we've had.
>
> Perhaps not as you expect.
>
>>  You obviously have a different semantic for these words in this 
>> context, whereas I have the same.
>
> Thus you conceal a distinction I think important. Instead of trying to 
> understand my distinction, you seize upon your lack of such a 
> distinction as evidence that I misunderstand you.

I may well be missing an important distinction.  However, what I 
*meant* was encoded, and I'd hoped that my context would elucidate this 
(you seemed to have worked it out, since you suggested the correct 
word).  Since I had no other meaning here, your concern at my missing 
the distinction would irrelevant to this particular statement.

You worked out what I was trying to say, and yet seem to think that my 
ignorance of another meaning to my chosen word indicates that I didn't 
know what I was trying to say.  You seem to be conflating my ignorance 
of what "expressed" could mean in this context with ignorance of what I 
was talking about.  That is why I believe there is a misunderstanding.

While we're here... what meaning do you attribute to "expressed" that 
means more than "encoded" in the above context?  As I said, I meant 
"encoded", so I'm curious on what you picked up on.

(BTW, I'd seen that word "conflate" before, but never looked it up 
until now.  It's perfectly apt.  Thank you.)   :-)

>>   I expect you've often seen me say the "wrong thing" when I've meant 
>> something different.
>
> I expect that the something different is also wrong. Why is this a 
> problem?

As I said, I meant "encoded", and you said "encoded".  If what I meant 
is wrong, then what you said is wrong.  :-)

In case you're also referring to me only talking about encoding TBox in 
RDF (whereas you said "tbox *and* class expressions") - my statement 
was not trying to deny any other type of expression, but rather because 
it seemed (at the time) that you were saying that a lot of things I 
took to be TBox were in fact ABox.  I was starting to wonder if you 
were saying there was no TBox at all, which was at odds with what I 
thought, hence the question was limited to TBox.

But the distinction being argued was about "expressed" vs. "encoded", 
in which case I don't think my meaning was wrong... just unclear 
(through the use of the wrong word).  See my frustration?

Anyway, as you pointed out, this is orthogonal....

>> I've persisted with the expression "useful" in the context of the 
>> tools I've hoped to use.  Many other OWL expressions have been more 
>> evident in how to go about using them.
>
> That seems generally, but not entirely, true. It's unclear if people 
> *do* know how to go about using other OWL expressions. The combination 
> of open world assumption and lack of unique name assumptions means you 
> have to say a lot more negative stuff to get various entailments 
> (including contradictions) out of OWL.

It was exactly this problem which brought me here in the first place.

I was told that OWL makes an open world assumption.  I then worked out 
that there is no unique name assumption, by looking at sameAs and 
differentFrom (I've since seen people mention this).  However, like 
many people I didn't think much of it.

I worked on several OWL constructs with no problems, but when I looked 
at cardinality I realised there was a problem with the meaning I'd been 
given.  At first I thought it was me, then I thought the spec had a 
problem.  You've made me see that the spec is perfectly sound, and that 
I'd recognised the right thing (I just hadn't thought it through), and 
now I realise that it is almost every one else I know who has the 
problem with cardinality.  :-)

>> I suppose I came into this confused, because there are a lot of 
>> people who expect that cardinality support (of some level) will 
>> involve comparing the cardinality values to the number of times a 
>> predicate gets used (for instances of the classes described).
>
> If by this you mean that cardinality does not *validate* the data in 
> your KB, and people expect that, you are correct. It's one of the 
> tough things to get past.

That has been the problem, yes.  Now I have to convince others of this 
as well.  :-(

>  The K or epistemic operator can help (as Jos acknowledged in his OWL 
> Flight paper). With the K operator, you can define the class of people 
> who have three *known* children. For something to be a member of that 
> class, there must be three distinct individuals in the kb in the has 
> child relation to that something. (roughly). We have a preliminary 
> implementation of the K operator for ALC (so no counting quantifiers 
> yet) in Pellet.

Thanks.  I'll try looking at this.

> [snip]
>> I'm sorry.  Reading it, I see that I was not clear.
>
> That's ok. If you understand this point and aren't backsliding, I'm 
> fine :)

I don't THINK that I am.  OTOH, my ideas have been modified by 
persistent inculcation here, so maybe I was being stupid earlier and 
can't remember it.  ;-)

> I'm pretty sure you are taking on the very tough challenge of trying 
> to combine OWL reasoning with large ABoxen held in secondary storage 
> based database.

You've been speaking to David Wood?  :-)

Yes, I'm trying to do this.  Fortunately, I don't HAVE to be a logic 
expert for the framework I'm building, but I need to know something and 
the more I know the better.  I need a decent set of rules, and I need 
to be able to map them into the operations available on the datastore.  
This is why I've been tracking down papers which have OWL entailments 
(and some consistency expressions) in Horn clauses and FOL.  Using 
these I can move ahead without having worked out any formulae for 
myself.  I already have the simple stuff going very well.

However, there seem to be a lot of valid entailments and consistency 
checks that are not described in the papers I've found.  This has often 
been due to an inability of the language (particularly Horn clauses).  
Since the query language I'm using has no such language restrictions 
(and they can be removed where they do exist), I've been trying to find 
other OWL entailments and consistency checks which are not expressed 
elsewhere.  That's what has led me here.

Ideally, someone else would do it for me.  But then again, what would I 
learn if they did?  :-)

> The current work in this area (that I know of) is DLDB from Leigh 
> (which I suspect is incomplete, but don't know), and InstanceStore 
> from U. of Manchester (which, currently, is limited to role-free 
> aboxes).

Thanks.  I'd heard of InstanceStore (though not looked into it yet), 
but not DLDB.

> Yes. This is precisely the problem Jos and I mentioned earlier; me as 
> an education problem, jos more as a reason to adopt something else.
>
> There is a bit of a dogma that the Web *requires* the open world 
> assumption because of the "open nature of the web". I think that 
> particular argument is ubercrap, though I've heard more sophisticated 
> ones. However, I think having the open world assumption is useful and 
> fun in many circumstances.

You both have a point.  minCardinality is exactly right as it is, so 
education would seem the way to go.  Only there are so many people who 
don't see it (starting with myself) that you'd have to start wondering 
if the right approach has been taken in the first place.  Maybe there 
is a better way.

The problem with adopting something else, is that I can't see what it 
might be.  One problem with the understanding of minimum cardinality 
comes about because of the open world assumption, and non-unique names. 
  However, I think both are reasonable things to have.

BTW, the "open nature of the web" *does* sound like a buzzword phrase.  
However, people are forever putting more info onto the web (ideally it 
would be consistent with what's already there, but I'll acknowledge 
that this is a "pie in the sky" goal), so the open world assumption 
makes a lot of sense to me.

The other view is that at any one point in time, the web really is 
limited.  It would be easy to call that our "closed world".  From the 
point of view of a datastore, things get a LOT easier!

>>>> I started out with an expectation that I could use minCardinality 
>>>> for consistency checking using some kind of counting mechanism.  
>>>> I've now come to understand that this was wrong.
>>>
>>> You can, but not like you thought. This is in fact how reasoners 
>>> work (for some value of "some kind of counting mechanism).
>>
>> This is interesting.  Can you give an example please?  The only place 
>> where I can think this could happen would be if the range of a 
>> cardinality restricted predicate were finite and countable (eg. 
>> oneOf).  Is this the sort of thing you mean, or have I missed the 
>> point again?

<snip_example/>

> Whoops! We have too many ps for x! So we try to *merge* anon1 and 
> anon2. But any way we do it, we'll end up with a clash. So no dice.

I followed all your reasoning (that was all in the stuff I snipped), 
and it makes sense for a lot of constructs, only I don't really see 
where min cardinality could enter that picture.  In this example you 
broke (technical word there, I know) counting max cardinality by 
demonstrating there were more unique statements than specified.

I believe that the only way to break counting on min cardinality is if 
you can demonstrate there can be no more statements with the predicate. 
  You said, "When you have mins you have to generate the appropriate 
number of successors", but do you have to generate them?  I thought 
that the open world would usually protect you from this?

I said "usually" because restricted classes such as classes of "oneOf" 
can create a limit to generating statements, and I think I've 
specifically mentioned this before.

I'm not saying that you're wrong, but I don't really see counting for 
minCardinality working with this kind of example.

> Of course, practical reasoners try to detect such things before 
> generating too many nodes (or nondeterministically merging them). If 
> you see a min and a max on the same node where the max is less than 
> the min...well, clash there partner!

If only it were all that easy.

>> That's a very leading comment,
>
> I think you mean "dismissive" :)

Both.  It led me to a brick wall.  :-)

> Well, my favorite is that ParentsWithAtLeastThreeChildren is entailed 
> to be a subclass of ParentsWithAtLeastTwoChildren (defined in the 
> obvious way)..

Good example.  Thanks.

<snip/>
> Oh, I see where we've parted here. I'm just warning you that there are 
> other places where the open world assumption might clash with many 
> people's expectations.
>
> Disjunctions are one (something that is A or B isn't necessarily known 
> to be A or known to be B).
>
> that the members of an owl oneOf aren't all distinct is another.
>
> People often expect that differently named things are distinct by 
> default, unless you asset they are the same.

Well at least I've picked up on these already.  Maybe I should compile 
a list.

> Lists...oy, lists. There's so much badness there! :)

At so many levels.  Forgetting logic for a moment, implementing these 
things in datastores is just painful.  It's like they didn't *want* 
them to be easy to use or efficient.  :-)

<snip/>
> Raphael, Ben, Ian, and others have workd on Description Logic 
> Programs, the datalog friendly subset of OWL (Jos's paper on OWL 
> Flight builds on this work). That's a very tiny bit of OWL.

I know.  If it were just this, then I'd be nearly done by now.  
Actually, now that I think about it.... I *would* be done by now.  
Datalog has been relatively simple to map to the query language.  FOL 
is harder, and I don't have a complete set of formulae to work with 
yet, so I know there are difficulties I've yet to experience.

> Remember that datalog's decision procedure is polynominal whereas OWL 
> DL's is NEXPTIME and OWL Full is doesn't *have* a decision procedure 
> (i.e., it's only semi-decidable).

True, but languages like Java can create an infinite loop, yet it's 
still useful.  In other words, just because it's possible to create 
something that is very inefficient (or impossible) to resolve in the 
system, it doesn't mean that the system isn't worth building.  I let 
Gödel be my guide here.  :-)

(OK, my supervisor had a hand in this as well)

> You ain't going to shoehorn the latter into the former without a 
> trim...with a sledgehammer :)

Wouldn't an axe be better?  ;-)

> Er...Mr. Term Pedant time again. Sets of sentences are usually what's 
> called "consistent". You mean "sensible"?

Yup.

>> but don't appear to be deductive.  That bothered me until I learnt 
>> about abduction.
>
> Well, hmm. Entailment usually *is* a deductive notion (and semantic, 
> i.e., P entails C iff C is true in all interpretations P is true). 
> But, be that as it may.

Maybe I had it wrong.  It's been a few months since I last looked.  I 
just remember looking at conclusions which matched the premises, but 
which contained new entities that seemed to come from nowhere.

I went back looking for one, and ended up at one of the entailment 
tests:
http://www.w3.org/TR/owl-test/byFunction#Class-006

Now that I'm more used to OWL I'm happier with this... though I never 
thought I'd be entailing so many statements just from declaring a 
Thing.  Could get expensive.

I can actually see this one following from the premises.  I thought I 
had seen one that didn't seem to follow, but was consistent.  Perhaps 
it was just due to faulty understanding at the time.

(I don't expect to find errors in these documents, so I'd rather 
believe that I was wrong).

>> Still, I'm looking at deductive systems, so abductive entailments 
>> seem beyond me for the moment.
> [snip]
>
> They are hairy.

I've been given some papers on abductive reasoning.  It seems to me 
that you need a random number generator, and some way of mapping the 
output into a statement, which you then check for consistency.  (OK, 
I'm being cruel to the author of that paper now.  Hope he doesn't read 
this list)  :-)

Regards,
Paul Gearon

Received on Sunday, 10 April 2005 13:59:17 UTC