RE: SEM: 5.1 literal/data values from Smith, Michael K on 2002-07-19 (www-webont-wg@w3.org from July 2002)

From: Smith, Michael K <michael.smith@eds.com>
Date: Thu, 18 Jul 2002 19:03:52 -0500
To: Ian Horrocks <horrocks@cs.man.ac.uk>
Cc: webont <www-webont-wg@w3.org>
Message-ID: <B8E84F4D9F65D411803500508BE322140F36EC25@USPLM207>
I got lost somewhere in this discussion of literal/datavalues and
UnambiguousProperty (or InverseFunctionalProperty).  Its pretty clear
why we can't have symmetric or transitive Datatype properties.  This is
more a type restriction than anything else (properties can only have
resources in their domains). But I am less clear on InverseFunctional.

Re Ian's example:

> Unfortunately, the interaction of UnambiguousProperty and datatypes
> makes this problematical. Imagine, for example, that a datatype
> consisting of integers in the range 0-999 is used as a unique-id/key
> for instances of the class Person such that all persons have exactly
> one unique-id, all unique-ids are integers in the range 0-999, and
> unique-id is an UnambiguousProperty.  In order to function correctly,
> a reasoner is now required to understand the properties of datatypes,
> e.g., that the cardinality of this particular datatype is 1,000, and
> that as a result no model can contain more than 1,000 instances of
> Person (note that this would not be the case if unique-ids were reals
> in the range 0-999).

I'm not sure what is meant by "function correctly".  Soundly?  

How is this different from the following (which is permitted):

Imagine, for example, that a datatype consisting of integers in the
range 0-0 is used as a idNumber for instances of the class Person
such that all persons have exactly one idNumber (idNumber is
Functional) and all idNumbers are integers in the range 0-0.  In
order to function correctly, a reasoner is now required to understand
the properties of datatypes, e.g., that the cardinality of this
particular datatype is 1, and that as a result all persons have the
same idNumber.  In particular, asserting that the idNumber of 
Mike is different from the idNumber of Bob is a contradiction.

Are these two examples different?  What am I missing?  Is there 
some unsound conclusion that arises in Ian's case and not in the 
one above?

I do not expect OWL systems to be able to reason about integers, XML
Schema dataype cardinality, the length of strings, and many other XML
Schema Datatype properties.  

In fact, I am kind of surprised that it appears that OWL reasoners are
going to need to be able to recognize equality of datatype values (?).
E.g. recognize that _a: and _b: below are equal (using one version of
the proposed RDF syntax for datatype literals)

      _:a  <xsdr:decimal> "10" .
      _:b  <xsdr:integer> "10" .

I had earlier assumed that at best we would treat datatype value equality
according to some trivial surface syntax canonicalization. (Which I
guess could be made to work for this example.)

- Mike

-----Original Message-----
From: Ian Horrocks [mailto:horrocks@cs.man.ac.uk]
Sent: Thursday, July 18, 2002 9:37 AM
To: Jeremy Carroll
Cc: webont
Subject: Re: SEM: 5.1 literal/data values was RE: ADMIN: Minutes of
telecon 11th July 2002


On July 16, Jeremy Carroll writes:
> 
> 
> I was and remain underwhelmed by the discussion.
> 
> Ian:
> > Can I point out that I already did this once! See the thread beginning
> > with:
> >
> > http://lists.w3.org/Archives/Public/www-webont-wg/2002Mar/0241.html
> >
> > and in particular:
> >
> > http://lists.w3.org/Archives/Public/www-webont-wg/2002Apr/0354.html
> > http://lists.w3.org/Archives/Public/www-webont-wg/2002May/0044.html
> >
> 
> 
> To recap ...
> 
> Ian:
> > > Unfortunately, the interaction of UnambiguousProperty and datatypes
> > > makes this problematical. Imagine, for example, that a datatype
> > > consisting of integers in the range 0-999 is used as a unique-id/key
> > > for instances of the class Person such that all persons have exactly
> > > one unique-id, all unique-ids are integers in the range 0-999, and
> > > unique-id is an UnambiguousProperty.  In order to function correctly,
> > > a reasoner is now required to understand the properties of datatypes,
> > > e.g., that the cardinality of this particular datatype is 1,000, and
> > > that as a result no model can contain more than 1,000 instances of
> > > Person (note that this would not be the case if unique-ids were reals
> > > in the range 0-999).
> >
> Jeremy:
> > Can't I set up this example without any recourse to datatypes at all.
> >
> > e.g.
> >
> > I form one thousand URIs.
> > For each one I form the class: oneOf[x] (giving me 1000 classes).
> >
> > I form the disjoint union of these 1000 classes, giving me oneOf 1000
> > distinct URIs.
> >
> Ian:
> > You are missing the crucial point that the reasoner has to know about
> > the properties of the datatype (i.e., that there are exactly 1,000
> > integers in the range 0-999, neither more nor less) and may also need
> > to consider complex interactions resulting from the predicates defined
> > on the datatypes. This is what makes the formal properties unclear and
> > could lead to decidability problems.
> 
> 
> To give an example of a problematic datatype, in XML Schema we could
construct a
> datatype consisting of all two character strings. Then we could try naming
a few
> billion individuals with this set.
> 
> This datatype is finite. Unfortunately the exact number interacts
unfortunately
> with Unicode definition, and with issues such as whether or not unicode
strings
> that are not in normal form C are permitted. The definition of Unicode
changes
> over time.
> 
> So, Ian is right to indicate that permitting this would mean that creating
a
> complete reasoner would become an unreasonable task.
> 
> However, there is a tail and there is a dog.
> 
> For me the dog is reasoning about the things people really want to say,
and the
> tail is the logical and computational properties analysis that Ian and
Peter are
> so good at. I would be quite happy to permit unambiguous datatype
properties and
> indicate that implementations are not expected to resolve the cardinality
> issues.

Of course if the dog is "things people really want to say", then we
would really need a MUCH more powerful language than OWL. I assume
that our work is based on the notion of finding a language that is
powerful enough to say some/most of what some/most people want to say
some/most of the time, and yet is simple enough to be in some sense
"implementable". Given that our high level objective is to support the
development of "applications that depend on an understanding of
logical content", I take it to be important that applications will be
able to understand OWL, i.e., that we will be able to reason with it
(I'm not sure how else to interpret "understanding").

> I am moderately in favour of the rigour that the DL community bring to our
> discussion. However, I would like to see the concern about completeness
being,
> we must correctly document where an OWL reasoner is not expected to be
complete,
> as opposed to, OWL must be implementable in a sound and complete fashion.

Unfortunately, it is notoriously difficult to build incomplete
reasoners where the exact nature of the incompleteness can be
precisely characterised/documented. This would be doubly true in this
case where we don't know how to build a complete reasoner, so we can't
characterise the incompleteness w.r.t. a complete algorithm, nor can
we build a reference implementation against which others can be
tested and compared.

Ian



> 
> Jeremy
> 
> 
>
Received on Thursday, 18 July 2002 20:04:00 UTC