[moved from public-webont-comments] RE: unsupported datatypes from Ian Horrocks on 2003-07-17 (www-rdf-logic@w3.org from July 2003)

From: Ian Horrocks <horrocks@cs.man.ac.uk>
Date: Thu, 17 Jul 2003 14:21:13 +0100
To: "Gary Ng" <Gary.Ng@networkinference.com>
Cc: <www-rdf-logic@w3.org>
Message-ID: <16150.41673.511856.668999@merlin.horrocks.net>
On July 15, Gary Ng writes:
> 
> 
> 
> > -----Original Message-----
> > From: Ian Horrocks
> > Sent: 15 July 2003 18:02
> > To: Gary Ng
> > Cc: public-webont-comments@w3.org
> > Subject: Re: unsupported datatypes
> > 
> > On July 15, Gary Ng writes:
> > >
> > >
> > > Another question, this time about unsupported datatypes.
> > >
> > > In the reference doc, it says:
> > >
> > > "For unsupported datatypes, lexically identical literals should be
> > > considered equal, whereas lexically different literals would not be
> > > known to be either equal or unequal. Unrecognized datatypes should be
> > > treated in the same way as unsupported datatypes."
> > >
> > > The first half of the sentence would suggest to treat a literal of
> > > unknown type as just a string. However, I am not entirely sure what is
> > > expected from a reasoner with respect to the behaviour of "would not be
> > > known to be either equal or unequal".
> > 
> > Unknown or unrecognised datatypes are treated as being the lexical
> > form (a string) of some unknown datatype. It is obviously the case
> > that, whatever the datatype, identical lexical forms map to the same
> > element of the value space, and can thus be considered equal. For
> > non-identical lexical forms, however, it *cannot* be assumed that they
> > do not map to the same element of the value space and are thus
> > unequal.
> > 
> > E.g., the lexical forms "1.0" and "01.00" would map to the same value
> > (and thus be considered equal) in some datatypes (e.g., decimal), but
> > not in others (e.g., string).
> > 
> Yes, I got that. 
> 
> But from a practical point of view of handling values from an
> unsupported datatype within a reasoning tool, this sounds like I can't
> even implement them as strings, because two different strings would
> be considered unequal.

This is true, and seems perfectly reasonable to me - treating
unsupported datatypes as strings means that you are making a strong
and unwarranted assumption about inequality, i.e., that it is exactly
equivalent to string inequality. This obviously isn't true for all
datatypes, and would result in nonmonotonicity of inference as
datatype support is extended. E.g., treating unsupported datatypes as
strings, if weight is a functional datatype property, then an
individual joe with decimal values of "200" and "200.0" for their
weight would be inconsistent if decimal is not supported and
consistent if it is. Using the OWL semantics, joe would be consistent
in both cases.
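
A minimal sketch of the joe example, in Python, purely for illustration.
The function name, the `supported` registry and the `strings_fallback`
flag are assumptions of this sketch, not part of OWL or of any reasoner.

```python
from decimal import Decimal

def functional_values_consistent(lexical_values, datatype, supported,
                                 strings_fallback):
    """Can a functional property hold all of these lexical values at once?

    `supported` maps datatype names to parsers (hypothetical registry).
    `strings_fallback=True` models the string treatment of unsupported
    datatypes; False models the OWL treatment.
    """
    distinct = set(lexical_values)
    if len(distinct) <= 1:
        return True  # identical lexical forms always denote the same value
    parse = supported.get(datatype)
    if parse is not None:
        # Supported datatype: compare in the value space.
        return len({parse(v) for v in distinct}) == 1
    if strings_fallback:
        return False  # distinct strings are assumed to be distinct values
    return True  # distinct forms may still denote the same value

weights = ["200", "200.0"]
no_support = {}
with_decimal = {"decimal": Decimal}

# String treatment: the answer flips when decimal support is added.
strings_before = functional_values_consistent(weights, "decimal", no_support, True)
strings_after = functional_values_consistent(weights, "decimal", with_decimal, True)

# OWL treatment: the answer is the same before and after.
owl_before = functional_values_consistent(weights, "decimal", no_support, False)
owl_after = functional_values_consistent(weights, "decimal", with_decimal, False)
```

Under these assumptions the string treatment reports joe as inconsistent
until decimal is supported, while the OWL treatment reports joe as
consistent in both cases.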

> So the question is, how should I implement them? 

Treat values of unsupported datatypes similarly to individuals, i.e.,
the same name denotes the same individual, and different names may or
may not denote the same individual (unless they are explicitly declared
distinct with a DifferentIndividuals axiom).

In a tableaux-style reasoner there will typically be a datatype
"oracle" whose job is to determine if a given set of datatypes and
lexical values have an intersection (i.e., that there is at least one
value in LV that interprets all of the lexical values and is an
element of the interpretations of all of the datatypes). For supported
datatypes the oracle can use the relevant theory of
(in)equality. Unsupported datatypes/values would be treated as
arbitrary subsets/elements of LV.
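
As a rough sketch of the comparison step such an oracle might perform
for two literals of the same datatype (Python, illustrative only): the
`Verdict` enum, the `SUPPORTED` registry and the `declared_different`
argument are hypothetical names introduced here. `declared_different`
simply mirrors the analogy with DifferentIndividuals and is not an OWL
construct for data values.

```python
from decimal import Decimal
from enum import Enum

class Verdict(Enum):
    EQUAL = "equal"
    UNEQUAL = "unequal"
    UNKNOWN = "unknown"

# Hypothetical registry: datatype name -> lexical-to-value mapping.
SUPPORTED = {"decimal": Decimal, "string": str}

def compare(datatype, lex_a, lex_b, declared_different=frozenset()):
    """Compare two lexical forms of the same datatype."""
    if lex_a == lex_b:
        # Identical lexical forms map to the same value in any datatype.
        return Verdict.EQUAL
    parse = SUPPORTED.get(datatype)
    if parse is not None:
        # Supported: use the datatype's own theory of (in)equality.
        same = parse(lex_a) == parse(lex_b)
        return Verdict.EQUAL if same else Verdict.UNEQUAL
    if frozenset((lex_a, lex_b)) in declared_different:
        # Unsupported, but explicitly stated to be distinct.
        return Verdict.UNEQUAL
    # Unsupported and nothing stated: neither provable.
    return Verdict.UNKNOWN

r_decimal = compare("decimal", "1.0", "01.00")
r_string = compare("string", "1.0", "01.00")
r_unsupported = compare("someUnsupportedType", "XYZ", "ABC")
```

Here "1.0" and "01.00" come out equal as decimals and unequal as
strings, while XYZ and ABC of the unsupported type come out unknown.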


> 
> Consider the following:
> 
> <Measurement rdf:ID="a_measurement">
> 	<hasAValueOf rdf:datatype="someUnsupportedType">XYZ</hasAValueOf>
> </Measurement>
> 
> <Measurement rdf:ID="b_measurement">
> 	<hasAValueOf rdf:datatype="someUnsupportedType">ABC</hasAValueOf>
> </Measurement>
> 
> By the definition, "XYZ" and "ABC" are neither known to be equal nor
> known to be unequal. So what should be the answer to the following
> question?

We are unable to *prove* that XYZ and ABC are either equal or unequal.


> 
> Retrieve all instances of (complementOf(exists hasAValueOf XYZ))
> 
> Because we cannot *prove* either XYZ = ABC or XYZ != ABC, the answer
> would be empty. Am I correct?

Yes.

> 
> If I am correct, then this behaviour is the same as if XYZ and ABC are
> classes/instances. So really we can't implement values from unsupported
> datatypes as strings.
> 
> Correct?

Correct. As I pointed out above, to do so would be to make unwarranted
assumptions that would have undesirable consequences. E.g., in your
example, if we substitute decimal 1 and 1.0 for XYZ and ABC
respectively, and the unsupported datatype is decimal, then treating
the values as strings would result in b_measurement being in the
answer. This is clearly incorrect, and would cease to be the case if
support for decimal were added.
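
The substitution can be sketched as follows (Python, illustrative only).
The sketch assumes that each individual has exactly the one asserted
value for hasAValueOf, so that an individual belongs to the complement
as soon as its value is provably unequal to the target. That assumption
and all function names are introduced here for illustration.

```python
from decimal import Decimal

def complement_of_has_value(assertions, target, provably_unequal):
    """Individuals whose single asserted value is provably not `target`."""
    return sorted(ind for ind, lex in assertions.items()
                  if provably_unequal(lex, target))

def as_strings(a, b):
    # String treatment: different strings are taken to be different values.
    return a != b

def as_unsupported(a, b):
    # OWL treatment of an unsupported datatype: inequality is never provable.
    return False

def as_decimal(a, b):
    # Decimal supported: compare in the value space.
    return Decimal(a) != Decimal(b)

assertions = {"a_measurement": "1", "b_measurement": "1.0"}

answer_strings = complement_of_has_value(assertions, "1", as_strings)
answer_unsupported = complement_of_has_value(assertions, "1", as_unsupported)
answer_decimal = complement_of_has_value(assertions, "1", as_decimal)
```

Only the string treatment puts b_measurement in the answer. The
unsupported and the decimal treatments both return an empty answer, so
adding decimal support does not retract anything.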

Ian


> 
> G
> 
>
Received on Thursday, 17 July 2003 09:22:17 UTC