Re: Clarifying what datatype entailment support means (Re: xsd:float and xsd:decimal) from pat hayes on 2002-12-01 (w3c-rdfcore-wg@w3.org from December 2002)

From: pat hayes <phayes@ai.uwf.edu>
Date: Sun, 1 Dec 2002 00:07:34 -0600
To: "Patrick Stickler" <patrick.stickler@nokia.com>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05111b03ba0f4c87b270@[10.0.100.247]>
>[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, 
>patrick.stickler@nokia.com]
>
>
>----- Original Message -----
>From: "ext Jan Grant" <Jan.Grant@bristol.ac.uk>
>To: "Brian McBride" <bwm@hplb.hpl.hp.com>
>Cc: "Patrick Stickler" <patrick.stickler@nokia.com>; "ext Jeremy 
>Carroll" <jjc@hpl.hp.com>; "w3c-rdfcore-wg" <w3c-rdfcore-wg@w3.org>
>Sent: 28 November, 2002 17:31
>Subject: Re: xsd:float and xsd:decimal
>
>
>>  On Thu, 28 Nov 2002, Brian McBride wrote:
>>
>>  > >I'm fine with this as long as it is clear (somewhere) that
>>  > >datatype entailments involving equality of values between
>>  > >different datatypes are based on the definitions of the
>>  > >datatypes themselves, and if the relationships between
>>  > >the datatypes are not part of the formal definitions of
>>  > >the datatypes, then the entailments cannot be determined.
>>  > >
>>  > >I.e. we need to be clear about the basis for the entailments
>>  > >and not work solely on the basis of human intuition.
>>  >
>>  > Patrick,
>>  >
>>  > May I test my understanding of what you mean here.  I offer two datatype
>>  > definitions and an entailment.
>>
>>  Patrick's concerns come down to this, I think: what _machinery_ is
>>  mandated for determining a RDF-D entailment?
>
>Yes, this is the crux of my concerns.
>
>>  Do we say, we expect
>>  reasoners to only copy when given subClassOf relationships between
>>  datatypes?
>>
>>  That is a limited form of reasoning and would not catch other
>>  entailments such as the one you give in your example.
>>
>>  I think the definition I proposed for the DT entailment test cases, that
>>
>>  "supporting datatypes X, Y, Z, ..."
>>
>>  means
>>
>>  "For two DT literals, each with a type from {X, Y, Z, ...},
>>  you can determine whether they denote the same value"
>>
>>  (determining the value itself is not required)
>>
>>  is a reasonable one.
>
>OK, if that is explicitly defined for all of the datatype entailment
>tests, then I think that is OK insofar as the tests are concerned.
>
>Though there still is the issue of implied expectations from applications
>that provide some form of datatyping support, and I'm concerned that the
>above definition will be percieved as the expected minimal level of
>datatyping support for all RDF applications that "do" datatyping, which
>IMO is not a reasonable *minimal* level of expected DT support.
>
>I see three meaningful levels of datatyping support:
>
>Level 1: Minimal (each datatype dealt with in isolation)
>
>    For two DT literals of the same DT, where the DT is supported,
>    it can be determined whether they denote the same value, and if
>    the DT is ordered, what their ordering relationship is.

What RDF entailments are affected by ordering information? See below 
for a suggestion.

>
>Level 2: Segmented (relations between some datatypes supported)
>
>    For two DT literals of different datatypes where both datatypes
>    are supported and their is a known intersection of the value
>    spaces of the two datatypes, it can be determined whether
>    they denote the same value, and if the DTs are like ordered, what
>    their ordering relationship is.
>
>Level 3: Full (relations between all supported datatypes supported)
>
>    (same as level 2, but all possible valid relations between all
>     supported datatypes are supported)

And datatype classes. RDF needs to be told explicitly that one space 
is a subset of another if they are infinite; it can't be reduced to 
some set of particular dtype-value equalities.

BUt I agree about those levels, see below.

>
>An example of where one may support individual datatypes but not
>(potentially valid) relationships between other datatypes could
>be an engine that provides support for both XML Schema datatypes
>and UAProf datatypes, where in practice, these are used in a disjunct
>fashion, therefore, no relationship is defined between e.g. xsd:integer
>and uap:Number (even though there could be).
>
>Now, later, the engine could provide support for comparing
>xsd:integer values and uap:Number values, but providing support
>for *both* those datatypes does not necessarily mean it provides
>support (or should provide support) for comparision *between* those
>two datatypes.
>
>I.e., an application providing minimal datatyping support for
>two datatypes which theoretically are related is not deviant or
>broken, etc. if it does not support comparisons between the
>two datatypes.
>
>And one cannot presume some ad-hoc equality between datatype values
>just based on the native internal representation used (eg. both are stored
>as Java double values) because that is no guaruntee that they are the
>same value -- e.g. one can store xsd:boolean values and xsd:integer
>values internally as Java double values , but that does not mean
>that "1"^^xsd:boolean == "1"^^xsd:integer, eh?

Right.

>
>We cannot, and should not, say that applications are required, or
>even expected, to support datatype entailments at any level, much
>less at the full level (level 3 above).
>
>*BUT* we should be clear about the level of support presumed when
>we are defining datatype entailment tests in the test cases, *AND*
>be clear that the specified level of support is a requirement for
>the test case, but not for every RDF application dealing with
>datatypes. I think that the kernel of this is there in Jan's proposed
>verbage, but simply needs to finessed a bit.
>
>I suggest we adopt and utilize something akin to the above three
>levels of datatyping support, in a non-normative but consistent
>fashion, and qualify our datatyping test cases clearly in those
>terms.

I'd suggest the following levels, or distinctions, which correspond 
roughly to yours.

1. Local intra-type information: for each datatype d, all true 
equations of the form
Lit-1^^d = Lit-2^^d

2. Local inter-type information: As 1, but in addition all true 
equations of the form
Lit-1^^d1 = Lit-2^^d2 for d1, d2 in the set of datatypes

3. Global inter-type information: As 2., but in addition all subset 
relationships between value spaces of distinct datatypes in the set.

In addition, any of these can also supply RDF properties and 
undertake to provide extensions for those properties in the form of 
truthvalues for (some or all) triples of the form

Lit-1^^d1  <property>  Lit-2^^d2

(I know that isnt legal RDF :-) or limited to a single datatype in 
the first case, of course. This allows ordering information and 
things like arithmetic operations on integers, etc., to be provided, 
as long as the Dtype supplies a vocabulary to talk about them in. 
Then a sufficiently savvy RDF engine can draw conclusions like 
deciding that

_:x xsd:lessThan "10"^^xsd:integer .
_:x xsd:moreThan "12"^^xsd:integer .
_:x rdf:type xsd:integer .

is an XSD datatype contradiction.

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Tuesday, 3 December 2002 10:58:52 UTC