Re: Proposal to incorporate datatyping into the model theory (was Re: datatyping discussion) from Graham Klyne on 2001-10-24 (w3c-rdfcore-wg@w3.org from October 2001)

From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
Date: Wed, 24 Oct 2001 17:23:31 +0100
To: Pat Hayes <phayes@ai.uwf.edu>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <5.1.0.14.2.20011024122406.00a0daf0@joy.songbird.com>
At 09:10 PM 10/23/01 -0500, Pat Hayes wrote:

>>    ex:shoe shoe:size "10" .

[...]

>>Now, if I have knowledge (ex-RDF) that
>>     _:a shoe:size "10" .
>>and
>>     _:a shoe:size "10.0" .
>>are equivalent statements, then I can infer from the truth of the above 
>>example that
>>    ex:shoe shoe:size "10.0" .
>>is also true, without knowing that the literals used have a numeric data 
>>type.
>
>If you really do know that equivalence holds for *any* _:a, but that's a 
>universal quantification. How you gonna say that in RDF?

Er, I'm not.  I did say "ex-RDF".

>But let me provide a tweak. Suppose that the equivalence between 10 and 
>10.0 is provided only in the datatype mapping itself. A datatype is 
>defined semantically by a mapping from lexical space to value space, so it 
>is quite possible for that mapping to make this kind of identification. In 
>the model theory we currently have, the inference you want (from   ex:shoe 
>shoe:size "10" .  to   ex:shoe shoe:size "10.0" .) would be valid only if 
>something provided enough typing information to enable to reasoner to know 
>that both these literal occurrences were numbers, say. BUt even then, a 
>*reasoner* would have to somehow be able to make the inference by 
>accessing information about the datatype mapping, and the MT doesn't say 
>anything about HOW to do that. It just says the inference is 
>'datatype-entailed'.

Yes, I think all this can work with knowledge of datatypes (which I think 
is also "ex-RDF" in that it's not fully describable "in" RDF (as currently 
understood).

My niggle was that implementations would need full knowledge of datatyping 
to use this kind of information, which might be a large swallow for some 
implementations...

>>You mentioned in another message the idea of a "default" datatype, in 
>>which literal strings denote themselves;  maybe the above issue is 
>>resolved by requiring that this default datatype be present in every 
>>interpretation?  Or:  any RDF that is true in some datatyped 
>>interpretation is also true in a corresponding interpretation with 
>>default datatyping.
>
>Thats not true, but what IS true is that any datatyping can be 'added' to 
>the default in a certain sense. In the string-default (it only works for 
>strings) the datatyping of a character string is the character string, ie 
>the datatyping is the identity mapping. So eg
>   ex:shoe shoe:size "10" .
>says that the size of the shoe is the character string '10'. Which isn't 
>actually right, presumably;

... I suppose not.  I was trying to chase down the idea that it meant the 
value was something that could be represented by the string.  I see now 
that default identity interpretation idea doesn't work.  And all roads seem 
to lead back to some form of datatyping in the sense of a data type that 
sets limits on the possible things denoted by a literal.

The reason that I've latched on to this shoe size example is that, for many 
purposes, the literal value is all one needs to know about this 
property.  Crudely, I know that a show size of "11" fits me, and other shoe 
sizes do not.  I can also recognize that "11.0" may be an alternative name 
for the size that fits me.  I may also know that size "12" is too large, 
and size "10" is too small.  But beyond this, I have no idea what the 
number actually means, and for me to reason about shoe sizes I don't need 
to know this.

Thus, while the shoe size may nominally be a numeric literal, I don't need 
a full understanding of numeric values to make basic decisions about my 
shoe size.

>but what is right is what we would get by reinterpreting I(shoe:size) to 
>mean 'the size of the shoe is something denoted by the string', (ie rather 
>than the string itself) which is really the bNode version of the literal 
>(Sergey's S3). So what is true is that if a graph has a satisfying 
>interpretation I under some datatyping T , then it has a closely related 
>satisfying interpetation I' under the default datatyping, such that if you 
>kind of semantically compose T  with I' on these literal-anonymous nodes, 
>then you get the original I+T back again. So one can think of S3 as this 
>kind of 'interim' datatyping where you insist on interpreting literals as 
>themselves, and treat the places where they occur as saying that something 
>exists with that name. It provides a kind of general-purpose 
>put-the-typing-decision-off strategy.

Yes.  After writing the above, I think this is very close to what I'm 
trying to describe.

A tangential question:  in model-theoretic terms, is the datatyping system 
T to be considered as  a part of an interpretation, or something that is 
separate from an interpretation so that interpretation and datatyping 
system are both needed to determine the truth of a graph?  My current sense 
is that I+T should be regarded as an interpretation, so that all the 
language about being true in an interpretation and entailment still works.

>I realize this is incoherent, but will try to write it up coherently 
>before Friday.

That will be interesting.

#g


------------------------------------------------------------
Graham Klyne                    MIMEsweeper Group
Strategic Research              <http://www.mimesweeper.com>
<Graham.Klyne@MIMEsweeper.com>
------------------------------------------------------------
Received on Wednesday, 24 October 2001 13:07:43 UTC