RE: datatypes and MT from Pat Hayes on 2001-11-14 (w3c-rdfcore-wg@w3.org from November 2001)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Tue, 13 Nov 2001 20:31:47 -0600
To: Patrick.Stickler@nokia.com
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05101048b81783a23b34@[65.212.118.147]>
>  > -----Original Message-----
>>  From: ext Pat Hayes [mailto:phayes@ai.uwf.edu]
>>  Sent: 13 November, 2001 19:01
>>  To: Stickler Patrick (NRC/Tampere)
>>  Cc: w3c-rdfcore-wg@w3.org
>>  Subject: RE: datatypes and MT
>>
>>
>>  >  > -----Original Message-----
>>  >>  From: ext Pat Hayes [mailto:phayes@ai.uwf.edu]
>>  >>  Sent: 07 November, 2001 23:14
>>  >>  To: Stickler Patrick (NRC/Tampere)
>>  >>  Cc: Brian McBride; w3c-rdfcore-wg@w3.org
>>  >>  Subject: RE: datatypes and MT
>>  >>
>>  >>
>>  >>  >  > Pat shoeSize "10"
>>  >>  >>
>>  >>  >>  then I am saying that my shoe size is a character string, right?
>>  >>  >>
>>  >>  >>   From here on I will omit the double quotes to avoid confusion.
>>  >>  >
>>  >>  >Do you mean then
>>  >>  >
>>  >>  >   Pat shoeSize 10
>>  >>  >
>>  >>  >???
>>  >>  >
>>  >>  >Such that 10 is something other than a string/literal?
>>  >>
>>  >>  The label in the triple itself is a string, of course, just like all
>>  >>  the other labels. Urirefs are strings, too.  So '10' is a string
>>  >>  (which is also a numeral).  On the other hand, 10 is a number, as in
>>  >>  '10=6+4'. The question is whether the literal label *denotes* a
>>  >>  string or not. I want to be able to write a numeral and have it
>>  >>  denote a number. If we interpret the double quotes as genuine
>>  >>  quotation then that is impossible: all such quotations denote strings.
>>  >
>>  >I am of the opinion that we do not want to interpret the lexical form
>>  >of the value in RDF itself. Thus, it remains a string. To some system
>>  >which understands the data type (lexical space and value space) associated
>>  >with the literal (locally or globally by rdfs:range) it may parse the
>>  >literal into an internal canonical representation of the value and treat
>>  >it as a number, perform operations on it, etc.
>>
>>  I think we are at cross purposes. I agree with you that RDF should
>>  not *alter* - parse, canonicalize, modify in any way - the literal
>>  strings. But that is not the issue. That suggestion has never even
>>  been mooted; there are no valid RDF inferences that would perform
>>  rewriting of literals, and RDF has no way to express the identities
>>  that would sanction such rewriting.  But that is not what I mean by
>>  'interpret' : I meant it in the model-theoretic sense.
>
>I did not suggest any modification of the literal, within RDF space.

I know. But you emphasized this point in a way that suggested that 
you thought that someone else was suggesting the opposite. My point 
was that nobody has made any such suggestion.

>
>>  >But RDF should *not* IMO suggest any kind of interpretation of literals
>>  >in any way. They should remain strings.
>>
>>  Of course they should *remain* strings. All textual labels are
>>  strings, in a sense: urirefs are strings. Anything that can be given
>>  a BNF syntax consists of strings of characters. But what the literals
>>  ARE, and what they MEAN, are two different questions. I am trying to
>>  have a conversation about the latter, not the former.
>
>But you have to have a conversation about both.

Thank you for your instruction in how to hold a conversation. I will 
try to bear it in mind in future.

>
>I take your statement to mean that the question of what literals
>ARE correlates to lexical space and what the literals MEAN correlates
>to value space

OK so far, though I'm not sure what "correlate" means as a verb; but....

>and that to state what a literal IS is to define
>an RDF class which indicates the lexical space governing its form
>and to state what a literal MEANS is to define an RDF class which
>indicates the value space into which it is mapped.
>
>Right?

No. At least I don't think so. I do not really know what you are 
saying here, to be honest.

>
>Most RDF classes denoting data types

Well hold on. It isn't clear that there are any RDF classes denoting 
datatypes, at present. In the S proposal, for example, datatype names 
are RDF property names, not class names. So I do not know from what 
population you are getting 'most' here.

>correlate to BOTH the lexical
>space and the value space. I'm sure that there are upper-ontologies
>of value spaces that are independent of lexical realization (perhaps
>defined by IEEE, etc.).
>
>The rdfs:subClassOf relation has, I believe, already been shown
>by my earlier examples to define a relation ONLY between value
>spaces.

Well, between classes, in the MT. Membership in such classes is fully 
determined by the value space, yes, since that is the class 
extension. But a class may have properties that are not fully 
determined by its extension.

>This means that, whether a given RDF class denotes both
>a value space and a lexical space, or only a value space, any
>inferences about the valid membership of a given typed value in
>a superordinate class has only to do with value space.

No. If a class *denotes* a lexical space, then membership in that 
lexical space will be relevant, obviously.

>
>This then means that lexical space is only relevant when actually
>parsing the RDF encoded literal in order to actually map it to
>an internal value in a given system/application.

No, that conclusion is unwarranted. The lexical space might be 
relevant in many circumstances, e.g. it might enable a reasoner to 
conclude that the literal value was in some class, or was identical 
to something else. The potential uses of information are many and 
various, and are not restricted to checking conformity to a data 
model for some typed application.
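To make that concrete, here is a minimal sketch of a reasoner using
nothing but the lexical space to draw conclusions about literals, without
handing them to any typed application. The lexical space (an optional
sign followed by decimal digits) and all of the function names are
assumptions chosen for illustration; they are not taken from any of the
datatyping proposals under discussion.

```python
import re

# Assumed lexical space for a hypothetical integer datatype:
# an optional sign followed by one or more decimal digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def in_lexical_space(literal: str) -> bool:
    """True if the literal string is a legal numeral of the assumed datatype."""
    return INTEGER_LEXICAL.fullmatch(literal) is not None


def same_value(a: str, b: str) -> bool:
    """Conclude whether two literal strings denote the same number."""
    if not (in_lexical_space(a) and in_lexical_space(b)):
        raise ValueError("literal is outside the assumed lexical space")
    return int(a) == int(b)


def is_non_negative(literal: str) -> bool:
    """Conclude class membership (non-negative integers) from the form alone."""
    if not in_lexical_space(literal):
        raise ValueError("literal is outside the assumed lexical space")
    digits = literal.lstrip("+-")
    # No minus sign, or a signed zero such as '-0' or '-000'.
    return not literal.startswith("-") or digits.strip("0") == ""
```

Under these assumptions '10' and '010' are different strings that denote
the same value, and '0x12' is simply not in the lexical space at all. The
literal strings themselves are never rewritten.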

>All other
>logical operations based on type, subClassOf, subPropertyOf
>and range need not worry about lexical space AS LONG AS when
>the time comes to interpret (i.e. parse/map) the literal by
>an application, it knows the lexical space for which it is
>defined.
>
>
>>  >
>>  >>  >So just how does a system *know* how to interpret
>>  >>  >'0x12' as an xsd:integer?
>>  >>
>>  >>  It doesn't, probably. Look, I'm not expecting miracles to occur. All
>>  >>  that the MT extension can do is to make sure that datatyping
>>  >>  information is tallied up with literal labels in a sensible way.
>>  >
>>  >Right. I agree. The key is to bind the data type class with the literal
>>  >in a persistent fashion (this includes both local and global typing)
>>  >and make sure that the application which interprets the literal gets
>>  >all the information it needs.

Wait: up to 'and make sure' then I agree, but that particular kind of 
application is only one possible use for information encoded in RDF, 
not the only possible use. Some applications may not 'interpret' the 
literal at all, but still may depend for their operation on the 
particular literal string. Cwm is an example.

>>
>>  Right. But once that binding is done successfully, should we say that
>>  the literal refers to itself, or to the value assigned to it by the
>>  datatyping scheme to which it is bound?
>
>Not at all. The binding, if it is based on the interpretation of
>subPropertyOf and subClassOf relations, is between a property and
>a literal (syntax) not between a concept and a value (semantics).

Part of the very idea of a literal (as opposed to a uriref) is that 
its semantic value depends only on its form and the datatyping scheme 
in use, and not on other aspects of the RDF interpretation. So if the 
literal itself is 'bound' to a datatyping scheme, then the semantic 
value of the literal is established. Now, the question arises, is 
that semantic value a string or (say) a number? The various proposals 
answer that question differently.
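A rough sketch of the point, assuming a datatyping scheme can be modelled
as a lexical-space test plus a lexical-to-value mapping. The names
Datatype, denotation, STRING and INTEGER are hypothetical and are not
drawn from any of the proposals. The value returned depends only on the
literal's form and on the scheme it is bound to.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Datatype:
    """A hypothetical datatyping scheme: a lexical space and a value mapping."""
    name: str
    in_lexical_space: Callable[[str], bool]
    lexical_to_value: Callable[[str], object]


def denotation(literal: str, datatype: Datatype) -> object:
    """Semantic value of a literal: fixed by its form and the bound scheme."""
    if not datatype.in_lexical_space(literal):
        raise ValueError(f"{literal!r} is not in the lexical space of {datatype.name}")
    return datatype.lexical_to_value(literal)


# Under this scheme every literal denotes itself, i.e. a string.
STRING = Datatype("string", lambda s: True, lambda s: s)

# Under this scheme a numeral denotes a number.
INTEGER = Datatype(
    "integer",
    lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
    int,
)
```

With these definitions, denotation("10", STRING) is the string '10' while
denotation("10", INTEGER) is the number 10. The same label is used in
both cases, and only the answer to the question of what it denotes differs.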

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Tuesday, 13 November 2001 21:31:49 UTC