RE: datatypes and MT from Patrick.Stickler@nokia.com on 2001-11-13 (w3c-rdfcore-wg@w3.org from November 2001)

From: <Patrick.Stickler@nokia.com>
Date: Tue, 13 Nov 2001 21:53:49 +0200
To: phayes@ai.uwf.edu
Cc: w3c-rdfcore-wg@w3.org
Message-ID: <2BF0AD29BC31FE46B78877321144043162177D@trebe003.NOE.Nokia.com>
> -----Original Message-----
> From: ext Pat Hayes [mailto:phayes@ai.uwf.edu]
> Sent: 13 November, 2001 19:01
> To: Stickler Patrick (NRC/Tampere)
> Cc: w3c-rdfcore-wg@w3.org
> Subject: RE: datatypes and MT
> 
> 
> >  > -----Original Message-----
> >>  From: ext Pat Hayes [mailto:phayes@ai.uwf.edu]
> >>  Sent: 07 November, 2001 23:14
> >>  To: Stickler Patrick (NRC/Tampere)
> >>  Cc: Brian McBride; w3c-rdfcore-wg@w3.org
> >>  Subject: RE: datatypes and MT
> >>
> >>
> >>  >  > Pat shoeSize "10"
> >>  >>
> >>  >>  then I am saying that my shoe size is a character string, right?
> >>  >>
> >>  >>   From here on I will omit the double quotes to avoid confusion.
> >>  >
> >>  >Do you mean then
> >>  >
> >>  >   Pat shoeSize 10
> >>  >
> >>  >???
> >>  >
> >>  >Such that 10 is something other than a string/literal?
> >>
> >>  The label in the triple itself is a string, of course, just like all
> >>  the other labels. Urirefs are strings, too.  So '10' is a string
> >>  (which is also a numeral).  On the other hand, 10 is a number, as in
> >>  '10=6+4'. The question is whether the literal label *denotes* a
> >>  string or not. I want to be able to write a numeral and have it
> >>  denote a number. If we interpret the double quotes as genuine
> >>  quotation then that is impossible: all such quotations denote strings.
> >
> >I am of the opinion that we do not want to interpret the lexical form
> >of the value in RDF itself. Thus, it remains a string. To some system
> >which understands the data type (lexical space and value space) associated
> >with the literal (locally or globally by rdfs:range) it may parse the
> >literal into an internal canonical representation of the value and treat
> >it as a number, perform operations on it, etc.
> 
> I think we are at cross purposes. I agree with you that RDF should 
> not *alter* - parse, canonicalize, modify in any way - the literal 
> strings. But that is not the issue. That suggestion has never even 
> been mooted; there are no valid RDF inferences that would perform 
> rewriting of literals, and RDF has no way to express the identities 
> that would sanction such rewriting.  But that is not what I mean by 
> 'interpret' : I meant it in the model-theoretic sense.

I did not suggest any modification of the literal, within RDF space.

> >But RDF should *not* IMO suggest any kind of interpretation of literals
> >in any way. They should remain strings.
> 
> Of course they should *remain* strings. All textual labels are 
> strings, in a sense: urirefs are strings. Anything that can be given 
> a BNF syntax consists of strings of characters. But what the literals 
> ARE, and what they MEAN, are two different questions. I am trying to 
> have a conversation about the latter, not the former.

But you have to have a conversation about both.

I take your statement to mean that the question of what literals
ARE correlates to lexical space and what the literals MEAN correlates
to value space and that to state what a literal IS is to define
an RDF class which indicates the lexical space governing its form
and to state what a literal MEANS is to define an RDF class which
indicates the value space into which it is mapped.

Right?

Most RDF classes denoting data types correlate to BOTH the lexical
space and the value space. I'm sure that there are upper-ontologies
of value spaces that are independent of lexical realization (perhaps
defined by IEEE, etc.).

The rdfs:subClassOf relation has, I believe, already been shown
by my earlier examples to define a relation ONLY between value
spaces. This means that, whether a given RDF class denotes both
a value space and a lexical space, or only a value space, any
inferences about the valid membership of a given typed value in
a superordinate class has only to do with value space.

This then means that lexical space is only relevant when actually
parsing the RDF encoded literal in order to actually map it to
an internal value in a given system/application. All other
logical operations based on type, subClassOf, subPropertyOf
and range need not worry about lexical space AS LONG AS when
the time comes to interpret (i.e. parse/map) the literal by
an application, it knows the lexical space for which it is
defined.
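
A minimal sketch of that last step in Python, assuming a simplified
lexical space for each datatype. The name ex:hexInteger, the PARSERS
table and the to_value function are hypothetical and exist only for
illustration; the decimal rule is a simplification of xsd:integer.

```python
import re
from typing import Callable, Dict

# Simplified lexical spaces, for illustration only.
_DECIMAL = re.compile(r"[+-]?[0-9]+")
_HEX = re.compile(r"0x[0-9A-Fa-f]+")


def parse_decimal_integer(lexical: str) -> int:
    """Map a decimal numeral (lexical space) to an integer (value space)."""
    if not _DECIMAL.fullmatch(lexical):
        raise ValueError(f"{lexical!r} is not in the decimal lexical space")
    return int(lexical, 10)


def parse_hex_integer(lexical: str) -> int:
    """Map a 0x-prefixed numeral (lexical space) to an integer (value space)."""
    if not _HEX.fullmatch(lexical):
        raise ValueError(f"{lexical!r} is not in the hex lexical space")
    return int(lexical, 16)


# Hypothetical registry: datatype name -> lexical-to-value mapping.
PARSERS: Dict[str, Callable[[str], int]] = {
    "xsd:integer": parse_decimal_integer,
    "ex:hexInteger": parse_hex_integer,
}


def to_value(lexical: str, datatype: str) -> int:
    """The literal stays a string until an application asks for its value."""
    return PARSERS[datatype](lexical)
```

Both datatypes share the integers as their value space, so any
reasoning over the values is the same for either one. Only to_value
needs to know which lexical space the string was written in.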


> >
> >>  >So just how does a system *know* how to interpret
> >>  >'0x12' as an xsd:integer?
> >>
> >>  It doesn't, probably. Look, I'm not expecting miracles to occur. All
> >>  that the MT extension can do is to make sure that datatyping
> >>  information is tallied up with literal labels in a sensible way.
> >
> >Right. I agree. The key is to bind the data type class with the literal
> >in a persistent fashion (this includes both local and global typing)
> >and make sure that the application which interprets the literal gets
> >all the information it needs.
> 
> Right. But once that binding is done successfully, should we say that 
> the literal refers to itself, or to the value assigned to it by the 
> datatyping scheme to which it is bound? 

Not at all. The binding, if it is based on the interpretation of
subPropertyOf and subClassOf relations, is between a property and
a literal (syntax) not between a concept and a value (semantics).

> Should the literal "10", 
> known to be bound to the datatype xsd:integer, be taken to mean a 
> character string with two elements, or the number ten?

To RDF, it is a character string bound to a property. To some
application, it likely will be a value bound to a concept. But in
order for that literal string to be parsed and mapped to an
internal representation that the application can treat as a value,
the application must know the lexical space which the literal
form corresponds to, and that is defined in terms of the data
type either specified locally for the value or via rdfs:range
for the property of the original statement.

So in the latter case, if the binding process has lost the
connection between the literal and that original property,
the application cannot reliably interpret the literal as
it cannot be sure that the lexical space of the property to
which it is bound is a proper superset of the lexical space
of the original property -- though it can be sure that the
value spaces are compatible.
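
The failure case can be sketched in Python as follows. The properties
ex:size and ex:hexSize, their ranges, and every function name here are
hypothetical, chosen only to illustrate the point.

```python
from typing import Callable, Dict, NamedTuple


class TypedLiteral(NamedTuple):
    lexical: str            # the literal string, never rewritten
    asserted_property: str  # the property it is currently bound to


# Hypothetical rdfs:range information, reduced to a parser per property.
# Both ranges share the integers as value space but differ in lexical space.
RANGE_PARSER: Dict[str, Callable[[str], int]] = {
    "ex:size": lambda s: int(s, 10),     # decimal numerals
    "ex:hexSize": lambda s: int(s, 16),  # hexadecimal numerals
}

# Hypothetical rdfs:subPropertyOf assertion: ex:hexSize is a kind of ex:size.
SUB_PROPERTY_OF: Dict[str, str] = {"ex:hexSize": "ex:size"}


def value_of(lit: TypedLiteral) -> int:
    """Parse the literal using the lexical space of the bound property."""
    return RANGE_PARSER[lit.asserted_property](lit.lexical)


def rebind_to_superproperty(lit: TypedLiteral) -> TypedLiteral:
    """Bind the same string to the superproperty, dropping the original."""
    return TypedLiteral(lit.lexical, SUB_PROPERTY_OF[lit.asserted_property])


original = TypedLiteral("12", "ex:hexSize")
rebound = rebind_to_superproperty(original)
```

Here value_of(original) parses "12" as hexadecimal, while
value_of(rebound) parses the same string as decimal and yields a
different integer. The value spaces are compatible, but the string can
only be read correctly while the original property is still known.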

Patrick


> Pat
> -- 
> ---------------------------------------------------------------------
> IHMC					(850)434 8903   home
> 40 South Alcaniz St.			(850)202 4416   office
> Pensacola,  FL 32501			(850)202 4440   fax
> phayes@ai.uwf.edu 
> http://www.coginst.uwf.edu/~phayes
>
Received on Tuesday, 13 November 2001 14:54:09 UTC