Re: datatyping

[Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, patrick.stickler@nokia.com]


----- Original Message ----- 
From: "ext pat hayes" <phayes@ai.uwf.edu>
To: <w3c-rdfcore-wg@w3.org>
Sent: 03 December, 2002 20:47
Subject: datatyping


> 
> Let me summarize a proposal for exactly what we should say about datatypes.
> 
> 1. A datatype is assumed to be identified by a uriref. The assertion
> 
> aaa rdf:type rdfs:Datatype .
> 
> is intended to be interpreted by a datatype-savvy RDF engine as an 
> indication that aaa is the uriref of a datatype, and that it is 
> appropriate to attempt to access the information associated with that 
> datatype. The exact form in which this information is to be provided 
> to an RDF engine should be specified as part of the API of any such 
> engine.
> 
> Such an assertion does not constitute a definition of a datatype. 
> There is no way to define a datatype in RDFS. Datatypes are defined 
> externally to RDFS.
> 
> 2. In order to be useful, some information about a datatype needs to 
> be provided to a datatype-savvy RDF engine. The information is of 
> various kinds, and some datatypes may provide only part of the 
> information. Insofar as information about the datatype is 
> unavailable, a datatype-savvy RDF engine will be able to draw only 
> the same conclusions as a non-datatype-savvy RDF engine. Or, if you 
> like, stated semantically, datatype entailment is defined relative to 
> the information provided by the datatype information source. If you 
> get more information, you can make more inferences; if you get none, 
> then the datatype adds nothing and you are just doing RDFS. That way, 
> RDFS entailment is like datatype entailment with an empty-information 
> datatype.
> 
> 3a. The minimal kind of information is a specification of which 
> literals are syntactically correct, i.e. in the lexical space of the 
> datatype, and which are not.
> This information being unobtainable for a resource which is asserted 
> to be in the class rdfs:Datatype may be considered an error condition.
> 3b. The second kind of information is a specification of which 
> literals map to the same value in the datatype. This information can 
> be conceptualized as a set of equations between typed literals with 
> the same type:
> "aaa"^^ddd = "bbb"^^ddd .

Fine so far...

> but it may also be provided, for example, by giving a mapping from 
> lexical forms to canonical lexical forms.

I'm uncomfortable with this statement. To me it suggests that a 
value space is optional and that a datatype can define only a 
lexical space and a canonical lexical space; and that is not correct.

It may be the case that "9" and "10" are canonical lexical forms
for xsd:integer, but these canonical forms are not members of the
value space, and e.g. ordered comparison of lexical forms is not the
same as ordered comparison of values -- i.e. 9 < 10 but "10" < "9".
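
To make the ordering point concrete, here is a minimal sketch. It 
assumes Python's int() is an acceptable stand-in for the xsd:integer 
lexical-to-value mapping (it is only an approximation), and the variable 
names are purely illustrative.

```python
# Canonical lexical forms of xsd:integer, held as plain strings.
lexical_forms = ["9", "10"]

# Assumption: int() approximates the xsd:integer lexical-to-value mapping.
values = [int(form) for form in lexical_forms]

# Strings are compared character by character, so "1" sorts before "9".
lexical_order = sorted(lexical_forms)

# Integers are compared numerically.
value_order = sorted(values)

print(lexical_order)  # order of the lexical forms
print(value_order)    # order of the values
```

The two orderings disagree, which is why a canonical lexical form 
cannot stand in for a member of the value space.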

We neither define nor care about canonical lexical forms in RDF
datatyping.

> 3c. The third kind of information is like 3b, but specifies 
> identities between forms under different datatypes:
> "aaa"^^ddd = "bbb"^^eee .
> This may be provided, for example, by giving schematic mappings 
> between canonical lexical forms of the different datatypes under 
> various boundary conditions.

Again, we need to speak in terms of values, not canonical lexical forms.
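
As a rough sketch of what "in terms of values" could look like for 3b 
and 3c: each typed literal is first mapped to a value, and two literals 
are identified only when those values are equal. The table 
LEXICAL_TO_VALUE and the helpers value_of() and same_value() are 
hypothetical names, not part of any RDF API, and int() and Decimal() 
only approximate the XML Schema lexical rules. The sketch also assumes 
that xsd:integer values can be compared directly with xsd:decimal 
values.

```python
from decimal import Decimal

# Hypothetical lexical-to-value mappings, keyed by datatype name.
# Both map into Decimal so that values of the two types are comparable.
LEXICAL_TO_VALUE = {
    "xsd:integer": lambda s: Decimal(int(s)),
    "xsd:decimal": Decimal,
}


def value_of(lexical_form, datatype):
    """Map a typed literal to a member of its datatype's value space."""
    return LEXICAL_TO_VALUE[datatype](lexical_form)


def same_value(lit_a, lit_b):
    """Identify two typed literals when they denote the same value."""
    return value_of(*lit_a) == value_of(*lit_b)


# 3b: same datatype, different lexical forms.
same_type = same_value(("010", "xsd:integer"), ("10", "xsd:integer"))

# 3c: different datatypes.
cross_type = same_value(("10", "xsd:integer"), ("10.0", "xsd:decimal"))
```

No canonical lexical form appears anywhere in the comparison; only the 
lexical-to-value mappings are consulted.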

> 3d. The fourth kind of information is subset relationships between 
> value spaces of different datatypes. This can be specified directly 
> by RDFS subclass assertions of the form
> ddd rdfs:subClassOf eee .
> 
> Information of type 3a enables inferences of the form
> 
> aaa ppp "xxx"^^ddd .
> -->
> aaa ppp _:x .
> _:x rdf:type ddd .
> 
> and hence is often sufficient to detect datatype clashes.
> 
> Information of type 3b enables inferences of the form
> aaa ppp "xxx"^^ddd .
> -->
> aaa ppp "yyy"^^ddd .
> 
> Information of type 3c enables inferences of the form
> 
> aaa ppp "xxx"^^ddd .
> -->
> aaa ppp "yyy"^^eee .
> 
> Information of type 3d allows RDFS class reasoning to support 
> inferences of the form
> 
> aaa ppp "xxx"^^ddd .
> -->
> aaa ppp _:z .
> _:z rdf:type eee .
> 
> --------
> 
> Is that OK?

Other than having to toss all mention of canonical lexical forms, it 
looks great.

Patrick

 
> Pat
> 
> -- 
> ---------------------------------------------------------------------
> IHMC (850)434 8903   home
> 40 South Alcaniz St. (850)202 4416   office
> Pensacola               (850)202 4440   fax
> FL 32501            (850)291 0667    cell
> phayes@ai.uwf.edu           http://www.coginst.uwf.edu/~phayes
> s.pam@ai.uwf.edu   for spam
> 

Received on Wednesday, 4 December 2002 06:18:36 UTC