RE: heading toward datatyping telecon

> -----Original Message-----
> From: ext Brian McBride [mailto:bwm@hplb.hpl.hp.com]
> Sent: 06 November, 2001 20:00
> To: Stickler Patrick (NRC/Tampere)
> Subject: Re: heading toward datatyping telecon
> 
> 
> 
> 
> Patrick.Stickler@nokia.com wrote:
> 
> [...]
> 
> 
> > 
> > Because the RDF graph model preserves the lexical form of such
> > typed data values, and yet (in present form) does not make the
> > association between lexical form (literal) and data type
> > immutable, so to speak, there is no guarantee that a given
> > application finally getting that value (via e.g. an inference
> > engine that does no interpretation of data types) that it
> > will have all the information it needs to properly interpret
> > the value.
> 
> 
> A use case demonstrating the problem would be helpful all round.
> 
> Brian

OK, here's the input data:

<rdfs:Class rdf:ID="integer/>

<rdfs:Class rdf:ID="hexInteger">
   <rdfs:subClassOf rdf:resource="#integer"/>
</rdf:Class>

<rdf:Property rdf:ID="intProperty">
   <rdfs:range rdf:resource="#integer"/>
</rdf:Property>

<rdf:Property rdf:ID="hexProperty">
   <rdfs:range rdf:resource="#hexInteger"/>
   <rdfs:subPropertyOf rdf:resource="intProperty"/>
</rdf:Property>

<rdf:Description rdf:about="#foo">
   <hexProperty>0x12</hexProperty>
</rdf:Description>

The 'integer' lexical form requires decimal notation.
The 'hexInteger' lexical form requires hexidecimal notation.

--

Now, someone does a query which asks "what is the
value of the 'intProperty' property of '#foo'?" And
based on the subClassOf and subPropertyOf relations
defined in our scheme, we can deduce the following
binding:

<rdf:Description rdf:about="#foo">
   <intProperty>0x12</intProperty>
</rdf:Description>

We wouldn't expect the inference engine to be looking
at literals to ensure that they are valid lexical
forms, no? It has simply provided a fully logical 
binding of a value to a property.

Unfortunately, because the true data type of the literal
is not inseperable from the literal, we are left with
having only the range definition for the intProperty
property, which of course will try to interpret the literal
as if it is in decimal notation.

Boom!  ;-)

Now, if instead we had our data defined as follows:

<rdf:Description rdf:about="#foo">
   <hexProperty rdf:resource="xxx:hexint:0x12"/>
</rdf:Description>

then the logical binding of 

<rdf:Description rdf:about="#foo">
   <intProperty rdf:resource="xxx:hexint:0x12"/>
</rdf:Description>

poses no problems. The value itself, being encoded as a URV,
retains all necessary type information for interpretation
no matter where it is bound, and the binding is fully valid,
because the "Object" (with capital 'O') "xxx:hexint:0x12"
*is* a valid subinstance of 'integer' insofar as the value
space is concerned -- and all information is present for its
proper parsing into any canonical internal representation for
that space.

Now, one may argue (as I myself already have) that hexInteger
is not a "proper" data type, but only a notational variant 
of the integer data type -- but so long as RDF does not
constrain the definition of such type hierarchies which do
not maintain validity of lexical forms between sub and super
types, this problem remains, as illustrated above. And I'm
sure there are other examples of such problems where the data
type would be considered by all to be legit, though none come
to mind at the moment.

Patrick

Received on Wednesday, 7 November 2001 00:41:20 UTC