RE: Input sought on datatyping tradeoff

> ...
> It is important in getting the semantics correct that we distinguish 
> between a datatype value, e.g. the integer 10 and a lexical 
> representation of the value, e.g. the string "10".

It seems to me that RDF has tied its surface triple syntax so tightly
to the model theory, essentially using triples both for abstract and
concrete syntax, that it will be hard to differentiate 10 from "10",
other than via a bnode mechanism like the one you propose below.

> We are proposing two principal idioms for representing datatyped 
> information.  The first looks like this:
> 
>    <Jenny> <age>          _:a .
>    _:a     <xsdr:decimal> "10" .
> 
> This can be written in RDF/XML like this.
> 
>    <rdf:Description rdf:about="Jenny">
>      <foo:age xsdr:decimal="10"/>
>    </rdf:Description>

I think something like the above is what you must do.  

NOTE: As I understand it, to explicitly specify a type in XML you
use xsi:type.  E.g.

<rdf:Description rdf:about="Jenny">
  <foo:age xsi:type="xsdr:decimal">10</foo:age>
</rdf:Description>

The RDF/XML exchange syntax would need to explain that this creates a
bnode.  It would also create uglier triples:

    <Jenny>   <foo:age>     _:a .
    _:a       <value>       "10" .
    _:a       <xsi:type>    <xsdr:decimal> .

(Or something similar.)
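
To make the expansion above concrete, here is a rough Python sketch of
how a reader might turn an xsi:type-annotated property element into the
three triples.  The foo namespace URI, the bnode naming scheme and the
function name expand are my own assumptions for illustration; none of
this is defined by RDF/XML.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# The foo namespace URI is a placeholder chosen for this sketch.
FRAGMENT = """
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:foo="http://example.org/foo#"
                 rdf:about="Jenny">
  <foo:age xsi:type="xsdr:decimal">10</foo:age>
</rdf:Description>
""".strip()


def expand(xml_text):
    """Return triples for one rdf:Description (hypothetical expansion rule)."""
    counter = itertools.count()
    root = ET.fromstring(xml_text)
    subject = root.get("{%s}about" % RDF)
    triples = []
    for prop in root:
        local_name = prop.tag.split("}")[1]
        declared = prop.get("{%s}type" % XSI)
        literal = '"%s"' % prop.text
        if declared is None:
            # No xsi:type: the literal is attached directly.
            triples.append((subject, local_name, literal))
        else:
            # xsi:type present: route the literal through a fresh bnode.
            bnode = "_:b%d" % next(counter)
            triples.append((subject, local_name, bnode))
            triples.append((bnode, "value", literal))
            triples.append((bnode, "xsi:type", declared))
    return triples


for triple in expand(FRAGMENT):
    print(triple)
```

The point of the sketch is only that the same xsi:type attribute an XML
processor would read is carried over unchanged into the triples.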

I mention this because it seems that the following would be an
extremely desirable property:

  An XML processor reading an RDF/XML fragment should identify the 
  same XML datatype for the literal "10" as an RDF processor.

> ...
> Test A:
> 
>    <Jenny> <ageInYears> "10" .
>    <John>  <ageInYears> "10" .
> 
> Should an RDF processor conclude that the value of the ageInYears 
> properties for Jenny and John are the same?

Yes.

> ...
> Test A2:
> 
>    <Jenny> <ageInYears> "10" .
>    <Jenny> <testScore>  "10" .
> 
> Should an RDF processor conclude that the value of Jenny's 
> ageInYears property is the same as the value of Jenny's testScore 
> property?

Yes.

> ...
> Test A3:
> 
>    <Jenny> <ageInYears>   "10" .
>    <Film>  <title>        "10" .
> 
> Should an RDF processor conclude that the value of Jenny's age 
> property is the same as the value of the Film's title property?  If 
> the value the <ageInYears> property is an integer, and the value of 
> the <title> property is a string, they are not the same thing and 
> are thus not equal.

It is hard to imagine how these can be considered different.  If they
were, you would get no equality at all for literals that are not typed.

In XML you can use the schema definition to determine exactly what the
type of the datatype ground terms are.  Without a schema or explicit
xsi:type tags, these elements are just strings. (I think.)  Because a
schema is essentially a type definition, it can impose a type on an
input string in a natural way.  
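
As a small illustration of a type being imposed on an input string,
here is a Python sketch.  The table LEXICAL_TO_VALUE and the function
value_of are names I made up, and Python's Decimal and str are only
stand-ins for the XML Schema value spaces.

```python
from decimal import Decimal

# Hypothetical lexical-to-value mappings, one per datatype name.
LEXICAL_TO_VALUE = {
    "xsd:decimal": Decimal,
    "xsd:string": str,
}


def value_of(lexical, datatype=None):
    """Interpret a lexical form; with no declared type it stays a string."""
    if datatype is None:
        return lexical
    return LEXICAL_TO_VALUE[datatype](lexical)


# Untyped: plain string comparison.
print(value_of("10") == value_of("10"))
# Typed as decimal: two lexical forms, one value.
print(value_of("10", "xsd:decimal") == value_of("10.0", "xsd:decimal"))
# Typed as string: the two lexical forms are different values.
print(value_of("10", "xsd:string") == value_of("10.0", "xsd:string"))
```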

> The answer must be the same for all three of these A tests.  

Yes.

> These test cases only relate to the situation where there are no range
> constraints on the properties.  Now for a different kind of test.
> How do the values of the two idioms relate?
> 
> Test D:
> 
>    <Jenny>      <ageInYears> "10" .
>    <ageInYears> rdfs:range   xsd:decimal .
>    <John>       <ageInYears> _:a .
>    _:a          xsdr:decimal "10" .
> 
> Should an RDF processor conclude that Jenny and John have the same age?

No.

> ...
>    <jenny>      <ageInYears> "10" .
>    <ageInYears> <rdfs:range> <xsd:decimal> .
> 
> With a range constraint, we can know that the object of the property 
> is the integer 10.

If you provide range restrictions for <ageInYears> and <title> you can
deduce that "10" is an element of BOTH xsdr:decimal and xsdr:string
(given that you extend range restrictions to datatypes).

That presumably violates the semantics of XML Datatypes, and thus
requires your first approach.
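
The clash can be shown with a few lines of Python.  This is only a
sketch: the tuple representation of triples and the function name
inferred_types are assumptions of mine, and the rule applied is the
naive "a range restriction types the literal itself" reading discussed
above.

```python
# Triples as (subject, predicate, object); literals keep their quotes.
TRIPLES = [
    ("jenny", "ageInYears", '"10"'),
    ("movie", "title", '"10"'),
    ("ageInYears", "rdfs:range", "xsdr:decimal"),
    ("title", "rdfs:range", "xsdr:string"),
]


def inferred_types(triples):
    """Collect, per literal, every datatype implied by a range restriction."""
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    types = {}
    for s, p, o in triples:
        if p in ranges and o.startswith('"'):
            types.setdefault(o, set()).add(ranges[p])
    return types


for literal, datatypes in inferred_types(TRIPLES).items():
    # A literal with more than one inferred datatype is the problem case.
    print(literal, sorted(datatypes))
```

Because both triples share the one literal "10", it picks up both
datatypes.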

Of course we don't want to forget that '10' is just the lexical
representation of some value.  It might represent an integer, a binary
number, a float (10.0), hex, or a string of characters.  But it seems
very un-RDF-like to try to use the range restriction in the second
triple above to syntactically coerce "10" to 10.

While you could use the syntax above to show that "10" represents the
integer 10, it seems to me that you really need to insert a bnode to
get a value for <ageInYears> that can be determined to NOT be a
string, e.g. not equal to the movie's <title>.

    <jenny> <ageInYears>    _:a .
    _:a     <value>         "10" .
    _:a     <xsi:type>      <xsdr:decimal> .

    <movie> <title>         _:b .
    _:b     <value>         "10" .
    _:b     <xsi:type>      <xsdr:string> .

    <jenny> <fingerCount>   _:c .
    _:c     <value>         "10" .
    _:c     <xsi:type>      <xsdr:decimal> .

We can now prove that _:a NE _:b, since we hopefully have an axiom
that states that values of xsdr:decimal and xsdr:string are disjoint.
And we can provide axioms to support proving that _:a EQ _:c.  
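
A sketch of that reasoning in Python, under loud assumptions: the
disjointness axiom is modelled simply by comparing (datatype, value)
pairs, so values of different datatypes can never be equal, and the
names VALUE_SPACE, typed_value and same_value are hypothetical.

```python
from decimal import Decimal

# Stand-ins for the two value spaces (an assumption of this sketch).
VALUE_SPACE = {"xsdr:decimal": Decimal, "xsdr:string": str}

TRIPLES = [
    ("jenny", "ageInYears", "_:a"),
    ("_:a", "value", "10"),
    ("_:a", "xsi:type", "xsdr:decimal"),
    ("movie", "title", "_:b"),
    ("_:b", "value", "10"),
    ("_:b", "xsi:type", "xsdr:string"),
    ("jenny", "fingerCount", "_:c"),
    ("_:c", "value", "10"),
    ("_:c", "xsi:type", "xsdr:decimal"),
]


def typed_value(bnode, triples):
    """Return the (datatype, value) pair described by a bnode."""
    props = {p: o for s, p, o in triples if s == bnode}
    datatype = props["xsi:type"]
    return (datatype, VALUE_SPACE[datatype](props["value"]))


def same_value(x, y, triples):
    """Pairs with different datatypes are never equal (disjointness)."""
    return typed_value(x, triples) == typed_value(y, triples)


print(same_value("_:a", "_:b", TRIPLES))  # decimal vs. string
print(same_value("_:a", "_:c", TRIPLES))  # decimal vs. decimal
```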

One open question is whether "10" in the triple below is of 
type <xsdr:string>.

    <movie> <title> "10" .

- Mike

Michael K. Smith, Ph.D., P.E.
Member, W3C Web Ontology WG
EDS - Austin Innovation Centre
98 San Jacinto, #500
Austin, TX  78701

* phone: +01-512-404-6683
* mailto:michael.smith@eds.com
www.eds.com

Received on Monday, 15 July 2002 09:12:55 UTC