datatypes message - draft 2

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Wed, 26 Jun 2002 18:38:20 +0100
Message-Id: <>
To: RDF Core <w3c-rdfcore-wg@w3.org>

This is a new draft of this message based on feedback received.  I'd like 
to approve this for sending at this Friday's telecon.

Changes are:

  o minor change to the INTRODUCTION
  o explicitly stated B and C must have the same answer
  o removed angle brackets from around qname representation of URIs
  o pointed out test cases B and C depend on no range constraints
  o added deadline for responses


The RDFCore WG is producing a proposal for how XML Schema datatypes should 
be used in RDF.  We would like some guidance on a particular tradeoff we 
have to make.

The WG requests that you send your considered answers to 
www-rdf-comments@w3.org, along with any comments, thoughts or questions you 
may have.  Please can we have all responses by 12 July 2002.


Let's explain the basic ideas behind our approach to datatyping.  The aim 
is to define how datatype values, e.g. integers, dates etc should be 
represented in RDF.  It is important in getting the semantics correct that 
we distinguish between a datatype value, e.g. the integer 10 and a lexical 
representation of the value, e.g. the string "10".

We are proposing two principal idioms for representing datatyped 
information.  The first looks like this:

   <Jenny> <age>          _:a .
   _:a     xsdr:decimal   "10" .

This can be written in RDF/XML like this:

   <rdf:Description rdf:about="Jenny">
     <foo:age xsdr:decimal="10"/>
   </rdf:Description>

Here the b-node _:a denotes the integer 10 which can be represented in 
decimal form as the string "10".

This idiom treats an XML Schema datatype as a mapping from a value to a 
lexical representation of the value; this mapping is represented in RDF by 
a property.
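
As a rough illustration of the first idiom, here is a minimal Python sketch. 
It is not part of the proposal: the triples-as-tuples representation, the 
DATATYPE_PROPERTIES table and the value_of_bnode helper are all assumptions 
made up for this example.  It shows a datatype property relating a b-node to 
a lexical form, and recovering the value from that lexical form.

```python
from decimal import Decimal

# Hypothetical table: datatype property -> parser for its lexical forms.
DATATYPE_PROPERTIES = {"xsdr:decimal": Decimal}

# The first idiom, written as (subject, property, object) tuples.
triples = [
    ("Jenny", "age", "_:a"),
    ("_:a", "xsdr:decimal", "10"),
]

def value_of_bnode(node, graph):
    """Return the datatype value a b-node denotes, or None if none is stated."""
    for subject, prop, obj in graph:
        if subject == node and prop in DATATYPE_PROPERTIES:
            return DATATYPE_PROPERTIES[prop](obj)
    return None

jenny_age = value_of_bnode("_:a", triples)
print(jenny_age)
```

In this sketch the b-node stands for the value, and the string "10" is only 
one way of writing that value down.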

We believe this idiom to be quite straightforward, but not sufficient on 
its own because it is common practice to write things like:

   <Jenny> <age> "10" .

where the author of this fragment of RDF means to represent the fact that 
Jenny's age is the number 10.  This is the second idiom.


It is here that we need some advice, because we have a choice to make in 
the way we define the formal semantics.

A few simple test cases:

Test A:

   <Jenny> <ageInYears> "10" .
   <John>  <ageInYears> "10" .

Should an RDF processor conclude that the values of the ageInYears 
property for Jenny and John are the same?

Test B:

   <Jenny> <ageInYears> "10" .
   <Jenny> <testScore>  "10" .

Should an RDF processor conclude that the value of Jenny's ageInYears 
property is the same as the value of Jenny's testScore property?

Note that this question only relates to the situation where there are no 
range constraints.  Given compatible range constraints on the properties, 
there is no difficulty concluding that the answer is yes.

Test C:

   <Jenny> <ageInYears>   "10" .
   <Film>  <title>        "10" .

Should an RDF processor conclude that the value of Jenny's ageInYears 
property is the same as the value of the Film's title property?  If the 
value of the <ageInYears> property is an integer, and the value of the 
<title> property is a string, they are not the same thing and are thus not 
equal.

Again this question only relates to the situation where there are no range 
constraints on the properties.  Given the appropriate range constraints on 
the properties, the answer is clearly no.

Test D:

   <Jenny>      <ageInYears> "10" .
   <ageInYears> rdfs:range xsd:decimal .

   <John>  <ageInYears>   _:a .
   _:a     xsdr:decimal   "10" .

Should an RDF processor conclude that Jenny and John have the same 
age?  [Note: in this example the range constraint is expressed using 
rdfs:range.  We may have to introduce a special datatyping range property, 
but that is an independent detail for now.]

It is not possible to have the answers to Test B, Test C and Test D all be 
yes.  B and C must also have the same answer.  Either B and C can be yes or 
D can be yes.  We have to decide which of these is the more important to 
have: (B and C) or D.


The formal semantics can define the meaning of a literal in one of two ways:

   tidy) the <ageInYears> property takes a value which is a numeral, i.e. a 
string.

   untidy) the <ageInYears> property takes a value which is some datatype 
value whose string representation is "10", but without further 
information, such as a range constraint, we can't tell exactly what the 
value is, e.g. the string might be in octal.

If we choose the tidy option, the object of the statement is always a 
string, which means that in:

   <Jenny> <ageInYears> "10" .
   <Film>  <title>      "10" .

the values of the two properties are the same; they are both the STRING "10".

If we choose the untidy option, the value of the object of the 
statement is unknown from this statement alone; a range constraint is 
required to determine the value from the literal string:

   <Jenny>      <ageInYears> "10" .
   <ageInYears> rdfs:range xsd:decimal .

With a range constraint, we can know that the object of the property is the 
integer 10.
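
To make the tradeoff concrete, here is a toy Python model of the two 
options applied to Tests B, C and D.  It is only a sketch, not the formal 
semantics: the PARSERS table and the function names are invented for this 
example, and it treats "the value cannot be determined" as a "no" answer.

```python
from decimal import Decimal

# Hypothetical lookup from datatype name to a parser for its lexical forms.
PARSERS = {"xsd:decimal": Decimal, "xsd:string": str}

def tidy_value(prop, literal, ranges):
    """Tidy: a literal always denotes the string itself; ranges are ignored."""
    return literal

def untidy_value(prop, literal, ranges):
    """Untidy: the value is known only when a range constraint names a datatype."""
    datatype = ranges.get(prop)
    return PARSERS[datatype](literal) if datatype else None

def same_value(v1, v2):
    """Answer yes only when both values are known and equal."""
    return v1 is not None and v2 is not None and v1 == v2

def run_tests(value_of):
    no_ranges = {}
    d_ranges = {"ageInYears": "xsd:decimal"}
    john_age = Decimal("10")  # value denoted by _:a via xsdr:decimal "10"
    return {
        "B": same_value(value_of("ageInYears", "10", no_ranges),
                        value_of("testScore", "10", no_ranges)),
        "C": same_value(value_of("ageInYears", "10", no_ranges),
                        value_of("title", "10", no_ranges)),
        "D": same_value(value_of("ageInYears", "10", d_ranges), john_age),
    }

tidy_answers = run_tests(tidy_value)
untidy_answers = run_tests(untidy_value)
print("tidy:  ", tidy_answers)
print("untidy:", untidy_answers)
```

In this model the tidy option answers yes to B and C but no to D, because 
the string "10" is not the number 10.  The untidy option answers yes to D 
but cannot answer yes to B or C without range constraints.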


To end then, please send a message to www-rdf-comments@w3.org (by 12 July 
2002) indicating whether you believe it is more important to have the answer 
to test case B be yes, or test case D be yes:

Test B:

   <Jenny> <ageInYears> "10" .
   <Jenny> <testScore>  "10" .

Test D:

   <Jenny>      <ageInYears> "10" .
   <ageInYears> rdfs:range xsd:decimal .

   <John>  <ageInYears>   _:a .
   _:a     xsdr:decimal   "10" .

We would also like to know the reasons for this preference.

Brian McBride
on behalf of the RDFCore WG
Received on Wednesday, 26 June 2002 13:39:23 UTC