from rdf-logic: Fwd: Input sought on datatyping tradeoff

>Date: Thu, 11 Jul 2002 19:47:24 +0100
>To: www-rdf-logic@w3.org
>From: Brian McBride <bwm@hplb.hpl.hp.com>
>Subject: Input sought on datatyping tradeoff
>
>
>The RDFCore WG is producing a proposal for how XML Schema datatypes 
>should be used in RDF.  We would like some guidance on a particular 
>tradeoff we have to make.
>
>The WG requests that you send your considered answers to 
>www-rdf-comments@w3.org.  Please can we have all responses by 26th 
>July 2002.  Questions and discussion should take place on this list.
>
>INTRODUCTION TO DATATYPES
>=========================
>
>Let's explain the basic ideas behind our approach to datatyping. 
>The aim is to define how datatype values, e.g. integers, dates, etc., 
>should be represented in RDF.  We are building on the XML Schema 
>datatypes specification.
>
>It is important in getting the semantics correct that we distinguish 
>between a datatype value, e.g. the integer 10 and a lexical 
>representation of the value, e.g. the string "10".
>
>We are proposing two principal idioms for representing datatyped 
>information.  The first looks like this:
>
>   <Jenny> <age>          _:a .
>   _:a     <xsdr:decimal> "10" .
>
>This can be written in RDF/XML like this.
>
>   <rdf:Description rdf:about="Jenny">
>     <foo:age xsdr:decimal="10"/>
>   </rdf:Description>
>
>Here the b-node _:a denotes the integer 10 which can be represented 
>in decimal form as the string "10".
>
>This idiom treats an XML Schema datatype as a mapping from a value 
>to a lexical representation of the value; this mapping is 
>represented in RDF by a property.
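
As a rough illustration of the first idiom, the following Python sketch treats the datatype property as a function from a lexical form to a value. The tuple representation of triples and the helper names (`decimal_lexical_to_value`, `bnode_value`) are assumptions made for this example only; they are not part of the WG proposal.

```python
from decimal import Decimal

# Hypothetical stand-in for the xsdr:decimal property: it relates a
# datatype value to a lexical representation of that value.
def decimal_lexical_to_value(lexical: str) -> Decimal:
    return Decimal(lexical)

# The two triples of the first idiom, with the b-node written as a string.
triples = [
    ("Jenny", "age", "_:a"),
    ("_:a", "xsdr:decimal", "10"),
]

def bnode_value(node, triples):
    """Return the value a b-node denotes, or None if no datatype property is attached."""
    for s, p, o in triples:
        if s == node and p == "xsdr:decimal":
            return decimal_lexical_to_value(o)
    return None

jenny_age = bnode_value("_:a", triples)
```

In this sketch the b-node stands for the number itself, so a different lexical form of the same number, such as "10.0", would yield an equal value.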
>
>We believe this idiom to be quite straightforward, but not 
>sufficient on its own, because it is common practice to write things 
>like:
>
>   <Jenny> <age> "10" .
>
>where the author of this fragment of RDF means to represent the fact 
>that Jenny's age is the number 10.  This is the second idiom, which 
>is where we need some guidance.
>
>
>SOME TEST CASES
>===============
>
>It is here that we need some advice, because we have a choice to 
>make in the way we define the formal semantics.
>
>A few simple test cases:
>
>Test A:
>
>   <Jenny> <ageInYears> "10" .
>   <John>  <ageInYears> "10" .
>
>Should an RDF processor conclude that the values of the ageInYears 
>property for Jenny and John are the same?
>
>There are variations on this test which should be considered before answering.
>
>Test A2:
>
>   <Jenny> <ageInYears> "10" .
>   <Jenny> <testScore>  "10" .
>
>Should an RDF processor conclude that the value of Jenny's 
>ageInYears property is the same as the value of Jenny's testScore 
>property?
>
>Test A3:
>
>   <Jenny> <ageInYears>   "10" .
>   <Film>  <title>        "10" .
>
>Should an RDF processor conclude that the value of Jenny's 
>ageInYears property is the same as the value of the Film's title 
>property?  If the value of the <ageInYears> property is an integer, 
>and the value of the <title> property is a string, they are not the 
>same thing and are thus not equal.
>
>The answer must be the same for all three of these A tests.
>
>These test cases relate only to the situation where there are no 
>range constraints on the properties.
>
>Now for a different kind of test.  How do the values of the two idioms relate?
>
>Test D:
>
>   <Jenny>      <ageInYears> "10" .
>   <ageInYears> <rdfs:range> <xsd:decimal> .
>
>   <John>  <ageInYears>   _:a .
>   _:a     <xsdr:decimal> "10" .
>
>Should an RDF processor conclude that Jenny and John have the same 
>age?  [Note: in this example the range constraint is expressed using 
>rdfs:range.  We may have to introduce a special datatyping range 
>property, but that is an independent detail for now.]
>
>It is not possible to have the answers to the A tests and Test D 
>both be yes.  Either the A's can be yes or D can be yes, but not 
>both.  We have to decide which of these is more important to have.
>
>
>WHY THESE TEST CASES MATTER
>===========================
>
>The formal semantics can define the meaning of a literal in one of 
>two ways, given:
>
>   <Jenny> <ageInYears> "10" .
>
>   tidy) the <ageInYears> property takes a value which is a numeral,
>   i.e. a string
>
>   untidy) the <ageInYears> property takes a value which is some
>   datatype value whose string representation is "10", but without
>   further information, such as a range constraint, we can't tell
>   exactly what the value is, e.g. the string might be in octal.
>
>If we choose the tidy option, the object of the statement is always 
>a string, which means that in:
>
>   <Jenny> <ageInYears> "10" .
>   <Film>  <title>      "10" .
>
>the values of the two properties are the same; they are both the STRING "10".
>
>If we choose the untidy option, the value of the object of the 
>statement is unknown from this statement alone; a range constraint 
>is required to determine the value from the literal string:
>
>   <Jenny>      <ageInYears> "10" .
>   <ageInYears> <rdfs:range> <xsd:decimal> .
>
>With a range constraint, we can know that the object of the property 
>is the integer 10.
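
A minimal Python sketch of how a range constraint could select the lexical-to-value mapping under the untidy option. The `ex:octal` datatype is hypothetical, introduced only to mirror the octal remark above; the dictionary of ranges and the `literal_value` helper are likewise assumptions of this example.

```python
from decimal import Decimal

# Lexical-to-value mappings, keyed by datatype name.
LEXICAL_TO_VALUE = {
    "xsd:decimal": Decimal,
    "ex:octal": lambda s: int(s, 8),   # hypothetical datatype, for illustration
}

def literal_value(prop, literal, ranges):
    """Return the value of a literal given the property's range, or None if unconstrained."""
    datatype = ranges.get(prop)
    if datatype is None:
        return None                    # value cannot be determined from the literal alone
    return LEXICAL_TO_VALUE[datatype](literal)

as_decimal = literal_value("ageInYears", "10", {"ageInYears": "xsd:decimal"})
as_octal = literal_value("ageInYears", "10", {"ageInYears": "ex:octal"})
unconstrained = literal_value("ageInYears", "10", {})
```

The same string "10" yields ten under the decimal range, eight under the hypothetical octal range, and no determinate value when no range is declared.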
>
>CONCLUSION
>==========
>
>To end then, please send a message to www-rdf-comments@w3.org (by 26 
>July 2002) indicating whether you believe it is more important to have 
>the answer to the A test cases be yes, or test case D be yes:
>
>Test A:
>
>   <Jenny> <ageInYears> "10" .
>   <John>  <ageInYears> "10" .
>
>Test D:
>
>   <Jenny>      <ageInYears> "10" .
>   <ageInYears> <rdfs:range> <xsdr:decimal> .
>
>   <John>  <ageInYears>      _:a .
>   _:a     <xsdr:decimal>   "10" .
>
>
>We would also like to know the reasons for this preference.
>
>Brian McBride
>on behalf of the RDFCore WG


-- 
Professor James Hendler				  hendler@cs.umd.edu
Director, Semantic Web and Agent Technologies	  301-405-2696
Maryland Information and Network Dynamics Lab.	  301-405-6707 (Fax)
Univ of Maryland, College Park, MD 20742	  240-731-3822 (Cell)
http://www.cs.umd.edu/users/hendler

Received on Friday, 12 July 2002 08:30:42 UTC