Re: Semantics of non-datatyped literals: Rationale (version 1)

Brian--

I don't mean to overly-complicate things, but it's not clear to me that 
your expression of the issues captures all the variants involved (maybe 
you're trying to do one thing at a time?).  I may be wrong, but it seems 
to be that we also need to ask something like the following questions as 
well (or at least to have these kinds of questions in mind when deciding 
on our answers):

a.  what is the answer given the following RDF/XML

   <rdf:Description rdf:about="Jenny">
     <foo:age>10</foo:age>
   </rdf:Description>
   <rdf:Description rdf:about="John">
     <foo:age rdf:datatype="&xsd;integer">10</foo:age>
   </rdf:Description>

[in particular, does this result in "Jenny and John have the same age", 
"Jenny and John have different ages", or "I can't tell"]

b.  is the answer the same given the following RDF/XML

   <rdf:Description rdf:about="Jenny">
     <foo:age>10</foo:age>
   </rdf:Description>
   <rdf:Description rdf:about="Film">
     <foo:title>10</foo:title>
   </rdf:Description>

together with a schema declaration

<rdf:Description rdf:about="#age">
    <rdfs:range rdf:resource="&xsd;integer"/>
</rdf:Description>


--Frank


Brian McBride wrote:

> 
> Here is a writeup of the problem and the desiderata.
> 
> We've spent a lot of time banging heads on whether tidy or untidy is to 
> be preferred.
> 
> Please could we take a little time to take a fresh look at what we are 
> trying to accomplish and see whether there might be another solution 
> which would give more of what we want than either tidy or untidy.
> 
> I note that the main concerns I've seen so far from the untidy camp are:
> 
>   o redundancy of specifying datatype on every element
>   o easily adding datatype info to old data
> 
> Long range datatyping is a solution to meet those needs.  Perhaps they 
> can be met in another way that is compatible with the tidy entailment.
> 
> Brian
> 
> 
> The Issue
> =========
> 
> Given the following RDF/XML
> 
>   <rdf:Description rdf:about="Jenny">
>     <foo:age>10</foo:age>
>   </rdf:Description>
>   <rdf:Description rdf:about="John">
>     <foo:age>10</foo:age>
>   </rdf:Description>
> 
> Do Jenny and John have the same age?  It may appear obvious that they 
> do.  But consider a similar example:
> 
>   <rdf:Description rdf:about="Jenny">
>     <foo:age>10</foo:age>
>   </rdf:Description>
>   <rdf:Description rdf:about="Film">
>     <foo:title>10</foo:title>
>   </rdf:Description>
> 
> Though the title of the film and the age of Jenny are both written as an 
> rdf literal "10", it can be argued that the meaning of the statement 
> about Jenny's age is that her age is the integer 10, and the meaning of 
> the statement about the title of the film is that it is the string of 
> characters "10".  If that is what they mean then the age of Jenny is not 
> the same as the title of the film.
> 
> The formal definition of this question is whether, given the second of 
> the RDF fragments above, entails (implies) the following (expressed in 
> n-triples):
> 
>   <jenny> foo:age   _:l .
>   <film>  foo:title _:l .
> 
> 
> Desiderata
> ==========
> 
> This is an issue that requires a judgement between different options.  
> The WG has not found a solution to satisfy all desirable features. 
> Rather than trying to state precise requirements it is better to define 
> the considerations that bear on that judgement.
> 
> General
> -------
> 
> GD01: Inline literals - @@clarify what this means
> 
> GD02: Interoperability [Ed note:  I've added that one in the light 
> Patrick's proposal to say nothing]
> 
> 
> Use
> ---
> 
> UD01: Verbosity of expressing datatyped literals in the RDF/XML syntax.  
> [Note eventually rdf/xml will be written by tools, but for 
> bootstrapping, users see it]
> 
> UD02: CC/PP schema should not have to change
> 
> UD03: Support from RDF customers including daml/webont, rss, cc/pp, 
> dublin core, Adobe XMP, DMOZ, mozilla and Jena.
> 
> UD04: the ability to put, in a schema, constraints on the *lexical form* 
> of values (gravy, not requirement)
> 
> UD05: It should be easy to update legacy data with datatype information
> 
> UD06: Must be able to merge duplicate statements with the same literal 
> value as object (when there is an applicable range constraint and when 
> there is not)
> 
> UD07: Minimize the number of nodes and arcs to represent a datatype 
> value (scalability)
> 
> UD08: Support xml schema datatypes
> 
> UD09: Need mechanism to enable queries based on datatype values (before, 
> after, during re dates; lessthan, gtr than)
> 
> UD10: Global type inference  - this is a solution - need to figure out 
> the need
> 
> UD11: Backward compatibility for existing data and specs (dc, cc/pp, rss)
> 
> UD12: Capture as much of the information as possible in the RDF (e.g. if 
> its an integer, RDF should know its an integer)
> 
> UD13: It should be easy to explain to users.
> 
> 
> Implementation
> --------------
> 
> ID01: Minimize burden on implementors
> 
> ID02: Monotonic and sound model theory.  Complete inference is not 
> required.
> 
> ID03: Convincing evidence that the solution is implementable
> 
> ID04: Do not require implementations to maintain a hash table of literals.
> 
> UD05: not have to keep track of each different occurrence of some literal
> 
> UD06: Backward compatibility: existing implementations and applications 
> should be able to upgrade in a backward compatible way
> 
> 
> Process
> -------
> 
> PD01: Speed - the WG needs to finish soon
> 
> 
> Other
> -----
> 
> OD01: an endorsement of the practice of using datatype properties, i.e. 
> [ xsdt:date "2002-09-23"]. (gravy, not a requirement)
> 
> 


-- 
Frank Manola                   The MITRE Corporation
202 Burlington Road, MS A345   Bedford, MA 01730-1420
mailto:fmanola@mitre.org       voice: 781-271-8147   FAX: 781-271-875

Received on Monday, 30 September 2002 08:40:38 UTC