Re: Input sought on datatyping tradeoff

Jonathan,

I'm having a little trouble interpreting your answer for the summary I'm 
currently trying to write.

My current interpretation of what you write is that you support the tidy 
position, i.e. that "10" always denotes the literal "10" in the model 
theory, and that you suggest that there should be extra interpretation 
functions which can map the literal to a value.
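
A rough sketch of that reading, in Python. The names `denotation`, `interpret` and `LEXICAL_TO_VALUE` are made up for this illustration and come from no RDF specification; the only point is that the literal denotes itself, and a value appears only when a datatype is supplied to a separate function.

```python
from decimal import Decimal

# Hypothetical lexical-to-value mappings, keyed by datatype name.
LEXICAL_TO_VALUE = {
    "xsd:decimal": Decimal,
    "xsd:string": str,
}


def denotation(literal):
    """Tidy reading: a literal always denotes itself, i.e. the string."""
    return literal


def interpret(literal, datatype):
    """Separate interpretation step: map the literal to a datatype value."""
    return LEXICAL_TO_VALUE[datatype](literal)
```

Under this sketch, denotation("10") is always the string "10", while interpret("10", "xsd:decimal") is a decimal number that is not equal to that string.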

I have a choice:

   o Do I interpret you as preferring the tidy option, given the choice 
put, and so prioritising the inappropriately named "duh entailment"?

   o Or do I conclude that I am not sufficiently certain of your response 
to count it one way or the other?

I have chosen the latter course, until I hear further from you.

Brian


At 08:53 12/07/2002 -0400, Jonathan Borden wrote:
>Brian McBride wrote:
>
> >
> > It is important in getting the semantics correct that we distinguish
> > between a datatype value, e.g. the integer 10 and a lexical representation
> > of the value, e.g. the string "10".
>
>Yes, "10" = "10"
>
> >
> > We are proposing two principal idioms for representing datatyped
> > information.  The first looks like this:
> >
> >    <Jenny> <age>          _:a .
> >    _:a     <xsdr:decimal> "10" .
> >
> > This can be written in RDF/XML like this.
> >
> >    <rdf:Description rdf:about="Jenny">
> >      <foo:age xsdr:decimal="10"/>
> >    </rdf:Description>
>
>right.
>
> >
> > Here the b-node _:a denotes the integer 10 which can be represented in
> > decimal form as the string "10".
>
>
> > We believe this idiom to be quite straightforward, but not sufficient on
> > its own because it is common practice to write things like:
> >
> >    <jenny> <age> "10" .
>
>The danger in interpreting this idiom in any way other than
>
>age = "10"
>
>is non-monotonicity. That is, in the absence of _some other triples_, i.e. a
>schema, the object of the age predicate is the literal string "10". Great
>care needs to be taken that any other triples which affect this equality or
>interpretation of the string are either _always_ present or _never_
>present/considered; otherwise the result is non-monotonic.
>
>That is if I know:
>
><jenny> <age> "10"
>
>no later information should change that fact or interpretation of that fact.
>
>
> >
> > A few simple test cases:
> >
> > Test A:
> >
> >    <Jenny> <ageInYears> "10" .
> >    <John>  <ageInYears> "10" .
> >
> > Should an RDF processor conclude that the value of the ageInYears
> > properties for Jenny and John are the same?
>
>yes.
>
> >
> > There are variations on this test which should be considered before
> > answering.
> >
> > Test A2:
> >
> >    <Jenny> <ageInYears> "10" .
> >    <Jenny> <testScore>  "10" .
> >
> > Should an RDF processor conclude that the value of Jenny's ageInYears
> > property is the same as the value of Jenny's testScore property?
>
>yes.
>
> >
> > Test A3:
> >
> >    <Jenny> <ageInYears>   "10" .
> >    <Film>  <title>        "10" .
>
>yes.
>
> >
> > Should an RDF processor conclude that the value of Jenny's age property is
> > the same as the value of the Film's title property?  If the value of the
> > <ageInYears> property is an integer, and the value of the <title> property
> > is a string, they are not the same thing and are thus not equal.
>
>where has it been monotonically defined to be an integer vs. string? That is
>the crux of the entire issue.
>
> >
> > The answer must be the same for all three of these A tests.
>
>agreed.
>
> >
> > Now for a different kind of test.  How do the values of the two idioms
> > relate?
> >
> > Test D:
> >
> >    <Jenny>      <ageInYears> "10" .
> >    <ageInYears> rdfs:range xsd:decimal .
> >
> >    <John>  <ageInYears>   _:a .
> >    _:a     xsdr:decimal   "10" .
> >
> > Should an RDF processor conclude that Jenny and John have the same
> > age?  [Note: in this example the range constraint is expressed using
> > rdfs:range.  We may have to introduce a special datatyping range property,
> > but that is an independent detail for now.]
>
>this _so far_ looks ok i.e. "yes"
>
> >
> > It is not possible to have the answers to Tests A and Test D both be
> > yes.  Either the A's can be yes or D can be yes, but not both.  We have to
> > decide which of these is the most important to have.
>
>why not? surely this is what the model theory is for, to _understand_
>that the <rdfs:range> property has a magic meaning.
>
>one could have two different types of equality -- string-eq and value-equal
>(a la LISP).
>
> >
> >
> > WHY THESE TEST CASES MATTER
> > ===========================
> >
> > The formal semantics can define the meaning of a literal in one of two
> > ways, given:
> >
> >    <Jenny> <ageInYears> "10" .
> >
> >    tidy) the <ageInYears> property takes a value which is a numeral, i.e.
> > a string
> >
> >    untidy) the <ageInYears> property takes a value which is some datatype
> > value whose string representation is "10", but without further
> > information, such as a range constraint, we can't tell exactly what the
> > value is, e.g. the string might be in octal.
> >
> > If we choose the tidy option, the object of the statement is always a
> > string, which means that in:
> >
> >    <Jenny> <ageInYears> "10" .
> >    <Film>  <title>      "10" .
> >
> > the values of the two properties are the same; they are both the STRING
> > "10".
> >
> > If we choose the untidy option, the value of the object of the statement
> > is unknown from this statement alone; a range constraint is required to
> > determine the value from the literal string:
> >
> >    <jenny>      <ageInYears> "10" .
> >    <ageInYears> <rdfs:range> <xsd:decimal> .
> >
> > With a range constraint, we can know that the object of the property is
> > the integer 10.
>
>again, you have two different tests:
>
>string-eq
>
>value-equal
>
>just distinguish between the two and let people/inferencing engines do what
>they want
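
For illustration, the two tests named above could be sketched in Python as follows. Everything here is an assumption made for the example: triples are plain tuples, the helper names (`object_of`, `string_eq`, `value_equal`) are hypothetical, and only two datatypes are handled. It is not how any RDF processor is specified to behave.

```python
from decimal import Decimal

# Triples as (subject, predicate, object); plain literals are Python strings.
GRAPH = {
    ("Jenny", "ageInYears", "10"),
    ("John", "ageInYears", "10"),
    ("Film", "title", "10"),
    ("ageInYears", "rdfs:range", "xsd:decimal"),
    ("title", "rdfs:range", "xsd:string"),
}

# Hypothetical lexical-to-value mappings for the two ranges used above.
PARSERS = {"xsd:decimal": Decimal, "xsd:string": str}


def object_of(graph, subject, predicate):
    """Return the object of the first matching triple, or None."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None


def string_eq(graph, a, b):
    """Compare the literals as strings; a and b are (subject, predicate)."""
    return object_of(graph, *a) == object_of(graph, *b)


def value_equal(graph, a, b):
    """Compare datatype values; None means a range or literal is missing."""
    values = []
    for subject, predicate in (a, b):
        datatype = object_of(graph, predicate, "rdfs:range")
        literal = object_of(graph, subject, predicate)
        if datatype not in PARSERS or literal is None:
            return None
        values.append(PARSERS[datatype](literal))
    return values[0] == values[1]
```

With this graph, string_eq holds for Jenny's age against both John's age and the Film's title, while value_equal holds only for the two ages. If the rdfs:range triple for ageInYears is dropped, string_eq is unaffected but value_equal can no longer give an answer.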
>
> >
> > CONCLUSION
> > ==========
> >
> > To end then, please send a message to www-rdf-comments@w3.org (by 26 July
> > 2002) indicating whether you believe it's more important to have the answer
> > to test cases A be yes, or test case D be yes:
> >
> >    Test A:
> >
> >    <Jenny> <ageInYears> "10" .
> >    <John>  <ageInYears> "10" .
>
>=> true (absolutely)
>
>otherwise you fail the "duh!" test.
>
>i'd like to say (functionally)
>
>eq( ageInYears(Jenny) , ageInYears(John) )
>
> >
> > Test D:
> >
> >    <Jenny>      <ageInYears> "10" .
> >    <ageInYears> <rdfs:range> <xsdr:decimal> .
> >
> >    <John>  <ageInYears>      _:a .
> >    _:a     <xsdr:decimal>   "10" .
> >
> >
>
>=> true (qualified)
>
>i'd say:
>
>value-equal( ageInYears(Jenny), ageInYears(John) )
>
>note that "value-equal" might be non-monotonic if the <rdfs:range> property
>got detached from the other triples -- but there is a danger of this type
>of behavior almost every time we depend on more than one triple for an
>inference!
>
>Jonathan

Received on Thursday, 18 July 2002 06:39:17 UTC