Re: Input sought on datatyping tradeoff

At 19:03 11/07/2002 -0400, Thomas B. Passin wrote:

>[Brian McBride]
> >
>
> >    <Jenny> <ageInYears> "10" .
> >    <Jenny> <testScore>  "10" .
> >
> > Should an RDF processor conclude that the value of Jenny's ageInYears
> > property is the same as the value of Jenny's testScore property?
>
>I do not think this question is well posed.

Oh dear.  And I worked so hard to get it both clear and correct.

>  On what basis will an RDF
>processor use these literal values for anything?


Let me offer you two examples.

When building an implementation, an implementer needs a notion of 
identity/equality for literals.

Secondly, consider a query processor which queries an RDF graph based on 
subgraph match:

   ?x <ageInYears> "10" .
   ?x <testScore> "10" .

where the intent of the query is to find the x's for which the denotations 
of the values of the properties are the same.
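
To make the second example concrete, here is a minimal sketch of such a subgraph match in Python. The graph contents, the `match` and `unify` helpers, and the `?`-prefixed variable convention are all illustrative assumptions, not any existing processor's API. Note that it compares literals purely by their lexical form, which is exactly the notion of equality the question is about.

```python
# Hypothetical in-memory graph: a set of (subject, predicate, object) triples.
GRAPH = {
    ("Jenny", "ageInYears", "10"),
    ("Jenny", "testScore", "10"),
    ("John", "ageInYears", "10"),
    ("John", "testScore", "7"),
}


def unify(term, value, binding):
    """Match one pattern term against one triple term, extending `binding`."""
    if term.startswith("?"):          # a query variable such as ?x
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value              # constants must be lexically identical


def match(graph, patterns):
    """Return every variable binding under which all patterns occur in graph."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)  # copy, so a failed match is discarded
                if all(unify(t, v, candidate) for t, v in zip(pattern, triple)):
                    next_solutions.append(candidate)
        solutions = next_solutions
    return solutions


QUERY = [("?x", "ageInYears", "10"), ("?x", "testScore", "10")]
print(match(GRAPH, QUERY))
```

With this toy data only Jenny is returned. Whether that lexical match also says the two values denote the same thing is what the A tests ask.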


>   They cannot be used as the
>subjects of statements as things stand now.  RDF as a language does not
>really provide the ability to do anything except provide a graph or data
>store of triples.  To answer the question as posed, we must imagine some
>logic processor or query engine, or RDF processor with extensions.  Are we
>then trying to imagine what would support some "reasonable" set of
>processors? Or are we really talking only about "RDFS-aware" processors?

It's reasonable to take into account logic processors or query engines that 
will be built on RDF.  Difficulties can arise in discussion where these are 
hypothetical.  More weight should be given to such processors that have 
been or are being developed, but that should not preclude taking into 
account imagined processors.

> >
> > Test A3:
> >
> >    <Jenny> <ageInYears>   "10" .
> >    <Film>  <title>        "10" .
> >
> > Should an RDF processor conclude that the value of Jenny's age property is
> > the same as the value of the Film's title property?  If the value of the
> > <ageInYears> property is an integer, and the value of the <title> property
> > is a string, they are not the same thing and are thus not equal.
> >
> > The answer must be the same for all three of these A tests.
> >
>I do not see why this would be necessary

That's a fair point.  I agree we owe you an explanation of why we think 
this.  I will respond to that in a separate thread.

>  and I object to it.  If we are to
>think about comparing literals and concluding equality, I think this should
>only be allowed if the predicates have some relationship, like one being a
>subproperty of the other.  According to this, in test A3 the two would not
>be able to be compared, any more than two times can be compared if one of
>them is in UTC and the other does not specify its UTC status.
>
>Test A1 uses the same predicate and so could be compared.

We considered this, and talked ourselves out of it.

   http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jul/0011.html

However, even if we were to accept that A1 could be answered differently 
from the others, the crux of our question would remain.  Do you prefer A2 
and A3 to be yes, or D to be yes?

[...]

> > Now for a different kind of test.  How do the values of the two idioms
>relate?
> >
> > Test D:
> >
> >    <Jenny>      <ageInYears> "10" .
> >    <ageInYears> rdfs:range xsd:decimal .
> >
> >    <John>  <ageInYears>   _:a .
> >    _:a     xsdr:decimal   "10" .
> >
> > Should an RDF processor conclude that Jenny and John have the same
> > age?  [Note: in this example the range constraint is expressed using
> > rdfs:range.  We may have to introduce a special datatyping range property,
> > but that is an independent detail for now.]
> >
> > It is not possible to have the answers to Tests A and Test D both be
> > yes.  Either the A's can be yes or D can be yes, but not both.  We have to
> > decide which of these is the most important to have.
> >
>
>Why not both?

Again, we owe you an explanation for that.  In short, we have to decide 
whether the "10" always denotes a string (strictly a literal - they have a 
bit more structure), in which case test case D must be NO, or whether what 
it denotes is determined by some form of datatyping range constraint, in 
which case test case A, not having the range constraint, must be NO.
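
The two options can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the working group's model theory: the function names are hypothetical, Python's `Decimal` stands in for the xsd:decimal value space, and `None` stands for "the processor cannot tell what the literal denotes".

```python
from decimal import Decimal


def value_if_literals_are_strings(literal, declared_range=None):
    """Option 1: "10" always denotes the literal itself; ranges are ignored."""
    return literal


def value_if_range_decides(literal, declared_range=None):
    """Option 2: the denotation comes from a datatyping range constraint."""
    if declared_range == "xsd:decimal":
        return Decimal(literal)
    return None  # no range constraint known, so the denotation is undetermined


def entails_same(a, b):
    """A processor may conclude equality only when both values are known."""
    return a is not None and b is not None and a == b


# In Test D, John's age is the b-node _:a, which denotes the number itself.
JOHNS_AGE = Decimal("10")


def run_tests(value_of):
    """Return the (Test A, Test D) answers under one interpretation."""
    test_a = entails_same(value_of("10"), value_of("10"))
    test_d = entails_same(value_of("10", "xsd:decimal"), JOHNS_AGE)
    return test_a, test_d


print(run_tests(value_if_literals_are_strings))  # (True, False)
print(run_tests(value_if_range_decides))         # (False, True)
```

Under either function, one of the two answers comes out NO, which is the tradeoff in a nutshell.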

>  Test D is not at all the same kind of thing as Test A.  In
>Test A, we compare objects with the same or different predicates.  In Test
>D, we compare an object of a blank node where a second triple expresses a
>property or constraint on the object, with the object of another triple where
>yet another triple expresses a constraint on a predicate.  Test D is much
>more complex than Test A.  Because of that, I do not think the two are
>comparable, and so both could be "yes".

See above.


>Before settling on some answer, it would be best to spell out the semantics
>of the two Tests. You try to do that below for Test A, but say nothing about
>Test D.

Sorry for not being clear enough.  In test case D the b-node _:a denotes 
the *integer* 10, not the numeral "10".  Is that sufficient explanation?

>I do not think it is obvious how to relate a constraint on a
>predicate to a constraint on an object, that is why I ask for the semantics.
>If we knew the proposed interpretation of this comparison, it would be
>easier to assess the question.

Thanks for taking the time to address this, Tom.  Please bear in mind that 
(we hope) RDF will be used by folks who are not expert logicians, so we 
need solutions that are readily understandable by those who don't have a 
PhD in mathematical logic.

Brian

Received on Friday, 12 July 2002 04:15:10 UTC