Re: Input sought on datatyping tradeoff

From: Brian McBride <bwm@hplb.hpl.hp.com>
Subject: Re: Input sought on datatyping tradeoff
Date: Fri, 12 Jul 2002 09:40:26 +0100

> At 19:24 11/07/2002 -0400, Peter F. Patel-Schneider wrote:
> >Unfortunately, I think that this request is incorrectly formulated.
> 
> Oh dear.  And I tried so hard to be both clear and correct :(
> 
> >I think this for several reasons.
> >
> >1/ The request does not mention some of the unusual aspects of XML
> >Schema datatypes, such as union datatypes and the ability to override
> >the normal typing of literals in union datatypes.  The presence of
> >these unusual features in XML Schema datatypes makes them much harder
> >to handle.
> 
> You are correct that the question does not mention these.  For now we have 
> only been considering primitive XML schema datatypes.  If you could show 
> that these other datatypes steer the decision decisively in one direction 
> or the other, that would be a great contribution.

If you are only considering primitive XML Schema datatypes then you need to
make this clear.

I don't know whether union types steer the decision in one direction or
another, but they do mean that subproperties can usefully have ranges that
have divergent lexical to value maps.
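
To make the point about divergent lexical to value maps concrete, here
is a rough sketch in Python.  It is only an illustration under stated
assumptions: the function names (integer_l2v, string_l2v, union_l2v)
are made up for this example, the integer check is a simplified version
of the xsd:integer lexical space, and the union simply tries its member
types in order.

```python
import re

# Simplified lexical space for an integer datatype (an assumption for
# this sketch): an optional sign followed by decimal digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def integer_l2v(lexical: str) -> int:
    """Hypothetical lexical-to-value map for an integer datatype."""
    if not _INTEGER.fullmatch(lexical):
        raise ValueError(f"not an integer lexical form: {lexical!r}")
    return int(lexical)


def string_l2v(lexical: str) -> str:
    """Hypothetical lexical-to-value map for a string datatype (identity)."""
    return lexical


def union_l2v(lexical: str, members=(integer_l2v, string_l2v)):
    """Map a lexical form through a union of datatypes.

    The member types are tried in order and the first one that accepts
    the lexical form supplies the value.
    """
    for member in members:
        try:
            return member(lexical)
        except ValueError:
            continue
    raise ValueError(f"no member type accepts {lexical!r}")


# The same lexical form denotes different values under the two members.
print(repr(integer_l2v("010")))  # 10
print(repr(string_l2v("010")))   # '010'
print(repr(union_l2v("010")))    # 10
print(repr(union_l2v("ten")))    # 'ten'
```

If a property has the union as its range, one subproperty could have the
integer type as its range and another the string type, and the same
lexical form "010" would then denote 10 under one and "010" under the
other.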

> >2/ The request begs question A by using semantically-loaded terms,
> >like ``ageInYears''.  This is only partly alleviated by stating that
> >the answer must be the same for question A, A2, and A3.  It would be
> >much better to use a property, like rdf:object, that does not have a
> >natural range type that is associated with it.  If rdf:object is not
> >used, then some other property without such a strong natural range
> >type should be used instead, perhaps even one like
> >``ageInYears-or-title''.
> 
> The use of these "semantically-loaded" terms is intended to make the issue 
> clear to folks without getting into mathematics.  I want input from folks 
> who would be put off by a technical mathematical discussion.  A3 for 
> example, is intended to get folks to ask themselves whether they really 
> think of the <ageInYears> property as denoting an integer or a string.
> 
> I take your point about using rdf:object though I'm leery of that because 
> we could (I believe would) get ourselves tangled up in issues around the 
> semantics of reification which would obscure rather than clarify the issue.
> 
> I agree with you though, that we should make the point to folks that an rdf 
> processor doesn't know <ageInYears> from <uuid:a;lskdjalkjd>.

Even so, having the question itself use ageInYears biases the results.

> 
> >3/ The request contains a number of unsupported assertions, including
> >``[t]he answer must be the same for all three of these A tests'' and
> >``[i]t is not possible to have the answers to Tests A and Test D both
> >be yes.''  I think that these assertions need justification.
> 
> I agree we owe an explanation of these assumptions.  I surely wish we could 
> square the circle and remove these constraints.  RDFCore has struggled with 
> datatypes for many months now.  We have not yet found a way and believe we 
> are forced into making this choice.
> 
> I'd be happy to review with you how we got here and will start a thread on 
> rdf-logic for that purpose.

I have been following the RDF Core mailing lists, but even so I was
surprised by these assertions.  They certainly are not obvious and,
further, appear on the surface to be wrong.

> >4/ The request does not describe the implications of answers, except
> >that a yes to Question A must also be a yes to Questions A2 and A3.
> >There are many implications of the answers to these questions, and
> >responses by responders who are not aware of the implications may
> >change if the responders are made aware of the implications of their
> >responses.
> 
> What implications do you have in mind?

Well, for example, how would this all impact a query system for RDF?  How
would it impact an extension to RDF, like OWL, that has a stronger notion
of equality than RDF does?

> >5/ The request does not indicate why other attractive solutions to
> >datatypes are not being considered.  One such solution would be to
> >require that all literals be typed, perhaps by using xsi:type
> >constructs.
> 
> Please could you construct a short description of this idea and send it to 
> rdf comments.

I will do so, but the basic idea is quite simple: require that all literals
in RDF graphs be typed, not just strings. If this is done, then there are
no semantic issues with respect to datatypes, only syntax issues.
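
A rough sketch of what I mean, in Python.  This is only an illustration
under stated assumptions: the TypedLiteral class, the value function, and
the short datatype names in the table are made up for this example and
are not part of any RDF or XML Schema API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLiteral:
    """A literal that always carries its datatype (hypothetical model)."""
    lexical: str
    datatype: str


# Hypothetical table of lexical-to-value maps, keyed by datatype name.
# int() is used as a stand-in for the integer datatype's map.
LEXICAL_TO_VALUE = {
    "xsd:integer": int,
    "xsd:string": str,
}


def value(literal: TypedLiteral):
    """Return the value denoted by a typed literal.

    The value depends only on the literal itself, never on the property
    it is used with or on any range constraint found elsewhere.
    """
    return LEXICAL_TO_VALUE[literal.datatype](literal.lexical)


ten_a = TypedLiteral("10", "xsd:integer")
ten_b = TypedLiteral("010", "xsd:integer")
ten_s = TypedLiteral("10", "xsd:string")

print(value(ten_a) == value(ten_b))  # True: same integer, different lexical forms
print(value(ten_a) == value(ten_s))  # False: an integer is not a string
```

Because the datatype travels with the literal, the only remaining
question is how to write the pair down in the syntax.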

> >I request that the request be reformulated to address my concerns and
> >then resent to the mailing lists.
> 
> I am hopeful that the request as formulated will lead to useful input to 
> the WG and would like to press ahead with it as is.  In particular, I aim to 
> get input from the wider community, including, but not only, from those who 
> don't have a PhD in mathematical logic.  DPH, please speak up.
> 
> Brian

peter

Received on Friday, 12 July 2002 09:24:51 UTC