Re: The emperor's new datatypes from Brian McBride on 2002-11-27 (w3c-rdfcore-wg@w3.org from November 2002)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Wed, 27 Nov 2002 14:55:28 +0000
To: Graham Klyne <GK@NineByNine.org>, RDF core WG <w3c-rdfcore-wg@w3.org>
Message-Id: <5.1.0.14.0.20021127144551.0334cf38@0-mail-1.hpl.hp.com>
I just peeked into the list after a couple of days of working on other 
stuff and this was the first message I saw.  I am responding to it without 
having seen other recent traffic.

Have we not discussed different ways of doing datatyping enough?

The issues that Graham addresses here are closed.  I hope the WG will have 
the self-discipline to recognise that, and that this will be the only 
response on the list to Graham's post.

Brian




At 13:23 27/11/2002 +0000, Graham Klyne wrote:

>I was going to hold my silence in support of WG consensus.  I still 
>support the WG consensus, but now the matter of datatyping has come up on 
>the list [1][2] I feel compelled to state my opinion lest silence be 
>regarded as agreement with some other view.
>
>[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Nov/0629.html
>[2] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Nov/0634.html
>
>I do not have new information, and I am not asking for any issue to be 
>reopened.  But I feel I should now put my opinion on the record.  I don't 
>plan to get into an argument about the rightness of my views, at least not 
>here, unless the group as a whole decides to reopen the datatyping design.
>
>...
>
>THE EMPEROR'S NEW DATATYPES
>---------------------------
>
>I believe we took a wrong turn when we did the datatyping design.  I don't 
>think the decision is technically fatal, but having had some time to stand 
>back and consider the larger picture, I believe we have introduced a lot 
>of unnecessary complexity.
>
>As an alternative, DanC proposed a system [3] of datatype "interpretation 
>properties" and constraints on string literals, which has the advantage of 
>being a much smaller change to RDF than the current proposal.
>
>[3] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Oct/0031.html
>
>The key decision we made was that literals are tidy.  Having made that 
>decision, I think the choice between ways to handle literals corresponding 
>to datatype values, such as integers, etc., was somewhat rushed.  In a 
>private message drafted about the time we took that decision I included an 
>analysis of the impact of the different tidy-literal options on CC/PP, 
>which I reproduce below, in which I conclude that Dan's proposal works 
>just as well for CC/PP as the current proposal.
>
>Later, I argue that I think even Dan's proposal might be slightly 
>simplified (in terms of adding less mechanism to RDF).
>
>...
>
>TIDY LITERALS IN CC/PP
>----------------------
>
>This analysis of changes to CC/PP is predicated on tidy literals being a 
>done deal.
>
>Dan's point, if I represent it correctly, is that we don't need typed
>literals because if we assume datatype properties relating values to
>lexical forms, we can use them as "interpretation properties".
>
>That is, the intent of the expression:
>
>    jenny age xsd:integer"10" .
>
>can be equivalently expressed as:
>
>    jenny age _:x .
>    _:x xsd:integer "10" .
>
>The values of IEXT(I(age)) that make the first form true are exactly those
>that make the second true.
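
The equivalence claimed above can be sketched in Python under a toy model. This is only an illustration, not anything from the proposal itself: the `LEXICAL_TO_VALUE` table, the tuple representation of triples, and all function names are assumptions made for the example, and only xsd:integer is modelled.

```python
import itertools

# Hypothetical lexical-to-value mappings; only xsd:integer is modelled here.
LEXICAL_TO_VALUE = {"xsd:integer": int}

_bnode_ids = itertools.count()


def expand_typed_literal(subject, prop, datatype, lexical):
    """Rewrite `s p datatype"lex" .` as `s p _:x . _:x datatype "lex" .`"""
    bnode = f"_:x{next(_bnode_ids)}"
    return [(subject, prop, bnode), (bnode, datatype, lexical)]


def value_of_typed_literal(datatype, lexical):
    """Value denoted by the typed-literal form."""
    return LEXICAL_TO_VALUE[datatype](lexical)


def value_via_interpretation_property(triples, subject, prop):
    """Follow `s p _:x`, then `_:x datatype "lex"`, and map lex to its value."""
    for s, p, o in triples:
        if s == subject and p == prop:
            for s2, datatype, lexical in triples:
                if s2 == o and datatype in LEXICAL_TO_VALUE:
                    return LEXICAL_TO_VALUE[datatype](lexical)
    raise LookupError(f"no interpretation property found for {subject} {prop}")


triples = expand_typed_literal("jenny", "age", "xsd:integer", "10")
direct = value_of_typed_literal("xsd:integer", "10")
indirect = value_via_interpretation_property(triples, "jenny", "age")
```

In this sketch both routes arrive at the same value for jenny's age, at the cost of one extra triple in the second form.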
>
>The only substantive objection to this that I can think of is the concern
>for triple-bloat; e.g. integer-valued properties now need two triples
>instead of one.  (I think that optimization is quite possible, but
>would be some extra work for implementers.)
>
>So if I now refer to my thoughts about redesigning CC/PP for use with tidy
>literals [4], where I wrote:
>
>[[
>   <prf:displayWidth
>      rdf:datatype="http://www.w3.org/2001/XMLSchema-datatypes#integer">604</prf:displayWidth>
>   <prf:displayHeight
>      rdf:datatype="http://www.w3.org/2001/XMLSchema-datatypes#integer">200</prf:displayHeight>
>]]
>
>I would instead write something like:
>[[
>           <prf:displayWidth>
>              <rdf:Description>
>                 <xsd:integer>604</xsd:integer>
>              </rdf:Description>
>           </prf:displayWidth>
>           <prf:displayHeight>
>              <rdf:Description>
>                 <xsd:integer>200</xsd:integer>
>              </rdf:Description>
>           </prf:displayHeight>
>]]
>
>or just:
>[[
>           <prf:displayWidth xsd:integer="604" />
>           <prf:displayHeight xsd:integer="200" />
>]]
>
>Syntactically, the first form is quite a big wrench for CC/PP and the
>second is no worse, indeed more attractive, than my previous redesign [4],
>but (on the assumption of literals always denoting string values) would
>only affect properties with non-string values.  Properties with string
>values would not be affected, and I *think* that's reasonably true of many
>of the UAProf attributes.
>
>Technically, as I indicated in [4], I think this kind of approach is the
>best way forward for CC/PP in RDF, and would, I believe, lay the
>appropriate groundwork for a full content negotiation framework based on
>CC/PP, using CONNEG-like ideas for capability matching [5], probably based
>on OWL vocabulary.
>
>[4] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Sep/0353.html
>[5] http://www.ietf.org/rfc/rfc2533
>
>...
>
>SIMPLIFYING DANC'S PROPOSAL
>---------------------------
>
>In [3], DanC proposed a new construct, rdfs:format for indicating the 
>intended form of literals used as objects of a property.
>
>[3] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Oct/0031.html
>
>I think a very similar effect can be achieved using other defined RDF 
>constructs:
>
>   my:age rdfs:format xsdt:integer .
>
>I think this could equivalently be expressed as:
>
>   my:age rdfs:range _:x .
>   xsdt:integer rdfs:range _:x .
>
>[I think someone else suggested this on the list, but I don't know where 
>so I'm unable to cite acknowledgement -- sorry.]
>
>which we can now express in RDF/XML thanks to the rdf:nodeId construct we 
>introduced.
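
The rewriting described above is mechanical, and a minimal Python sketch may make the shared blank node easier to see. The function name and the tuple representation of triples are assumptions for illustration only.

```python
def format_to_shared_range(prop, datatype, bnode="_:x"):
    """Rewrite `prop rdfs:format datatype .` as two rdfs:range triples
    that share one blank node, as in the example in the message."""
    return [
        (prop, "rdfs:range", bnode),
        (datatype, "rdfs:range", bnode),
    ]


rewritten = format_to_shared_range("my:age", "xsdt:integer")
```

The only link between the property and the datatype in the rewritten form is that both range triples point at the same blank node.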
>
>This is presented as a semantic constraint, but it seems quite plausible 
>to me that under the same conditions that we now call datatype entailment, 
>a generic RDF processor could recognize that:
>
>    ex:Jenny my:age "foo" .
>
>is not satisfiable under the known properties of xsdt:integer.  So while 
>this is strictly a semantic constraint, it could reasonably be detected and 
>flagged by an RDF parser, in a fashion similar to the way that C compilers 
>may perform flow analysis to detect and warn at compile time about the use of an 
>uninitialized variable.
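
The kind of parser-level warning described above could look roughly like the following Python sketch. It assumes a hypothetical table `FORMATS` recording which properties are constrained to which datatype, and it checks only the lexical form of xsdt:integer (an optional sign followed by decimal digits). It is a lint-style check, not a model-theoretic satisfiability test.

```python
import re

# Lexical form of an XML Schema integer: optional sign, then decimal digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")

# Hypothetical schema knowledge: my:age is constrained to xsdt:integer.
FORMATS = {"my:age": "xsdt:integer"}

# One lexical checker per known datatype; each returns None on failure.
LEXICAL_CHECKS = {"xsdt:integer": INTEGER_LEXICAL.fullmatch}


def check_triples(triples):
    """Return a warning for each literal that cannot be a lexical form
    of the datatype its property is constrained to."""
    warnings = []
    for subject, prop, literal in triples:
        datatype = FORMATS.get(prop)
        if datatype is None:
            continue  # no known constraint on this property
        if LEXICAL_CHECKS[datatype](literal) is None:
            warnings.append(
                f'{subject} {prop} "{literal}": not a valid {datatype} lexical form'
            )
    return warnings


bad = check_triples([("ex:Jenny", "my:age", "foo")])
good = check_triples([("ex:Jenny", "my:age", "10")])
```

Like a compiler warning, the check reports the problem without refusing the input: the triple is still parsed, but flagged.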
>
>...
>
>SUMMARY
>-------
>
>In summary, I think that the same effect that we currently achieve can 
>also be achieved with a considerable simplification of RDF.  I think the 
>main technical arguments against are efficiency-based, and I think that 
>(in time) smart implementations could overcome those by recognizing and 
>optimizing some common cases (cf. relational databases).
>
>I repeat: in this message I'm not asking the WG to change anything, but 
>stating a viewpoint for the record, because the topic has been mentioned.
>
>#g
>--
>
>[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Nov/0629.html
>[2] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Nov/0634.html
>[3] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Oct/0031.html
>[4] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Sep/0353.html
>[5] http://www.ietf.org/rfc/rfc2533
>
>
>
>-------------------
>Graham Klyne
><GK@NineByNine.org>
Received on Wednesday, 27 November 2002 09:54:03 UTC