- From: Michael Schneider <schneid@fzi.de>
- Date: Sat, 7 Dec 2013 01:07:58 +0100
- To: Pat Hayes <phayes@ihmc.us>
- CC: Guus Schreiber <guus.schreiber@vu.nl>, "public-rdf-comments@w3.org Comments" <public-rdf-comments@w3.org>
Dear Pat, Dear Working Group,

we had settled on treating ISSUE-165 during the CfI phase, and I wanted to first create my implementation report and find an opportunity to look more closely at the draft of the semantics before replying to the WG's answer. Here is my answer now.

Before I reply to the particular WG answers, I want to bring up another issue that I found only during the CfI phase. In my original LCWD comment, I had only briefly checked the precise changes concerning datatypes; my main argument was directed more against the change of the nomenclature and formal representation from a datatype map to a set of recognized IRIs. Now, after a more in-depth check, I have to say that I also have technical problems with this change.

Let's assume we have a semantic extension of D-RDFS, called "D-X", with several datatype IRIs in D:

    D := { xsd:string, xsd:integer, ... }

In the RDF 2004 spec, the analogous entailment regime would have been defined w.r.t. a datatype map D, which would be a set of /pairs/ (u,d), where u is an IRI and d a datatype. In our case, these datatypes d would be represented as references to the corresponding sections in the XSD Datatypes spec, which state the characteristic aspects of these datatypes, including their lexical spaces, value spaces, and the mapping from literals to values. In the RDF 2004 spec, both the datatype IRIs and their associated datatypes would be fixed for D-X. So for any D-X interpretation I, the denotation of u, I(u), equals d.

In contrast, in RDF 1.1, D would contain the IRI u instead of the pair (u,d), and, as D is a set of recognized IRIs, I would know that for any D-X interpretation I, there exists /some/ datatype d with I(u) = d. However, I would /not/ know what the datatype d is, except perhaps from additional information given in the handbook for D-X, but by means that are outside the RDF specification.
Concerning entailments, the way I originally read the new draft was that for a given semantic extension D-X, it is possible for a datatype IRI u in D to have different denotations (i.e. datatypes) under different D-X-interpretations I1 and I2, and, in fact, the actual datatype would be completely unspecified in this reading. This would then cancel out most datatype-related entailments compared to RDF 2004, in which for any pair (u,d) in a datatype map D of D-X, the denotation of u under any D-X-interpretation I would always be defined to be the same datatype, namely I(u) = d.

I am sure that such a reading is not what the WG intends, but the only sentence I could find about what might have been intended is in Chapter 7:

"""
We assume that a recognized IRI identifies a unique datatype wherever it occurs, and the semantics requires that it refers to this identified datatype.
"""

Now, this is an extremely vague and confusing sentence, and I still have no idea whether I understand it. Uniqueness with regard to what? What does it mean that "the semantics requires" something? The sentence should probably simply be dropped. But then, nothing else is being said about the datatypes associated with the recognized IRIs, and this would, of course, bring back my destructive reading above.

So, in my original reading, by replacing datatype maps with sets of recognized IRIs, half of the required information has been lost, or at least the explicit support by the specification has been removed. It is clear that from a simple set of IRIs alone, there is no way to know what each IRI denotes, and thus what the expected semantics of an RDF graph with literals is meant to be under the D-X semantics. Consequently, the documentation of D-X would have to come up with some custom means of saying what the IRIs in D denote. But then, there /would/ be the pairs of IRIs and datatypes again, at least in essence, just in a way unsupported by the spec.
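
To make the contrast concrete, here is a minimal Python sketch of the two representations. It is only an illustration of my original reading, not anything taken from either spec: the names Datatype, datatype_map, recognized_iris, conforms_to_map and recognizes are made up for this mail, and using Python's int and str as lexical-to-value mappings is a gross simplification of the XSD definitions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Datatype:
    """Hypothetical stand-in for a datatype; only the L2V mapping is modelled."""
    name: str
    lexical_to_value: Callable[[str], object]


# Simplified stand-ins (assumption: int/str approximate the XSD mappings).
xsd_integer = Datatype("integer", int)
xsd_string = Datatype("string", str)

# RDF 2004 style: a datatype map is a set of pairs (u, d).
datatype_map: Dict[str, Datatype] = {
    XSD + "integer": xsd_integer,
    XSD + "string": xsd_string,
}

# RDF 1.1 style: only the IRIs remain; the datatypes are no longer part of D.
recognized_iris: FrozenSet[str] = frozenset(datatype_map)


def conforms_to_map(interp: Dict[str, Datatype], dmap: Dict[str, Datatype]) -> bool:
    """2004-style condition: I(u) = d for every pair (u, d) in the map."""
    return all(interp.get(u) == d for u, d in dmap.items())


def recognizes(interp: Dict[str, Datatype], iris: FrozenSet[str]) -> bool:
    """Condition under my original reading: each IRI denotes *some* datatype."""
    return all(u in interp for u in iris)


def literal_value(interp: Dict[str, Datatype], lexical: str, iri: str) -> object:
    """Value of the typed literal (lexical, iri) under the interpretation."""
    return interp[iri].lexical_to_value(lexical)


# I1 agrees with the datatype map; I2 permutes the two datatypes.
i1 = dict(datatype_map)
i2 = {XSD + "integer": xsd_string, XSD + "string": xsd_integer}

v1 = literal_value(i1, "42", XSD + "integer")
v2 = literal_value(i2, "42", XSD + "integer")
```

In this sketch, both i1 and i2 pass recognizes, but only i1 passes conforms_to_map, and the literal "42"^^xsd:integer gets a different value under each. Ruling out i2 needs the pairs, which is exactly the information that a bare set of IRIs does not carry.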
I don't believe that it was really the intention of the WG to support such a source of confusion.

So much for the new point. Now to the particular WG answers (quoted by >), where I will come back to this and to my original argument.

> Regarding ISSUE-165, this matter was debated
> extensively within the WG, and most of your
> points were made during this discussion.
> (see http://lists.w3.org/Archives/Public/public-rdf-wg/2013Jun/0085.html
> and subsequent threads.)

First of all, I do not see in the cited mail exchange any discussion of my original argument that at least three other core Semantic Web standards, namely SPARQL 1.1, OWL 2, and RIF, reuse the original definition of RDF datatype maps, and that interoperability with these standards will therefore be directly affected. If you make the change in the RDF spec, then the current versions of these other specs will be bound to the old version of the RDF standard and will be formally incompatible with the current one. Even if the revised definition of datatype maps is intended to "mean basically the same thing", the other specifications will still be incompatible with the new definition in a strictly technical sense: they use a different formal representation and a different nomenclature for the associations of IRIs and their denoted datatypes, and so one will always have to explain the translation between the two formalisms.

And when the time comes for new revisions of these other specs, the other WGs will have to decide either to follow the new approach or to stick with the old one. From the point of view of the whole Semantic Web, the first option is of course what should be done, so by applying this change in the RDF spec, the RDF WG essentially forces the other specifications into the same change as well.
Hence, the RDF WG bears a high responsibility here and should make a change only when there is clear motivation for it, and when it can be foreseen that the change will be easily accepted by future WGs of the other specs. I neither see any clear motivation for the change, nor would I expect that such future WGs will easily accept it.

However, I can see that my new technical point given above had, in its essence, already been brought up by Antoine Zimmermann in the first point of his review cited above. As far as I was able to follow the heated discussion there, it goes pretty much in circles, and is more a series of attempts by each party to convince the other of its preferences, including BIG LETTERS, after which Antoine eventually gave in. So this is not quite what I would normally think of as an "extensive WG discussion".

Anyway, what I can see as the essence of this discussion is that you consider the change to be semantically compatible with the old version, and that it is meant to be only a small change. Even if I accept this (which would require me to adopt a different reading of the draft than the one I give above), it is still the case that you change the formal representation underlying datatype semantics from a set of pairs of IRIs and datatypes into a set of IRIs plus some additional text indicating how the association between these IRIs and their denoted datatypes is to be understood.

I do not consider this to be a small thing at all! To me, this is comparable to changing the syntax of, say, the assignment construct of a programming language from the widely used "reference=value" style into something where you just declare the reference, and require that these references get their value somehow, by a means which is outside the language spec.
You may argue that you can still write exactly the same kinds of programs with the revised language, which may really be the case, but at the price that any existing software written in this language will no longer compile under the revised version, any existing compiler needs to be rewritten, as does any textbook on that language, and all professional programmers have to learn the new construct, wasting some of their precious productive time. And after all, the change would be widely considered completely unnecessary, because the old construct worked perfectly well and was in wide use, while the new one may even lead to confusion.

Back to the change in RDF: if you really think that the semantic consequences are the same and that it is a minor change, then why make the change at all? In particular, given that such a change will break formal compatibility with other existing Semantic Web standards for no added value?

> The primary reason for the change was to simplify
> the presentation of the RDF semantics, which was
> an overarching goal of the WG.

The primary goal of any W3C WG should be to comply with the WG charter, which, in the case of the RDF WG, explicitly states that "changing the fundamentals of the RDF Semantics" is out of scope for the WG (Chapter 3). The scope of the RDF WG, according to the charter, was "to extend RDF to include some of the features that the community has identified as both desirable and important for interoperability based on experience with the 2004 version of the standard, but without having a negative effect on existing deployment efforts." Now see what you are about to do here: you want to change a basic formal aspect of the original RDF standard, which will break interoperability with several other core Semantic Web standards!

But let's talk about your argument of simplification. I do not agree that this change counts as a considerable simplification at all; rather the opposite.
I originally expected that the semantic conditions of datatype semantics, which have always been particularly easy to understand, would have changed as well. But, as I found, they are still essentially the same (modulo adjustments to the new notion of recognized IRIs). So all you really change here is to turn the original datatype map, which was a set of pairs consisting of an IRI and a datatype, into a set of IRIs with some additional text saying that the IRIs have to denote their corresponding datatypes somehow. So you have changed something that is represented in a very standard way and is perfectly clear into something that is certainly not clearer and, to me, as I stated above, even confusing. In any case, a change of this kind certainly does not justify a deviation from what has been used by several other Semantic Web standards.

> The actual mathematics has not altered, as the
> 2004 semantics required D-interpretation mappings
> to conform to the datatype map, so the datatype map
> is simply a part of (a restriction of)
> the interpretation mapping itself.

Even if I agreed that the current draft can be read this way, it is still the case that the formal representation has changed, which breaks interoperability with existing Semantic Web standards. And again, if there is really hardly any change, why do we need the change at all?

> Once this is recognized, it is clearly simpler to
> treat it in this way rather than as a separate mapping.

It should be clear by now that I disagree with this view. The original way was perfectly clear to me, while the new one is at the very least confusing to me. But, personal preferences aside, even if it really is a simplification, the simplification would be much too small to justify breaking interoperability with existing standards.

> In addition, it had been noted by several commentors
> that the 2004 definitions allowed for 'pathological'
> D mappings, such as one which permutes the meanings
> of the XSD datatype IRIs. It was felt that
> disallowing such maps was a laudable by-product
> of the change.

Now, this argument surprises me, and there are two answers to it.

Firstly, the problem cannot be that big, given that in the roughly 10 years since the original RDF standard, at least three other core SW standards have been written which reuse the original notion of datatype maps without problems, each taking years of specification work and building up considerable experience with these things. This is strong evidence to me that things are sufficiently fine with datatype maps.

As far as I am concerned myself, I have been responsible for editing one of these specifications (the OWL 2 RDF-Based Semantics), which makes heavy use of the original definitions of datatypes and datatype maps. I have provided technical advice to the editors of SPARQL Entailment Regimes and RIF RDF&OWL Compatibility, among other things with regard to datatype-related semantics. I have created several large test suites, which are partially about datatype semantics. I have created many formal proofs based on the datatype semantics of RDF. I have spent some time thinking about the implementation of datatype semantics in the past, although I have not yet implemented it in my RDF Semantics reasoner. And overall I have been working in the RDF field full-time, continuously, for the last 8 years up to this day. But in all these years, with all this experience concerning RDF Semantics in general and RDF datatype semantics in particular, I have never encountered any serious problems with the original notion of datatype maps. Rather, I have always found the original datatype semantics well designed, and it allowed me to do my work properly.
I would never have come to the conclusion that anything would require a change, in particular not a change of the kind proposed in RDF 1.1. For me, the old saying holds: "If it ain't broke, don't fix it!"

Secondly, whatever these unnamed commenters were concerned about, let me say that no change of the semantics whatsoever will save us from people doing strange or silly things with datatypes, if they really want to. I can easily do all kinds of crazy things in the 2004 spec as well as in the new draft, for example by applying owl:sameAs to two datatype IRIs with incompatible value spaces. So the "pathological" argument is most probably moot.

> We also note that this change does not alter any
> entailments.

Again, this depends on the reading of the current draft. In my reading, most datatype-related entailments would be removed. In the reading according to the discussion cited above, nothing would change semantically. Either way, no change should be made.

To summarize, even if I give in to the reading of the current draft as stated in the cited discussion thread, there is still the problem that a fundamental aspect of the old RDF model-theoretic semantics has changed in its nomenclature and formal representation, which is used in its original form by at least three other core standards of the Semantic Web. Further, even if I agree with the reading of the WG, I do not agree that there was any need for such a change, as the old spec was perfectly clear; this is confirmed by its use in several other standards that have been produced over the years, and by my own many years of experience in the matter. Nor do I agree that the given change is a simplification; rather, I consider it to be pretty confusing. In any case, I see no justification for letting this change break interoperability with three other Semantic Web standards, which is, of course, to me the most important reason to reject it.
But even if the WG still thinks that the change is appropriate, there is no urgency to apply it now. It can be postponed to a later WG, which would also allow for more discussion, in particular with regard to the other standards that use the original datatype semantics.

I therefore kindly ask the WG to revert the change and bring back the old notion of a datatype map consisting of pairs of IRIs and datatypes, with the necessary adjustments to the corresponding semantic conditions.

Best Regards,
Michael Schneider
Received on Saturday, 7 December 2013 00:08:25 UTC