- From: Ivan Herman <ivan@w3.org>
- Date: Tue, 25 Feb 2014 16:26:07 +0100
- To: Michael Schneider <schneid@fzi.de>
- Cc: "public-rdf-comments@w3.org Comments" <public-rdf-comments@w3.org>, W3C Chairs of RDF WG <team-rdf-chairs@w3.org>, Tim Berners-Lee <timbl@w3.org>
- Message-Id: <645973D9-1DD2-4784-AB8F-39E0792F971D@w3.org>
Dear Michael,

As you may have seen, the RDF 1.1 Recommendations have just been
published. The official mail sent to the AC representatives also
includes a reaction to this mail as follows:

[[[
As reported in the Call for Review, Michael Schneider raised an
objection that was not upheld by the Director. The objection was
reiterated in a public email [2]. The Director has discussed the
objection with the Working Group Chairs and the Team Contact, and
supports their determination that the Working Group has considered
the issue fully.
]]]

Ivan

On 09 Feb 2014, at 23:49, Michael Schneider <schneid@fzi.de> wrote:

> To the director of the W3C,
> to the chairs and W3C team members of the RDF Working Group,
> to the members of the RDF Working Group,
> and to anyone else to whom it may concern.
>
> This is a formal objection to a change made to the semantics of
> datatypes in the Proposed Recommendation of the RDF 1.1 Semantics.
> The change concerns the replacement of the original concept of
> a "datatype map" by the concept of a "set of recognized datatype
> IRIs". I will argue that this change is largely unmotivated and
> unnecessary, technically incompatible with the original concept,
> questionable and even flawed, and may lead to diverse problems
> for dependent Semantic Web standards and other dependent work.
> My proposal will be to revert the change to the original
> definition as of 2004 and to postpone further discussion of
> the change to a future RDF Working Group. This formal objection
> follows my reviews of earlier versions of the RDF 1.1 Semantics
> and my discussions with the RDF Working Group about the same
> topic, which did not lead to a satisfactory conclusion for me.
>
> Michael Schneider,
> Frankfurt am Main (Germany), 9 February 2014
>
>
> == Introduction ==
>
> This is a formal objection to a change made by the RDF Working
> Group to the semantics of datatypes in the RDF 1.1 Semantics
> compared to the original RDF Semantics specification as
> of 2004 (from now on called "RDF 2004") [01]. The formal
> objection targets the Proposed Recommendation (PR) of the
> RDF 1.1 Semantics [02], which still underwent some changes
> compared to the previous versions of the document, and which
> is now intended by the Working Group to become the final
> recommendation. The formal objection follows my reviews
> of earlier versions of the RDF 1.1 Semantics and my
> discussions with the RDF Working Group about the same
> topic [03][04], which did not lead to a satisfactory
> conclusion for me [05]. I have to point out that this formal
> objection is not made by an official W3C member organisation,
> and none of the organisations I am affiliated with
> or in some current relationship with is involved.
> Rather, the formal objection is made by me as a private person,
> and as a member of the informal Semantic Web community,
> who has considerably contributed to the Semantic Web initiative
> in the past and has a strong background and a stake
> particularly in the RDF Semantics; see the section
> "About the Author" for information about me.
>
> The change to which I formally object concerns the replacement
> of the original concept of a "datatype map" in Chap. 5 of [01]
> by the concept of a "set of recognized datatype IRIs"
> in Chap. 7 of [02]. In the original RDF 2004 Semantics,
> a datatype map has been a set of associations between datatype
> IRIs (originally URI references) and datatypes.
> In the RDF 1.1
> Semantics PR, there is now a "set of recognized datatype IRIs",
> that is, only the datatype IRIs, together with the additional
> requirement of the existence of a globally unique mapping
> between datatype IRIs and datatypes (where this unique mapping
> is not intended to be fully defined by the RDF 1.1 spec).
> I will describe the change in more detail in Section
> "Description of the Change".
>
> I will first argue, in Section "A Non-Editorial Change", that the
> change is not simply an editorial change, and will give arguments,
> in Section "Missing Motivation and Necessity for the Change", why
> I consider the change unmotivated and unnecessary. In Section
> "Technical Consequences of the Change", I will list what I
> consider the most relevant technical consequences of the change,
> and will also give examples for possible practical consequences.
> I will then, in Section "Consequences for dependent Semantic Web
> Standards and other Work", argue that the change may have
> unfortunate consequences for other existing Semantic Web Standards,
> which are based on RDF, such as OWL 2, SPARQL 1.1, and RIF,
> and may possibly lead to a split situation, where some future
> versions of these standards will adopt the change made in RDF
> while others may not.
>
> Finally, in Section "Conclusions and Proposal", I will summarize
> my arguments and argue that the consequences to be expected from
> the change are strong and undesirable, and would not exist if
> the original notion of datatype maps had been retained.
> Consequently, I will propose to revert the change to the original
> situation as of RDF 2004, and to postpone further discussion of
> the change to a future RDF Working Group.
>
>
> == Description of the Change ==
>
> RDF 2004 introduced the concept of a "datatype map", "being a set
> of pairs of a IRI and a datatype such that no IRI appears twice
> in D" (Chap.
> 5 of [01]; note: in order to ease the discussion,
> I use the term "IRI" everywhere, although the RDF 2004 spec used
> the term "URI reference" instead.) In the current PR of the
> RDF 1.1 Semantics, D is not a set of IRI-datatype pairs anymore,
> but a set of datatype IRIs only (Chap. 7 of [02]). It is also
> not called a "datatype map" anymore, but is now called a
> "set of recognized datatype IRIs".
>
> The RDF 1.1 Semantics further states that (a) "the semantics
> presumes that a recognized IRI identifies a unique datatype
> wherever it occurs", and (b) that "the exact mechanism by
> which an IRI identifies a datatype is considered to be
> external to the semantics" (beginning of Chap. 7).
> The second Change Note in Chap. 7 informally elaborates
> on this statement by saying that "the current semantics
> presumes that a recognized IRI identifies a unique datatype,
> this IRI-to-datatype mapping is globally unique and externally
> specified". In contrast, RDF 2004 did not require a globally
> unique association between datatype IRIs and datatypes.
> Rather, the definition of datatype maps made it possible to
> have IRI-datatype associations being unique only locally with
> regard to a particular datatype map D, or, likewise, locally
> unique to an entailment regime that uses a particular datatype
> map D.
>
> To illustrate the difference, consider the case of a custom
> definition of D-RDFS with D including a new custom datatype.
> In RDF 2004, it was possible to associate the same IRI
> with one datatype in one datatype map D1 and with a different
> datatype in another datatype map D2. For example, the IRI
> "ex:complex" may have been associated with a datatype
> representing the mathematical field of complex numbers
> in one extension of RDFS, and with a datatype representing
> four-dimensional composites of real numbers for the
> representation of space-time events in another extension
> of RDFS.
> Under the RDF 1.1 Semantics, which requires the
> existence of a globally unique IRI-datatype association,
> this will not be possible anymore (regardless of what the
> globally unique IRI-datatype association will look like,
> which is, as cited above, not fully determined by the
> RDF 1.1 standard).
>
> In addition, some of the semantic conditions related to
> the semantics of datatypes have been adjusted in order to
> reflect the change mentioned above on a technical level.
> In general, the semantic conditions now refer to applications
> of a given interpretation I to a datatype IRI aaa, "I(aaa)",
> instead of referring to the associated datatype by its
> reference given in the datatype map, as was done in RDF 2004.
> For example, compare the second of the Semantic conditions
> for datatype literals in Chap. 7 of [02] with the third of
> the General semantic conditions for datatypes in Chap. 5
> of [01].
>
> To summarize, the whole change here includes:
>
> * a change in nomenclature:
>   ("datatype map" vs. "set of recognized datatype IRIs");
>
> * a change in the formal representation of the objects
>   under consideration: a set of IRI-datatype pairs vs.
>   a set of IRIs only plus an additional globally unique
>   IRI-datatype association, together with adjustments
>   to the semantic conditions for datatype semantics;
>
> * a change to the scope of uniqueness of IRI-datatype
>   associations: this scope has been local to every
>   particular datatype map in RDF 2004 while being global,
>   and by intention mostly undetermined, in RDF 1.1.
>
>
> == A Non-Editorial Change ==
>
> It has been argued by the Working Group that the change is of
> a purely editorial nature. I would certainly not formally
> object to an editorial change, but consider this a non-editorial
> change. An editorial change would not change basic nomenclature
> or formal or technical aspects of a specification, and all this
> is the case here.
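
The difference in formal representation summarized in the quoted section "Description of the Change" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Datatype, D1, D2, GLOBAL_IRI_TO_DATATYPE, value_2004 and value_11, and both lexical forms chosen for "ex:complex", are hypothetical and appear in neither RDF specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Datatype:
    """Hypothetical stand-in for a datatype: a name plus a lexical-to-value mapping."""
    name: str
    parse: Callable[[str], object]


# Two different (assumed) datatypes that could both be named "ex:complex".
COMPLEX_FIELD = Datatype("complex numbers", lambda lex: complex(lex))
SPACETIME_EVENT = Datatype(
    "space-time events",
    lambda lex: tuple(float(part) for part in lex.split(",")),
)

# RDF 2004 style: a datatype map is a set of (IRI, datatype) pairs in which
# no IRI appears twice, modelled here as a dict. Each map carries its own
# association, so D1 and D2 may disagree about "ex:complex".
D1: Dict[str, Datatype] = {"ex:complex": COMPLEX_FIELD}
D2: Dict[str, Datatype] = {"ex:complex": SPACETIME_EVENT}


def value_2004(datatype_map: Dict[str, Datatype], iri: str, lexical: str) -> object:
    """Look the datatype up in the given datatype map, then apply its mapping."""
    return datatype_map[iri].parse(lexical)


# RDF 1.1 style: D is only a set of recognized IRIs. The IRI-to-datatype
# association is a single global mapping assumed to be fixed elsewhere,
# so only one of the two datatypes above can be chosen for "ex:complex".
GLOBAL_IRI_TO_DATATYPE: Dict[str, Datatype] = {"ex:complex": COMPLEX_FIELD}
RECOGNIZED: FrozenSet[str] = frozenset({"ex:complex"})


def value_11(recognized: FrozenSet[str], iri: str, lexical: str) -> object:
    """Check that the IRI is recognized, then consult the one global mapping."""
    if iri not in recognized:
        raise KeyError(f"{iri} is not a recognized datatype IRI")
    return GLOBAL_IRI_TO_DATATYPE[iri].parse(lexical)
```

In this sketch, value_2004(D1, "ex:complex", "1+2j") and value_2004(D2, "ex:complex", "0,1,2,3") can both be evaluated, each against its own map, whereas value_11 has no parameter through which a second association for the same IRI could be supplied.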
>
> Firstly, as stated above, the change introduced a change in
> nomenclature from the notion of a "datatype map" to the notion
> of a "set of recognized IRIs". Secondly, there have been some
> changes to the underlying formal representation, as listed above.
> Thirdly, and most notably in my opinion, the scope of
> uniqueness of IRI-datatype associations has changed. This change
> does have measurable effects, as I have already pointed out
> by my example above where the same IRI "ex:complex" is used
> for different datatypes: this is clearly possible in RDF 2004,
> but will not be possible anymore in RDF 1.1.
>
> Another way of looking at the question whether a change to a
> specification is editorial or not is to check whether existing
> dependent work, such as other specifications, scientific papers,
> or text books, would need to be updated in non-trivial ways
> in order to be in line again with the changed specification.
> For the change here, it becomes clear that dependent work
> needs to be updated concerning the same things that have
> changed in the RDF specification. For example, a text book
> that is of a more formal nature would probably need to change
> its used nomenclature from "datatype maps" to "sets of
> recognized IRIs", its basic definitions from sets of
> IRI-datatype pairs to sets of IRIs, and would need to reflect
> the change in the scope of uniqueness of the IRI-datatype
> associations.
>
> Based on these arguments, I conclude that the change is
> clearly non-editorial.
>
>
> == Missing Motivation and Necessity for the Change ==
>
> A non-editorial change in a specification requires good
> motivation, and this is particularly true in the case of
> RDF and its semantics, for which the charter of the
> RDF 1.1 Working Group [06] explicitly requires that
> "changing the fundamentals of the RDF Semantics" is
> out of scope for the WG (Chapter 3).
> Based on my arguments
> given below in the text concerning technical consequences
> of the change, I consider the change to be indeed a change
> of the fundamentals of the RDF semantics, and thus in
> conflict with the charter.
>
> In general, the scope of the RDF WG was held deliberately
> conservative. According to its charter, the scope was
> "to extend RDF to include some of the features that the
> community has identified as both desirable and important
> for interoperability based on experience with the 2004
> version of the standard, but without having a negative
> effect on existing deployment efforts." However, I am not
> aware of any input from outside the working group during
> the past 10 years since RDF 2004 became a recommendation
> that would have asked for a change of the semantics
> concerning the concept of datatype maps, or would have
> indicated any problems with this concept. Rather, within
> the previous years, at least three other core Semantic Web
> standards have been written (OWL 2, SPARQL 1.1, and RIF),
> which reuse the original notion of datatype maps without
> any known problems, each taking years of specification
> work and building up considerable experience with these
> things. I am also not aware of any discussion concerning
> problems with datatype maps from either the workshop or
> the questionnaire that had preceded the initiation of
> the Working Group.
>
> As far as I am concerned myself, I have been responsible
> for editing one of the mentioned dependent standards
> (the OWL 2 RDF-Based Semantics), which makes heavy use of
> the original definitions for datatypes and datatype maps.
> I have also provided some technical support (both in
> private and public conversation) to the editors of
> SPARQL 1.1 Entailment Regimes and RIF RDF&OWL Compatibility
> with regard to the RDF semantics in general and to datatype
> related semantics in particular.
> I have further created
> several large test suites, which are to a large extent
> about datatype semantics. I have created many formal proofs
> based on the datatype semantics of RDF. I have spent some
> time thinking about the implementation of datatype semantics,
> although it is not yet implemented in my RDF Semantics reasoner
> called Swertia. And overall I have been working in the
> RDF field full-time continuously for the last 8 years up to
> the day. But in all these years with all this gained
> experience concerning the RDF Semantics in general and
> RDF datatype semantics in particular, I have never
> encountered any serious problems with the original notion
> of datatype maps. Rather, I have always found the original
> datatype semantics well designed and it allowed me to do my
> work decently. I would never have come to the conclusion that
> anything would require a change, in particular not a change
> of the kind proposed in RDF 1.1.
>
> In fact, from my earlier discussion with the Working Group
> it became apparent to me that the change was not based on
> input from the outside, as was requested by the charter,
> but only from within the Working Group. In the context of the
> charter, this would only have been acceptable if there was a
> strong reason, such as a so far unnoticed bug. The actual
> rationale of the Working Group was then to simplify the current
> presentation of the RDF semantics [07]. Having given my arguments
> above about the complete lack of request for a change and the
> large amount of work that has been carried out without problems
> based on the original definitions, it should be clear that I do
> not see any reason here for any form of simplification with regard
> to the original situation. But of even more relevance is
> that the changes have not really "simplified" the situation,
> but have rather changed the situation and introduced
> significant technical problems, as I will point out in
> the following section.
>
>
> == Technical Consequences of the Change ==
>
> Probably the most notable technical aspect of the change
> is that it is now assumed by the RDF 1.1 Semantics that there
> exists a globally unique IRI-datatype association, which is
> to be applied for each set D of recognized IRIs (as an
> integral part of an interpretation I). In comparison, no
> such unique IRI-datatype association was assumed in RDF 2004,
> but the concept of datatype maps allowed different
> datatype maps to share the same IRI while associating it with
> different datatypes. Further, the RDF 1.1 Semantics PR
> does not define this globally unique IRI-datatype association,
> but considers its definition to be external to the semantics,
> except for a small number of datatype IRIs from the XSD
> namespace. This difference has a number of considerable
> technical consequences.
>
> The first technical problem is that the change strongly reduces
> the number of possible constellations of IRI-datatype associations:
> In RDF 2004, for any set of IRIs i1,...,in there were, in principle,
> infinitely many possible datatype maps D = { (i1,d1), ..., (in,dn) }.
> In RDF 1.1, however, the associated datatypes d1,...,dn are uniquely
> determined to be those from the globally unique IRI-datatype
> association, which means that there is only a single such IRI-datatype
> association for the given set of IRIs.
>
> An example for a possible practical consequence, which I have
> already mentioned earlier, is that of two entailment regimes
> sharing the same datatype IRI "ex:complex", but associated with
> different datatypes, namely the mathematical field of complex
> numbers on the one hand, and a set of compounds of four real numbers
> to represent space-time events on the other. In general, it should
> be expected that in certain fields custom datatypes will be developed
> and used, without the need to wait for an international
> standardisation of an IRI.
> The problem here is that if such a situation of concurrent
> IRI-datatype associations occurs, at least one of the entailment
> regimes will not be compliant with the RDF 1.1 standard anymore,
> due to the fact that the RDF 1.1 standard demands that there is
> a globally unique datatype associated with any given datatype IRI.
> While this will hardly stop organisations from still developing
> and using their custom datatypes, the situation is annoying and
> undesirable, and it could trivially be avoided by sticking with
> the original concept of datatype maps from RDF 2004.
>
> The second technical problem is that, as the RDF 1.1 Semantics PR
> neither provides nor asks for an explicit definition of the globally
> unique IRI-datatype association, the task of proving certain
> semantic properties, such as the soundness and completeness
> of reasoning algorithms or reasoning tools, may become problematic
> or even impossible. For example, if we have some reasoner R that
> accepts pairs of RDF graphs and outputs boolean values,
> and we ask whether R is sound and complete with regard to D-RDFS,
> for D including the datatype IRI "ex:complex", how can we prove
> or disprove whether this semantic property holds for R or not?
> As mentioned earlier, there may be more than one obvious datatype
> associated with "ex:complex", and unless we know the "right" one,
> we simply cannot start proof work.
>
> This has not been a problem in RDF 2004, where the proof work
> would have been done with regard to D-RDFS having an explicitly
> defined datatype map D, which would have included a reference to
> the datatype associated with "ex:complex". In fact, it would have
> been possible to have D1-RDFS and D2-RDFS, both including the
> IRI "ex:complex" but with different associated datatypes. R would
> then, perhaps, have been sound and complete w.r.t. D1-RDFS but not
> w.r.t.
> D2-RDFS, but, in any case, the proof work would have been
> possible technically and its result would have been perfectly
> determined.
>
> The third technical problem is that the assumption of the existence
> of a globally unique set of IRI-datatype associations, which is
> nevertheless completely open to an externally provided definition,
> breaks, strictly speaking, or at least "confuses" the RDF Semantics.
> As there are
> no further limitations on the set of IRIs for which there can
> be associated datatypes, there may be a datatype for
> /every possible/ IRI, including every IRI defined for other
> purposes by the RDF Semantics itself or elsewhere in the
> Semantic Web. Hence, for any given D interpretation I and
> any given IRI aaa, there exists some datatype d such that
> I(aaa) = d. This horrible semantic consequence was certainly
> not intended by the Working Group, but it is a consequence of
> missing restrictions on the set of IRIs allowed to act as
> datatype IRIs. However, I cannot imagine any meaningful constraint
> on the names of datatype IRIs, so this problem will hardly be
> eliminated by adding whatever constraint. Again, this problem
> has not existed in RDF 2004, since there has not been such an
> assumption about a globally unique but undetermined IRI-datatype
> association.
>
>
> == Consequences for dependent Semantic Web Standards and other Work ==
>
> For existing Semantic Web standards that depend on the
> RDF semantics and specifically on the original notion
> of datatype maps, the change will mean that these standards
> are not fully aligned anymore with the new version of RDF.
> The most important standards that are directly affected
> in this way are:
>
> * OWL 2, specifically the OWL 2 RDF-Based Semantics,
>   which is a conservative semantic extension of
>   RDF 2004 D-entailment and makes strong use of the
>   original datatype semantics;
>
> * SPARQL 1.1, specifically the SPARQL 1.1 Entailment Regimes,
>   which defines query results for querying on top of the
>   different RDF 2004 entailment regimes, including D entailment
>   and the also affected OWL 2 RDF-Based Semantics;
>
> * RIF, specifically the RIF RDF and OWL Compatibility spec,
>   which defines RIF-X combinations, for X being any of the
>   entailment regimes defined by the RDF 2004 Semantics
>   and also the affected OWL 2 RDF-Based Semantics.
>
> Notwithstanding the question whether the change leads to relevant
> technical consequences, there will at least be a mismatch in
> nomenclature, concepts, and formal representation. In fact, all
> listed standards above explicitly refer to the definition of
> datatype maps and use them for their own purpose.
>
> For example, the OWL 2 RDF-Based Semantics, following the
> definitions of OWL 2 in general, introduces a specific
> minimal datatype map consisting of a required set of
> IRI-datatype associations, which even include several new
> datatypes that have been introduced specifically for
> OWL 2 (and in part for RIF). The OWL 2 RDF-Based Semantics
> considers any reasoner that fully supports /at least/
> these IRI-datatype associations as a compliant
> OWL 2 RDF-Based reasoner, and allows such a reasoner
> to support /arbitrary/ additional IRI-datatype associations;
> which is, strictly speaking, in conflict with the idea
> of a globally unique set of IRI-datatype associations.
>
> In general, I do not consider the change here to be of a sort
> that would easily and naturally be implemented in future versions
> of these dependent standards. It is by far not an obvious change,
> or even only a "simplification" of the original situation.
> Rather, it affects several aspects such as basic nomenclature,
> formal representation, and even semantic assumptions about the
> form of the interpretation functions. I am even unsure whether
> all future working groups for these dependent standards will
> be willing to adopt the change made to RDF 1.1, as this would
> probably bring little value for these other standards beyond
> formal compliance with RDF 1.1, but at the expense of possibly
> breaking backwards compatibility with the original version of
> the respective standard, as in the case of the OWL 2 RDF-Based
> Semantics. So we may eventually find ourselves in a situation
> where some of the Semantic Web standards will follow the change
> taken in RDF, while others won't. This would, of course,
> be a highly unfortunate and embarrassing situation, in
> particular as the situation could easily be avoided
> by simply not applying the change to RDF in the first place.
>
> Similar consequences as for dependent standards are to be expected
> for other existing work depending on or building on top of RDF,
> such as text books on RDF or other semantic technologies,
> university courses, research papers, software, etc.
>
>
> == Conclusions and Proposal ==
>
> I have argued that the current change is a non-editorial change
> that leads to certain incompatibilities with RDF 2004
> and generally to undesirable consequences, such as
> restricted flexibility in defining custom entailment
> regimes, a potential lack of well-definedness in questions
> such as soundness and completeness for reasoning algorithms
> and tools, and even a technically flawed semantics by implicitly
> requiring any existing IRI to be interpreted as some datatype.
> This may have practical consequences for the application of
> the RDF standard, and may lead to issues for existing other
> Semantic Web standards, up to the danger of breaking compatibility
> with earlier versions of these standards, if adopted,
> or alternatively to a split situation, where some future versions
> of these standards will not adopt the change made to RDF.
>
> I have further noted that none of these problems existed
> for the original definition of datatype maps, and that no
> other technical problems of datatype maps have been
> brought up ever since from the outside to the RDF WG,
> as originally required by the WG charter, although the
> RDF specification, and particularly the notion of datatype
> maps, has been in heavy use for a decade. In fact, the
> rationale for the change was essentially to only simplify
> the original situation without any technical change.
> As I have argued, the change /is/ technical, and has
> considerable problematic consequences, while there was no
> known request in the past even for simplification - a
> point that I can well confirm as someone who has worked
> a lot with the definition of datatype maps in the past,
> including specification work, the creation of test suites,
> and formal proof work.
>
> I therefore propose to fully revert the change to the original
> notion of datatype maps, in the form in which it appears in the
> original RDF specification as of 2004. This will be a valid
> operation since, as I have argued, there was nothing really
> wrong with the original definitions. It will also be a
> preferable operation, since existing Semantic Web standards
> and other published documents will continue to be compatible
> with RDF 1.1, and their future authors will not be forced
> into a decision whether to follow the change in the RDF semantics,
> or to stick with the old definitions, where either choice may
> lead to certain compatibility issues.
>
> I expect that such a revert will be technically and editorially
> easy, as the change is, fortunately, not very strongly entangled
> with other parts of the specification, and the changes to the
> semantics of datatypes are pretty straightforward.
>
> However, I do not suggest completely abandoning the idea of the
> change. As there has been much discussion on the topic within
> the Working Group but essentially none outside of it, neither
> before the WG started nor during its active time, I consider
> it purposeful to put the change on the list of postponed issues
> to be treated by a future RDF working group. By this, the proposed
> change gets the chance to become known and discussed outside the
> Working Group, and in particular by future working groups of
> other standards that are based on the RDF Semantics. I believe
> that, given the lack of request from outside the Working Group,
> there is certainly no urgency in applying this change to RDF now.
>
>
> == About the Author ==
>
> I have been the editor of the W3C OWL 2 RDF-Based Semantics
> specification, and have been a contributor for several of
> the other core OWL 2 specification documents, including the
> OWL 2 Mapping to RDF and the OWL 2 RL/RDF Rules profile.
> I have contributed part of the W3C OWL 2 test suite with
> a focus on RDF-based reasoning, and have also created a
> much larger version of this and several other test suites
> concerning RDF semantics-based reasoning (some of them yet
> to be published). I have provided, in both private and public
> conversation, support to the editors of the SPARQL 1.1
> Entailment Regimes and the RIF RDF and OWL Compatibility
> specification on topics concerned with the RDF Semantics.
> I have worked in several international projects with strong
> focus on semantic technologies, specifically RDF. I am also
> working on an RDF reasoning system, called Swertia,
> and have provided input to the RDF 1.1 Semantics CfI
> based on this system.
>
> I am currently employed by Derivo GmbH, Germany,
> which is a small company specialized in products and
> services based on semantic technologies. Since May 2013,
> I have been permanently working for our business partner SAP,
> doing work entirely dedicated to semantic technologies,
> particularly RDF, SPARQL, and OWL. I am also currently a
> guest scientist at FZI Research Center for Technologies,
> Germany, where I worked for more
> than five years, and a doctoral candidate at the Karlsruhe Institute
> of Technology (KIT), working specifically on reasoning in
> expressive extensions of the RDF Semantics.
>
> == References ==
>
> [01] RDF 2004 Semantics <http://www.w3.org/TR/2004/REC-rdf-mt-20040210/>
> [02] RDF 1.1 Semantics PR <http://www.w3.org/TR/2014/PR-rdf11-mt-20140109/>
> [03] LCWD comment on ISSUE 165 <http://lists.w3.org/Archives/Public/public-rdf-wg/2013Oct/0221.html>
> [04] CR comment on ISSUE 165 <http://lists.w3.org/Archives/Public/public-rdf-comments/2013Dec/0027.html>
> [05] Resolution of ISSUE 165 <http://lists.w3.org/Archives/Public/public-rdf-comments/2013Dec/0107.html>
> [06] RDF WG Charter <http://www.w3.org/2011/01/rdf-wg-charter>
> [07] <http://lists.w3.org/Archives/Public/public-rdf-comments/2013Oct/0083.html>
>

----
Ivan Herman, W3C Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf
Received on Tuesday, 25 February 2014 15:26:38 UTC