TopQuadrant Response to the OWL2 Last Call of 02 December 2008

Summary

TopQuadrant believes that the current working drafts of the OWL2 specifications would, if advanced to Recommendation, be detrimental to our business and to our customers' use of the Semantic Web Recommendations.

Introduction

TopQuadrant is a small company offering both products and services. All of our revenues depend on the successful deployment of Semantic Web technologies. We are profitable.

We have a range of concerns about the OWL2 specifications. Their technical merit is not one of them. Thus, our concerns are largely not suited to being Last Call comments on the design, and are better expressed as comments on the New Features and Rationale document. We make one formal procedural comment against the last call, merely to connect this response to the last call and to the W3C process.

The main thrust of our concerns is that we find the motivations for changes to the OWL Recommendation to be weak or non-existent, and to be limited in scope to what we believe to be a narrow section of the Semantic Web marketplace. Thus, while a few consortium members may benefit from the new specifications, TopQuadrant and, we believe, several other consortium members engaged in the Semantic Web, as well as the wider public using the Semantic Web Recommendations, would be done a disservice if the consortium were to advance them to Recommendation.

TopQuadrant is committed to the consensus process of the consortium, and will not press these concerns without support from other members. However, anecdotal evidence, e.g. here (todo: OWL 2 Far) and here (todo: SWIG split msg), indicates that others in the industry share our perspective.

Last Call Comment

TopQuadrant's single last call comment on all of the technical OWL2 specification documents is a procedural comment.

We are disappointed that the Working Group delivered the first working draft of the requirements document listed in the WG charter at the same time as the last call working drafts of the technical specifications, rather than sufficiently earlier to allow our comments to appropriately influence the on-going design work of the WG. Hence, we ask the WG to formally address our comments on the New Features and Rationale working draft as part of the last call process for the technical specifications.

Comments on New Features and Rationale working draft

These are comments on: New Features and Rationale of 02 December 2008.

Main comment: OWL Too?

The rationale document (and the design) have not taken into account the cost of new features, particularly to those who do not need them. These costs include implementation costs, training costs, documentation costs, and simply the cost of ignoring something (it needs to be understood before it can be ignored). This is seen anecdotally in the ease with which people slip into thinking: "OWL Too Far", "OWL Too Full", "OWL Too Much".

An example use case illustrating such costs is as follows:

Izzie, Joseph, Kevin, Lucy and Makato are building a set of ontologies and semantically enhanced applications in the area of dietary planning. They are using a collection of tools, each of which supports some of the Semantic Web Recommendations. They have some information sources that they have prepared in-house, and they are also integrating several Web-based information sources. Most of these (Diet.example.org, Energy.example.org, Food.example.org) are available in RDF; some of those (Diet.example.org, Energy.example.org) use OWL1 features and are available in RDF/XML. One (Comestibles.example.org) is only available in the Manchester OWL Syntax, and one (Beverages.example.org) is only available in application/owl+xml. Both of these last two use OWL2 features, but have some useful information in their OWL1 subset.
Izzie is the team lead and has a good understanding of the full range of the Semantic Web Recommendations. Kevin is a modelling expert with a background in knowledge representation; Lucy understands inference well; Makato is a SPARQL wizard. Joseph is fresh out of school. (The economy has picked up, and Izzie was slightly disappointed with the limited choice of candidates for the Junior Ontologist Trainee role.)
Izzie makes the design decision that OWL2 features are not needed for this project, and that the advantage of using several OWL or RDF applications that do not support OWL2 features is the critical factor. She sends Joseph on a Semantic Web training course, while the rest of the team get started on the project.
Most of their tools cannot read the Manchester Syntax or OWL/XML, so they use a Jena extension to convert Beverages.example.org and Comestibles.example.org to RDF/XML, and they keep the copies locally.
They experience a variety of problems related to OWL2, including:
  1. Various OWL2 constructs (from Beverages.example.org, Comestibles.example.org) are meaningless in OWL1, and the OWL1 implications are lost.
  2. Joseph's training was 1-day RDF, 1-day OWL1, 1-day SPARQL, half-day RIF, half-day OWL2. He got somewhat confused! He sees the OWL2 constructs being used in some of the data sources, and when mapping some proprietary data into OWL he uses the same constructs. Despite the fact that he used the OWL2 constructs correctly, the overall system does not take his modelling into account, since several of the components are not OWL2 aware. Izzie has to take him to one side, and spell out the decision to use OWL1 only and what that means.
  3. One of their tools does understand the Manchester Syntax and OWL/XML; this ends up going to the web and retrieving the latest version of Beverages.example.org and Comestibles.example.org. A couple of months into the project, these ontologies are updated on the web, and the local copies (in RDF/XML) are not. A few days later, unexpected system behaviour is observed, which is eventually tracked down to the version mismatch.
  4. A week or two after this, a team meeting gets completely derailed when Kevin pipes up in advocacy of supporting various OWL2 features because he wants the extra expressive power; Lucy gets quite irate at having to explain, all over again, her estimates of the runtime cost in the inference engines for the additional expressive power; while Makato tries to drill down with Kevin on precisely what is wrong with the SPARQL queries that the team have been using to fill in gaps where the expressive power of OWL1 is less than is required for their application. After a fairly unproductive 90 minutes, Izzie calls time, and reminds the team of the earlier design decision to use OWL1. Joseph has a headache.

While these problems are largely to be expected in an ontology development project, and can be addressed by a range of techniques such as improved project management, cache control, version management, and aspirins, we believe that OWL2 introduces many additional places where such problems might arise, and we do not see adequate consideration of these costs in the design.

We ask that many under-motivated new features be dropped, including all unmotivated ones.

An alternative, possibly better approach to addressing this comment, might be to rebrand most, if not all, of the new features of OWL2, as "Web-SROIQ", and put them in a separate namespace, not branded as OWL, so that the (vast) majority of Semantic Web users for whom these features are neither useful nor helpful, but merely confusing, can rest more easily in ignoring them. Notice the choice of name for the rebranding does not include the string "OWL", and reflects the real motivations for the new features (i.e. the academics have worked out some interesting extensions to SHOIN).

Other comments

Danger of bias
We believe that at all stages in the development of the OWL2 specification the interests of the DL community, such as tableau reasoner vendors and that part of the academic community concerned with tableau reasoner design, have been over-represented, and the interests of the wider Semantic Web community (particularly RDF users) have been under-represented. We trust that during periods of public review, and during the call for implementations, the WG will be mindful of the need to see active support from the wider Semantic Web community, and will not be satisfied with passive acceptance.
RDF interoperability
Since almost all of TopQuadrant's business uses both RDF and OWL together, the implicit requirement in OWL 1.0 that OWL and RDF should work well together remains a critical requirement for OWL2. We do not see this listed as a requirement, and believe that several of the new features added are in practice in conflict with it.
effective?
In the abstract we are highly unconvinced by the scoping to effective reasoning algorithms. To the expert reader this appears to be a highly technical sense of the word effective, which is likely to confuse the more general reader. Our customers are interested in software that returns results in reasonable time: typically within a few seconds, but maybe with some offline computation of a few hours. Effective algorithms can take an unbounded length of time, and ineffective algorithms can be quick in most cases of interest. Thus the scoping to effective algorithms is simply a religious allegiance and makes no business sense.
OWLED
In the overview, a key part of the rationale is expressed as: "as part of the OWLED Workshop Series". This is slightly disingenuous. In fact, it was only the first two meetings (Nov 2005, Nov 2006) that played into the member submission and had significant impact on the design. We believe, perhaps incorrectly, that the ordering "real applications, user and tool developer experience" is misleading, and that the drivers came more from the tool developers, with users and applications being a post hoc rationale for new features rather than true motivations. We note that the attendance and presenters at these meetings seem, from our perspective, to under-represent the many OWL users who use mainly RDF with a little bit of OWL, etc.
Manchester Syntax
In section 2 Features & Rationale, the Manchester Syntax is not mentioned or justified as a new feature. Since this introduces additional costs, and is apparently unmotivated, we suggest it is dropped.
OWL/XML
In section 2 Features & Rationale, the OWL/XML syntax is not mentioned or justified as a new feature. Since this introduces additional costs, and is apparently unmotivated, we suggest it is dropped.
Links to Wiki should be links to TRs
The syntax and semantics links for features discussed in section 2 should link to the TRs and not to the Wiki.
Syntax examples should include RDF
The lack of RDF triples in the syntax examples hindered our review effort.
DisjointUnion subPropertyOf UnionOf
The new features introduced as syntactic sugar introduce dangers of interoperability failure between OWL1 and OWL2. Some simple steps should be taken to reduce such risks, such as adding the assertion that owl:disjointUnionOf rdfs:subPropertyOf owl:unionOf.
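The effect of the suggested assertion can be illustrated with a minimal sketch. It assumes a toy representation of RDF triples as Python tuples of prefixed-name strings, a single pass of the standard RDFS subproperty rule, and a hypothetical class ex:Food with a hypothetical list node _:list1; none of these names come from the OWL2 drafts.

```python
RDFS_SUBPROPERTY = "rdfs:subPropertyOf"

def apply_subproperty_rule(triples):
    """One pass of the RDFS rule: (p subPropertyOf q), (s p o) => (s q o)."""
    # Collect every (sub-property, super-property) pair stated in the graph.
    sub_props = {(s, o) for (s, p, o) in triples if p == RDFS_SUBPROPERTY}
    inferred = set(triples)
    for (s, p, o) in triples:
        for (sub, sup) in sub_props:
            if p == sub:
                inferred.add((s, sup, o))
    return inferred

# Hypothetical graph: the suggested assertion plus one OWL2 statement.
graph = {
    ("owl:disjointUnionOf", RDFS_SUBPROPERTY, "owl:unionOf"),
    ("ex:Food", "owl:disjointUnionOf", "_:list1"),
}
closed = apply_subproperty_rule(graph)
```

Under these assumptions, an RDFS-aware OWL1 tool would derive the owl:unionOf triple for ex:Food. The disjointness is still lost, but the union itself survives.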
DisjointUnion and DisjointClasses
Being syntactic sugar, these new primitives are strictly speaking unnecessary. Their form in RDF triples is very different from the equivalent disjointWith statements, and they are significantly harder to process for OWL implementations that work natively over RDF, rather than by first translating into OWL axioms. It seems unlikely that many RDF-based OWL implementations (OWL Full implementations) will correctly implement these constructs. Hence these constructs are likely to lead to interoperability failure between OWL Full and OWL DL systems. The costs of such failure are much higher than the costs of requiring users who need such constructs to use the somewhat funky styles required by OWL1. These features should be dropped.
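For comparison, the OWL1 style of stating that several classes are mutually disjoint can be sketched as follows. This is an illustration only: triples are modelled as Python tuples of prefixed-name strings, and the class names are hypothetical.

```python
from itertools import combinations

def pairwise_disjoint_with(classes):
    """Expand a set of mutually disjoint classes into one
    owl:disjointWith triple per unordered pair (the OWL1 style)."""
    return [(a, "owl:disjointWith", b) for a, b in combinations(classes, 2)]

# Hypothetical class names, for illustration only.
triples = pairwise_disjoint_with(
    ["ex:Fruit", "ex:Vegetable", "ex:Grain", "ex:Dairy"]
)
```

For n classes this produces n(n-1)/2 triples, which is verbose, but each triple is an ordinary statement that an RDF-native implementation can process without reassembling a list.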
Negative*PropertyAssertion
These features are highly problematic for RDF interoperability. While, technically, from an OWL DL implementation perspective, these are merely macros for membership of complements of hasValue restrictions, the promotion of this from an esoteric construct to a first class axiom changes their implicit status. RDF systems are simply not geared up to support negative triples as well as positive ones. The negative assertion of the reified triple, while technically faultless, is a practical disaster in terms of setting user expectations that RDF-based OWL tool vendors are unlikely to meet. For the Semantic Web community as a whole, interoperability between RDF and OWL is much more valuable than ease of use of an advanced OWL construct. The WG has made a bad design choice by including these.
SelfRestriction and the Schneider variant of the Patel-Schneider paradox
As the WG is well aware, the self-restriction increases expressive power in ways that introduce further paradoxes with OWL Full semantics. While these are addressed in the WG's technical work, the use cases motivating the new construct fall well short of what we would expect as needed for motivating risky theoretical changes.
QCRs
We believe these to be a useful addition to OWL. Several TopQuadrant customers would use this feature.
reflexive, irreflexive, asymmetric and disjoint properties
Our general comment concerning failure to consider the cost of change applies to these features.
Property chain inclusion axioms
These appear to be quite widely useful. We have some concerns with the use of blank nodes in the subPropertyOf triple corresponding to a RIA. These are likely to cause problems for RDF implementations which expect all properties to be URI nodes. We think that drilling down and fixing all instances where RDF software makes this assumption is costly and unlikely to happen, and that the result is likely to be incompatibility between OWL2 and RDF. We believe that introducing a new property in the RDF mapping for RIA, and avoiding the use of subPropertyOf, is probably a better trade-off here.
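The kind of check an RDF tool would need can be sketched as below. This is a minimal illustration, assuming that triples are modelled as Python tuples, that blank nodes are written with the conventional "_:" prefix, and that the blank node appears in the subject position of the subPropertyOf triple; the property names are hypothetical.

```python
def blank_node_subproperties(triples):
    """Return the rdfs:subPropertyOf triples whose subject is a blank node."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if p == "rdfs:subPropertyOf" and s.startswith("_:")
    ]

# Hypothetical graph: one ordinary subproperty statement, and one
# with a blank node standing in for a property chain.
graph = [
    ("ex:hasParent", "rdfs:subPropertyOf", "ex:hasAncestor"),
    ("_:chain1", "rdfs:subPropertyOf", "ex:hasGrandparent"),
]
flagged = blank_node_subproperties(graph)
```

Software that treats every subject of subPropertyOf as a named property would need such a guard wherever it makes that assumption, which is the cost we are concerned about.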
EasyKeys
explicitly no opinion
unary datatype
Many TopQuadrant customers require these features. These make sense as part of the main Semantic Web Recommendations.
N-ary datatype
This feature appears to have been dropped, and to be in this document by accident; if not, we would like to comment.
Punning
We have mixed opinions ... and make no formal comment, but we are uneasy with this change.
Annotations
In general, improvements in the expressiveness of the OWL1 annotation system are to be welcomed. However, the detail of the solution in OWL2 is worrying. The use of reification for mapping some of the axioms is suspect. The change from RDF reification to OWL2 reification is unmotivated. The underlying problems with reification are not addressed by renaming. A practical worry for a Semantic Web editor (supporting both RDF and OWL), as opposed to an OWL editor or an RDF editor, is that maintaining the link between the reified triple and the triple itself is another indirection point that must be considered in many places. Thus, the implementation of extended annotations within the current OWL2 design will likely lead to further bifurcation of OWL from RDF, with people either using OWL tools (that support such features) or RDF tools (that do not, and are likely to interoperate poorly with the OWL tools, e.g. by renaming a triple without considering the impact on the reified triple). In some of our consultancy engagements we have been applying annotations to any convenient blank node, e.g. the blank node of a restriction: we believe this is a better compromise between the need to annotate ontologies and the need for RDF interop.
Profiles
We are supportive.
Appendix
We are concerned about the relative importance of HCLS as an application domain in motivating the design of OWL2. TODO expand, and report on TQ's HCLS customer reqs.
Tables - dependency on HCLS
Without the HCLS use cases, (UC#1, UC#2, UC#3, UC#5, UC#, UC#8, UC#9), the first table seems much closer to our experience of the problems with OWL1 - not a huge number, not a great need for new constructs. It seems mistaken to make widespread change to a generic technology in order to satisfy specific requirements from one application domain.
UC#11, UC#10
Presumably are legacy and need deleting.

Comments on Manchester Syntax

Document reviewed: Manchester Syntax

Typo: TopBraid Composer
Please capitalize correctly on all uses (TBC all in caps).
Trademark
Please acknowledge our trademark in TopBraid Composer.
Error in Appendix
TopBraid Composer does not support, and is not intended to support, Manchester Syntax as an I/O format, and so should not be referenced in the IETF application. The earlier statement correctly expresses the intent: TopBraid Composer uses Manchester Syntax for displaying and entering descriptions associated with classes.

Comments on datatypes in Structural Specification

These comments are on Structural Specification and Functional-Style Syntax; and concern the datatype mapping section, and the features at risk.

Lack of rationale and motivation
The changes in the approach to XSD datatypes (such as the introduction of owl:real, and the redefinition of the value spaces of the XSD built-in datatypes) are not motivated in the Rationale document. Since these changes risk interoperability failure, this seems like a significant oversight.
owl:real
A possible motivation for owl:real is to allow a property which can be used with any numeric datatype. TopQuadrant customers have such use cases, when merging data from various sources: however, the use of a simple XSD union datatype is an alternative solution, which we prefer.
owl:real
A possible motivation for use of owl:real is to permit integration of numeric reasoning services with ontological reasoning services. While this may be useful for some Semantic Web applications, we do not find this to be useful for our business. We do find it critical that numbers in Semantic Web applications interoperate with numbers in databases, and with numbers in programming languages. We hence suspect that this proposed change to the semantics of datatypes in OWL is a further example of a clean theoretical solution that does not make practical business sense. We suggest that the value spaces of the XSD datatypes should remain unchanged from OWL1.
owl:rational
There is a typo (owl:datetime) in the description of this datatype. Otherwise, we explicitly have no comment.
rdf:XMLLiteral
While we have no comment at this time, please let us know if the WG is minded to drop support for this datatype; we may wish to speak up in its defence.

Carroll's Personal Comment on OWL 2 Full Semantics

Procedural point: I am listed as a contributor, but I believe that the patent policy has not required me, HP or TopQuadrant to make a declaration (explicitly or implicitly) concerning the unencumbered status of that contribution. I am not personally aware of any patents connected with my contribution to these semantics, and I am happy to participate individually in the patent policy for this document; I believe that TopQuadrant would not have a problem in making a similar corporate statement. I can no longer speak for HP.