- From: Boris Motik <bmotik@cs.man.ac.uk>
- Date: Sun, 5 Aug 2007 10:12:06 +0200
- To: "'Alan Ruttenberg'" <alanruttenberg@gmail.com>
- Cc: <public-owl-dev@w3.org>
Hello,

I probably shouldn't have said "checking for typos"; this was misleading. The point is that you somehow need to add a class, a property, etc. to an ontology. Declarations do exactly that: they allow you to say, e.g., "the class C is a part of the ontology O1" without having to say anything else about C.

You need this kind of feature even if you decide to edit your ontologies using a visual editor. When you select "create a new class" from a menu in the editor, you'll be prompted for a class name, and then you will have to add the class to the ontology. How do you do that? Well, you add the declaration for the entered name. Without being able to add classes to an ontology explicitly, you can't implement the "create a new class" function in the editor. Thus, declarations are needed even if you just use ontology editors to edit an ontology.

You might think that all this is unnecessary, because it already worked in OWL 1.0; after all, we do have editors for OWL 1.0 that work properly. In OWL 1.0, all of this existed, but it was never made explicit. Most APIs allowed you to add a class to an ontology, but it was not clear what they were supposed to do at the RDF level. Therefore, in OWL 1.1 we separated the conceptual level from the RDF serialization.

Now at the conceptual level (i.e., at the level of the definition of OWL 1.1 using its structural specification), it seems to me unequivocal that you need declarations: the "add class to an ontology" method implemented in most OWL 1.0 APIs did exactly that. Furthermore, you need a conceptual notion of a declaration even if you use editors.

Before proceeding to the issues of serializing the structural specification into RDF, let me just say that the structural integrity check is quite useful. Perhaps there was no public outcry for this particular feature, but this is because users often do not know what to complain about. Over the years, I have received a number of related complaints.
People would often send me an ontology and say that KAON2 has a bug because it did not produce some expected inference on their ontology. After investigation, I would see that some axioms in their ontology referred to completely wrong classes. Perhaps this was not due to typos, but it was quite often due to URI resolution. URI resolution in RDF is quite brittle (there are namespaces, XML base, and ontology URIs, and people get confused by this), so people would think that their axiom refers to one URI, but in reality it referred to a completely different URI. This is particularly true in the case of imports: many people find it quite difficult to manage the URIs and URI resolution properly in such cases. In fact, I have even seen tools that spit out wrong RDF (i.e., RDF where, if you applied the RDF specification correctly, the URIs would get resolved differently from what the users and tool builders thought would happen).

If we had an explicit declaration feature, we could detect all these errors with the press of a button. URI mismatches are not exactly typos, but they are similar to typos: you use a class with a different URI than what is intended. In order to simplify the discussion, I called them typos (which I probably shouldn't have).

To summarize, I believe that we need declarations, for at least the following two reasons:

- We need a well-defined way to say that an entity is a part of an ontology.
- If we know which entities are intended to belong to an ontology, then we might as well check whether all axioms reference the correct entities. In combination with imports, this can be a really useful feature.

Please note also that both of these features are optional: you don't have to declare anything, and you don't need to check the structural consistency of anything. Furthermore, even if the structural consistency check of an ontology fails, you can still use the ontology as if declarations were not there.
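
To make the check concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: the Ontology class, its fields, the function names, and the example.org URIs are all made up for this example and do not correspond to the OWL 1.1 structural specification or to the API of any tool; axioms are reduced to plain pairs of class URIs.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy model (hypothetical): declared class URIs, axioms, and imports."""
    uri: str
    declared: set = field(default_factory=set)   # URIs of declared classes
    axioms: list = field(default_factory=list)   # (subclass URI, superclass URI) pairs
    imports: list = field(default_factory=list)  # directly imported Ontology objects


def visible_declarations(onto, seen=None):
    """Collect declarations from the ontology and everything it imports."""
    seen = set() if seen is None else seen
    if onto.uri in seen:  # guard against import cycles
        return set()
    seen.add(onto.uri)
    result = set(onto.declared)
    for imported in onto.imports:
        result |= visible_declarations(imported, seen)
    return result


def undeclared_references(onto):
    """Return the URIs used in axioms that no visible declaration covers."""
    known = visible_declarations(onto)
    used = {uri for axiom in onto.axioms for uri in axiom}
    return sorted(used - known)


# Example: the axiom was meant to refer to base#Animal, but URI resolution
# produced pets#Animal instead.
base = Ontology("http://example.org/base",
                declared={"http://example.org/base#Animal"})
pets = Ontology(
    "http://example.org/pets",
    declared={"http://example.org/pets#Dog"},
    axioms=[("http://example.org/pets#Dog", "http://example.org/pets#Animal")],
    imports=[base],
)
print(undeclared_references(pets))  # ['http://example.org/pets#Animal']
```

The check only reports mismatches; nothing in it prevents the ontology from being used as is, which matches the optional nature of the feature.
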
Hence, I really do not see why this would be such a hard feature to swallow.

The discussion about declarations in the structural specification of OWL 1.1 should be considered independently from the discussion of how declarations are to be encoded into RDF. We should not conflate the two, because the encoding issue is subordinate to the issue of whether we need declarations or not.

In the RDF encoding, we have the problem that (1) we need to correctly decode the declaredness status of some entity, and (2) we need the appropriate typing information to be able to construct the correct axioms. Conflating the two seems like a really bad idea, because this means that, whenever you see a triple of the form <C, rdf:type, owl:Class>, you don't know whether this triple just provides you with typing information or whether it also declares a class. This is particularly problematic in combination with imports.

Finally, I understand that many people do not have sympathy for parser writers. The problem is, however, that most people will in the end use some parser, and if parser writers have difficulties implementing the specifications, the users will be stuck with parser errors. Already with OWL 1.0 it was not always the case that you could simply load a piece of OWL generated by one tool into another tool. This was often due to the fact that the two tools had different ideas about which typing triples have to be present in an ontology for the ontology to be parsable.

One potential problem is caused by the fact that OWL has syntaxes other than RDF. The DIG people want to use the XML-based syntax of OWL. As seen at previous OWLED workshops, many people want to have syntaxes that are easier to read, more natural-language like, etc. Now given the multitude of syntaxes, it does make sense to make each file parsable alone.
Let me define this more precisely: given a file F, you should be able to reconstruct the OWL 1.1 structural axioms belonging to F by just looking at the contents of F, and not at the contents of any other files.

Here is why this is (strongly) desired. Imagine you have an ontology O1 in RDF/XML, an ontology O2 in the XML-based syntax, an ontology O3 in the functional-style syntax, and an ontology O4 in a proprietary syntax such as SOF (this format was proposed at this year's OWLED). Imagine now that O1 imports O2, O2 imports O3, and O3 imports O4. If you can't parse each file by itself, then parsing all this (i.e., reconstructing OWL 1.1 structural axioms) is a complete nightmare: you need to orchestrate interaction between four different types of parsers. This is going to be quite difficult for anyone to get right.

Since I actually did have users wanting to use KAON2 with different syntaxes, and I did have to struggle with the issues of an ontology in one syntax importing an ontology in another syntax, I thought that fixing this problem would be for the common good. And doing so is not difficult: we just need to have sufficient typing information included in each ontology. This means that, by looking at the triples in the ontology O1 from the above example, I should be able to find out the type of each URI contained in it; I should not have to look into O2, O3, and so on to disambiguate the type. But if I add rdf:type statements to each ontology, then I can't use rdf:type for declarations.

Finally, most users will never see the declarations anyway if they use a visual editor to edit an ontology. The editor would work like this:

- When someone selects "create new class", the editor would add a declaration axiom for the class to the ontology.
- When the ontology is being saved to a file, the editor would write out all proper typing information, and would also write out declarations separately.

Hence, the users would not be overloaded with this.
The only difference they would see is that they would have fewer broken ontologies.

Sincerely yours,

Boris Motik
Received on Sunday, 5 August 2007 08:12:56 UTC