RE: declaredAs

Hello,

I probably shouldn't have said "checking for typos" - this was misleading.
The point is that you somehow need to add a class, a property, etc. to an
ontology. Declarations do exactly that: they allow you to say e.g. "the
class C is a part of the ontology O1" without having to say anything else
about C.

You need this kind of a feature even if you decide to edit your ontologies
using a visual editor. When you select "create a new class" from a menu
in the editor, you'll be prompted for a class name, and then you will have
to add the class to the ontology. How do you do that? Well, you add the
declaration for the entered name. Without explicitly being able to add
classes to an ontology, you can't implement the "create a new class"
function in the editor. Thus, declarations are needed even if you just use
ontology editors to edit an ontology.

You might think that all this is unnecessary, because it already worked in
OWL 1.0; after all, we do have editors for OWL 1.0 that work properly. In
OWL 1.0, all of this existed, but it was never made explicit. Most APIs
allowed you to add a class to an ontology, but it was not clear what they
were supposed to do at the RDF level. Therefore, in OWL 1.1 we separated the
conceptual level from the RDF serialization. Now at the conceptual level
(i.e., at the level of the definition of OWL 1.1 using its structural
specification), it seems to me that it is unequivocal that you need
declarations: the "add class to an ontology" method implemented in most OWL
1.0 APIs did exactly that. Furthermore, you need a conceptual notion of a
declaration even if you use editors.




Before proceeding to the issues of serializing the structural specification
into RDF, let me just say that the structural integrity check is quite
useful. Perhaps there was no public outcry for this particular feature, but
this is because users often do not know what to complain about.
Over the years, I have received a number of related complaints.
People would often send me an ontology and say that KAON2 has a bug
because it did not produce some expected inference on their ontology. After
investigation, I would see that some axioms in their ontology referred to
completely wrong classes. Perhaps this was not due to typos, but it was
quite often due to URI resolution. URI resolution in RDF is quite brittle
(there are namespaces, XML base, and ontology URIs, and people get confused
by this), so people would think that their axiom refers to one URI, but in
reality it referred to a completely different URI. This is particularly true
in case of imports: many people find it quite difficult to manage the URIs
and URI resolution properly in such cases. In fact, I have even seen tools
that spit out wrong RDF (i.e., RDF where, if you applied the RDF
specification correctly, the URIs would get resolved differently from what
the users and tool builders thought would happen).

If we had an explicit declaration feature, we could detect all these errors
with a press of a button. URI mismatches are not exactly typos, but are
similar to typos: you use a class with a different URI than what is
intended. In order to simplify the discussion, I called them typos (which I
probably shouldn't have).
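
To make the check concrete, here is a minimal sketch in Python. It is not
KAON2 code and not part of any specification: the Ontology class, the
representation of an axiom as a tuple of URIs, and the example URIs are all
assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy model (hypothetical): declared entity URIs plus axioms over URIs."""
    uri: str
    declared: set = field(default_factory=set)
    axioms: list = field(default_factory=list)    # each axiom is a tuple of URIs
    imports: list = field(default_factory=list)   # imported Ontology objects


def visible_declarations(onto, seen=None):
    """Collect declarations from the ontology and everything it imports."""
    seen = set() if seen is None else seen
    if onto.uri in seen:           # guard against import cycles
        return set()
    seen.add(onto.uri)
    result = set(onto.declared)
    for imported in onto.imports:
        result |= visible_declarations(imported, seen)
    return result


def undeclared_references(onto):
    """Return the URIs used in axioms that no visible declaration covers."""
    declared = visible_declarations(onto)
    used = {uri for axiom in onto.axioms for uri in axiom}
    return sorted(used - declared)


base = Ontology("http://example.org/base",
                declared={"http://example.org/base#Person"})
main = Ontology(
    "http://example.org/main",
    declared={"http://example.org/main#Student"},
    axioms=[
        ("http://example.org/main#Student", "http://example.org/base#Person"),
        # URI resolved differently from what the author intended ('#' lost)
        ("http://example.org/main#Student", "http://example.org/basePerson"),
    ],
    imports=[base],
)
print(undeclared_references(main))
```

The second axiom is reported because nothing in the ontology or its imports
declares the mis-resolved URI; without declarations there is nothing to
compare the axioms against.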


To summarize, I believe that we need declarations, for at least the
following two reasons:

- We need a well-defined way to say that an entity is a part of an ontology.

- If we know which entities are intended to belong to an ontology, then we
might as well check whether all axioms reference the correct entities. In
combination with imports, this can be a really useful feature.


Please note also that both of these features are optional: you don't
have to declare anything, and you don't need to check the structural
consistency of anything. Furthermore, even if the structural consistency of
an ontology fails, you can still use the ontology as if declarations were
not there. Hence, I really do not see why this would be such a hard feature
to swallow.






The discussion about declarations in the structural specification of OWL 1.1
should be considered independently from the discussion of how declarations
are to be encoded into RDF. We should not conflate the two, because the
encoding issue is subordinate to the question of whether we need
declarations at all.

In the RDF encoding, we have the problem that (1) we need to correctly
decode the declaredness status of some entity, and (2) we need the
appropriate typing information to be able to construct the correct axioms.
Conflating the two seems like a really bad idea, because this means that,
whenever you see a triple of the form <C, rdf:type, owl:Class>, you don't
know whether this triple just provides you with typing information or
whether this also declares a class. This is particularly problematic in
combination with imports.




Finally, I understand that many people do not have sympathy with parser
writers. The problem is, however, that most people will in the end use some
parser, and if parser writers have difficulties in implementing the
specifications, the users will be stuck with parser errors. Already with OWL
1.0 it was not always the case that you could simply load a piece of OWL
generated by one tool into another tool. This was often due to the fact that
the two tools had different ideas about which typing triples have to be
present in an ontology for the ontology to be parsable.

One potential problem is caused by the fact that OWL has syntaxes other than
RDF. The DIG people want to use the XML-based syntax of OWL. As seen at
previous OWLED workshops, many people want to have syntaxes that are easier
to read, more natural-language like, etc. Now given the multitude of
syntaxes, it does make sense to make each file parsable alone. Let me define
this more precisely:

  Given a file F, you should be able to reconstruct the OWL 1.1 structural
  axioms belonging to F by just looking at the contents of F, and not at
  the contents of any other files.

Here is why this is (strongly) desired. Imagine you have an ontology O1 in
RDF/XML, an ontology O2 in the XML-based syntax, an
ontology O3 in the functional-style syntax, and an ontology O4 in a
proprietary syntax such as SOF (this format was proposed at this year's
OWLED). Imagine now that O1 imports O2, O2 imports O3, and O3 imports O4. If
you can't parse each file by itself, then parsing all this (i.e.,
reconstructing OWL 1.1 structural axioms) is a complete nightmare: you need
to orchestrate interaction between four different types of parsers. This is
going to be quite difficult to get right for anyone.


Since I actually did have users wanting to use KAON2 with different
syntaxes, and I did have to struggle with the issues of an ontology in one
syntax importing an ontology in another syntax, I thought that fixing this
problem would be for the common good. And doing so is not difficult: we just
need to have sufficient typing information included in each ontology. This
means that, by looking at the triples in the ontology O1 from the above
example, I should be able to find out the type of each URI contained in it;
I should not look into O2, O3, and so on to disambiguate the type.
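
As a rough illustration of this requirement, the following Python sketch
looks at the triples of a single file and lists the URIs for which that file
alone gives no rdf:type. The triple representation (tuples of prefixed
names), the function name, and the list of built-in prefixes are assumptions
for the example; literals and blank nodes are ignored.

```python
RDF_TYPE = "rdf:type"


def untyped_uris(triples, builtin_prefixes=("rdf:", "rdfs:", "owl:", "xsd:")):
    """Return URIs mentioned in `triples` that have no rdf:type triple there.

    Only the given triples are inspected; imported files are never consulted.
    """
    typed = {s for (s, p, o) in triples if p == RDF_TYPE}
    mentioned = set()
    for s, p, o in triples:
        mentioned.update((s, p, o))
    return sorted(
        uri for uri in mentioned - typed
        if not uri.startswith(builtin_prefixes)   # skip built-in vocabulary
    )


# O1 uses other:Person but relies on an imported file to say what it is.
o1 = [
    ("ex:Student", "rdf:type", "owl:Class"),
    ("ex:Student", "rdfs:subClassOf", "other:Person"),
]
print(untyped_uris(o1))

# Repeating the typing triple locally makes O1 parsable by itself.
o1_self_contained = o1 + [("other:Person", "rdf:type", "owl:Class")]
print(untyped_uris(o1_self_contained))
```

The first call reports other:Person and the second reports nothing: O1 can
then be parsed without loading O2, O3, or O4, whatever syntax they are in.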

But if I add rdf:type statements to each ontology, then I can't use rdf:type
for declarations.



Moreover, most users will never see the declarations anyway if they use a
visual editor to edit an ontology. The editor would work like this:

- When someone selects "create new class", then the editor would add a
declaration axiom for the class to the ontology.

- When the ontology is being saved to a file, the editor would write out all
proper typing information, and would also write out declarations separately.

Hence, the users would not be overloaded with this. The only difference they
would see is that they would have fewer broken ontologies.
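
The two editor steps above could be sketched roughly as follows. This is a
hypothetical toy in Python, not the API of any existing editor; the class
name, the method names, and the record format returned by save() are all
invented for illustration and are not an RDF encoding.

```python
class EditorOntology:
    """Hypothetical in-memory ontology as a visual editor might keep it."""

    def __init__(self):
        self.declared_classes = set()
        self.subclass_axioms = []     # (subclass URI, superclass URI) pairs

    def create_class(self, uri):
        """'Create new class': add a declaration and nothing else."""
        self.declared_classes.add(uri)

    def add_subclass_axiom(self, sub, sup):
        self.subclass_axioms.append((sub, sup))

    def save(self):
        """Emit typing records for every class used, then declarations."""
        used = {uri for axiom in self.subclass_axioms for uri in axiom}
        typing = [("type", uri, "Class")
                  for uri in sorted(used | self.declared_classes)]
        declarations = [("declaration", uri)
                        for uri in sorted(self.declared_classes)]
        return typing + declarations


editor = EditorOntology()
editor.create_class("ex:Student")                        # declared here
editor.add_subclass_axiom("ex:Student", "other:Person")  # other:Person only used
for record in editor.save():
    print(record)
```

Here other:Person receives a typing record, so the file can be parsed alone,
but only ex:Student receives a declaration, because only that class was
created in this ontology.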

Sincerely yours,

	Boris Motik

Received on Sunday, 5 August 2007 08:12:56 UTC