A proposal for imports in OWL 1.1

Hello,

At the F2F I was assigned ACTION-41 -- that is, I was supposed to propose a framework for imports in OWL 1.1. Well, here it is.
Bijan, I believe you were supposed to put this somewhere on the Web page, right?



Solution summary
----------------

If an ontology O imports an ontology O', then importation should be *by ontology URI*; the ontology O should contain the URI of the
ontology O' and this can be different from the physical location of O'. Furthermore, the OWL 1.1 specification should provide common
ways of resolving ontology URIs to physical URIs.


(Nomenclature convention: In the OWL 1.1 Structural Specification, the term "ontology URI" is used to mean "ontology name". Hence, I
shall use the term "ontology URI" in the rest of this e-mail.)


Why not imports by location?
----------------------------

At the F2F, some people suggested that imports should be *by location*: if O imports O', then O should contain the location of O'.
Furthermore, there was some discussion about whether the location of the imported ontology should be the same as the ontology URI of
the imported ontology.

I believe that such a system is not particularly suited to typical scenarios in which OWL is used. I briefly list some of the
problems that commonly arise from such a definition.

1. Whereas finished ontologies may indeed be published on the Web at a location that is identical to the ontology URI, in order for
someone to use the ontology, the ontology has to be copied locally. This invariably makes the physical location of the copy different
from its ontology URI.

2. Some people argue that the problems described under 1 can be thought of as caching. I agree with this view for ontologies that
originally exist somewhere on the Web and are copied locally for reasoning. However, ontologies will often be developed locally and
published to the Web only after they are finished. While the ontology is being edited, its ontology URI is unlikely to be
the same as its physical URI.

3. There is a more general question of whether the ontology URI and the physical location need to be the same. Imagine a user who
starts Protégé and clicks on "New Ontology". Protégé might then ask the user to choose a directory where the ontology is to be
stored. If the ontology URI is to be equal to the physical location, then the ontology would be assigned a URI such as
file:/C:/Temp/ontology.owl. It is natural to use the ontology URI to generate URIs of ontology entities; hence, as the user adds
entities to the ontology, these entities will be called, e.g., file:/C:/Temp/ontology.owl#Person. This is undesirable, since entity
names then depend on where the file happens to be stored on one machine.

4. People often move files on their computer. If O were to import O' by location, then moving the ontology files breaks the imports.
You then need to open the ontology in a text editor to fix the problem. Furthermore, if the ontology URI must be the same as the
physical URI, then moving the ontology on your computer invalidates the ontology, unless you rename the ontology
accordingly.

5. Consider ontology repositories, such as Swoogle: there, the ontology URI is again unlikely to be the same as the physical
location of an ontology.


To summarize, the physical URI and the ontology URI differ for most of the time that people are actually working with their
ontologies. Therefore, the OWL 1.1 specification should take this into account and provide some direction and guidance to
implementations about how to handle these situations correctly. I'm fine with viewing this as caching; however, I then believe we
should standardize some common caching mechanisms across tools.

I would also like to mention that XML Schema -- a Web and W3C standard -- has correctly identified this problem, so the XML Schema
standard takes these distinctions into account. Some people suggested at the F2F that this also holds for WSDL (I don't know the
details of WSDL myself, so I can only repeat here what others have said).




The solution in more detail
---------------------------

If an ontology O imports an ontology O', then O should contain the ontology URI of O'. Furthermore, at any given point in time, the
ontology and physical URIs of either of these two ontologies are allowed to be different. Each OWL 1.1 implementation is required to
provide an oracle for mapping ontology to physical URIs.

This is roughly what the current OWL 1.1 draft says. I do agree, however, that this solution has an important drawback: there is no
oracle that would be generally supported across implementations. This clearly hampers interoperability. Hence, in the rest of this
e-mail, I shall describe a couple of oracles that might be made normative.


Oracle 1: File-based resolver
+++++++++++++++++++++++++++++

We could require each implementation to support at least the default oracle, which takes as input a file containing pairs of
ontology and physical URIs. This file should have a trivial textual format, such as

<ontology URI><TAB><physical URI><CR/LF>

Alternatively, we might create a simple ontology whose instances would provide (ontology URI => physical URI) mappings.
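
To illustrate how little the tab-separated format above would demand of an implementation, here is a minimal Python sketch of a
reader for it. The function name parse_mapping, the skipping of blank lines, and the rule that a later entry overrides an earlier one
are assumptions made for this example; none of them is part of the proposal.

```python
from typing import Dict


def parse_mapping(text: str) -> Dict[str, str]:
    """Parse lines of the form '<ontology URI><TAB><physical URI>' into a dict."""
    mapping: Dict[str, str] = {}
    # splitlines() accepts CR/LF as well as bare LF line endings.
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected exactly two tab-separated URIs"
            )
        ontology_uri, physical_uri = (f.strip() for f in fields)
        if not ontology_uri or not physical_uri:
            raise ValueError(f"line {line_number}: empty URI")
        # Assumption: a later entry for the same ontology URI wins.
        mapping[ontology_uri] = physical_uri
    return mapping
```

The resulting dictionary can serve directly as the oracle: looking up an ontology URI yields a physical URI, or nothing if the file
has no entry for it.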



Oracle 2: Physical location hints a la XML Schema
+++++++++++++++++++++++++++++++++++++++++++++++++

In XML Schema, imports are by schema name; however, in order to aid an implementation in locating the imported schema, the importing
schema can contain location hints. For example, you can write in the importing schema

<import namespace="http://www.w3.org/1999/xhtml" schemaLocation="file:/c:/Temp/schema.xsd"/>


We might provide a solution that works in a similar vein. We could change the OWL 1.1 structural specification and say that an
ontology contains zero or more *importation records*, each of which consists of exactly one ontology URI and a list of zero or more
physical URIs. Such records could be serialized into RDF as follows:

<O owl11:importationRecord _:x>
<_:x owl11:ontologyURI "the ontology URI of the imported ontology">
<_:x owl11:physicalURI "one of the physical URI hints">    <= repeated for each physical URI
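
As a sketch of how an importation record might look inside a tool, the following Python fragment models one record and produces the
triples outlined above. The class name ImportationRecord, the function to_triples, and the representation of triples as plain string
tuples are assumptions made for illustration; only the three owl11: property names are taken from the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class ImportationRecord:
    """One ontology URI plus zero or more physical URI hints, in order."""
    ontology_uri: str
    physical_uris: List[str] = field(default_factory=list)


def to_triples(importing_ontology: str, record: ImportationRecord,
               node: str = "_:x") -> List[Triple]:
    """Serialize a record as (subject, predicate, object) string triples."""
    triples: List[Triple] = [
        (importing_ontology, "owl11:importationRecord", node),
        (node, "owl11:ontologyURI", record.ontology_uri),
    ]
    # One owl11:physicalURI triple per hint, keeping the order of the list.
    triples.extend(
        (node, "owl11:physicalURI", uri) for uri in record.physical_uris
    )
    return triples
```

A record with no hints is still valid here and yields only the first two triples.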



The final ingredient: URI resolution strategy
---------------------------------------------

Assume that an ontology URI O' is imported in an ontology O, and that we need to locate O' physically. An implementation should then
do the following:

1. An implementation should first try each physical URI specified in the importation record for O' in O. The physical URIs should be
tried in the order specified in the importation record. As soon as an ontology is found at some physical URI, the algorithm
terminates.

2. An implementation should then try application-specific oracles for resolving the ontology URI to a physical URI. An application
is required to support at least the file-based oracle.

3. If all this fails, the implementation should look for O' at the physical URI equal to O' -- that is, if everything else fails,
the application should assume that the ontology URI is equal to the physical URI.

Regardless of how the ontology is found, the ontology URI specified in the ontology file should be exactly the same as O' -- that
is, it should be illegal to resolve O' to some ontology file which contains some other ontology URI.



Finishing notes
---------------

Note that step 3 above essentially allows you to have your cake and eat it: if the ontology and the physical URIs are the same,
then imports are "by name". I believe, however, that it is important to alert OWL 1.1 implementors to the fact that the ontology and
the physical URIs are usually not the same and that they have to think about how to handle this in practice. The main drawbacks of
OWL 1.0 implementations arose because the specification did not contain any information about this issue, so some developers
just didn't think about it.


Regards,

	Boris

Received on Saturday, 15 December 2007 17:57:01 UTC