- From: Timothy Redmond <tredmond@stanford.edu>
- Date: Sat, 1 Dec 2007 15:46:50 -0800
- To: public-owl-dev@w3.org
- Cc: Chris Mungall <cjm@fruitfly.org>, Stephen Larson <slarson@ncmir.ucsd.edu>, Bill Bug <wbug@ncmir.ucsd.edu>
I hope that this is an appropriate and acceptable message. For some
time now some tool developers and users have been having increasing
difficulty with the owl import declaration. I have a proposal but I
would really like some advice on any misconceptions I might have,
whether this is a good approach and how the OWL 1.1 standard will
address this issue.
The Problem:
Somewhere on the web, there is an ontology that contains the following
statements:
<?xml version="1.0"?>
<rdf:RDF
...
xml:base="http://purl.org/obo/owl/sequence">
...
<owl:Ontology rdf:about="">
According to my reading of the specifications, this means that the
ontology in question is called http://purl.org/obo/owl/sequence and
that the proper way to import this ontology is
<owl:Ontology rdf:about="">
<owl:imports rdf:resource="http://purl.org/obo/owl/sequence"/>
...
(please forgive faulty rdf...). I will call this method of writing
an import "import by name".
However, when I went to the web page given by http://purl.org/obo/owl/sequence
I got a redirect followed by a not found error. So anyone who
finds the importing ontology has no way of finding the imported
ontology. When a developer or user knows where the ontology is
located, they can use various mechanisms to tell their tool where to
find the ontology. But none of these mechanisms are compatible. In
addition, in many cases, ontology authors cannot control the
ontologies that they wish to import.
For these reasons, many ontology authors have taken to doing import by
location. Tool builders are finding themselves forced to
accommodate. In my opinion, though, import by location (or some hybrid)
doesn't solve anything, as I argue below.
Use Cases:
1. The internet is trusted, available and reliable, ontologies are
never relocated and all the ontologies of interest are on the internet.
This is the one case where import by location shines and import by
name does very badly. With import by name, a person reading an
ontology off the web may not be able to determine where to find the
imported ontology.
I will lump the other use cases together but they may have important
differences.
2. I am commuting home from work with no internet access and unzip a
collection of owl files.
3. I am developing an application which may not have access to the
internet and/or may not be willing to trust the internet even if it
had access.
4. I have access to the internet but I want to edit some (must be more
than one) ontology that I download off the web.
5. Web servers, projects and organizations come and go and ontologies
are relocated.
In these cases, to varying degrees, import by name works very well
and this is why I think it is the right choice. (Cases 2 and 3 are
close to my heart.) Consider use case 2 because in some sense it is
an extreme. In this case - with import by name - I simply plop the
owl files on my disk and my tool can easily determine which ontologies
import which. It just needs to parse the import statements and the
ontology declarations from the files in question.
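As a rough illustration of how little a tool needs for this, here is a minimal sketch in Python using only the standard library. The function names name_and_imports and import_graph are made up for this example and do not come from any existing tool. It assumes each file is RDF/XML with owl:Ontology as a direct child of rdf:RDF, that imports are written with rdf:resource, and that an empty rdf:about simply means "use xml:base" (no general relative URI resolution is attempted).

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
XML = "http://www.w3.org/XML/1998/namespace"


def name_and_imports(rdf_xml_text):
    """Return (ontology name, [imported names]) for one RDF/XML document.

    Hypothetical helper: an empty or missing rdf:about falls back to xml:base.
    """
    root = ET.fromstring(rdf_xml_text)
    base = root.get(f"{{{XML}}}base")
    name, imports = None, []
    # Only owl:Ontology elements directly under rdf:RDF are considered.
    for onto in root.findall(f"{{{OWL}}}Ontology"):
        about = onto.get(f"{{{RDF}}}about")
        name = about if about else base
        for imp in onto.findall(f"{{{OWL}}}imports"):
            target = imp.get(f"{{{RDF}}}resource")
            if target:
                imports.append(target)
    return name, imports


def import_graph(documents):
    """Map each ontology name to the names it imports.

    `documents` maps an arbitrary label (e.g. a file name) to RDF/XML text.
    """
    graph = {}
    for text in documents.values():
        name, imports = name_and_imports(text)
        graph[name] = imports
    return graph
```

Given the contents of the unzipped files, import_graph returns a dictionary keyed by ontology name, so the tool can match each import to a local file without being told anything extra.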
Import by location fares much worse in this case. My tool has no way
of figuring out which ontology imports which - it must be told. If the
zip file only contains owl files then it is a human who must figure
out the imports. Also import trees can be pretty complicated as they
have been in several recent examples. This is aggravated when - as in
one case - the ontologies in question use different methods of
importing the same ontology. This means that my zip file must include
a file that records *all* the different ways in which the owl files
are downloaded. As different tools will use different versions of
this file, I will need to convert the file to all the different
formats. Seems very awkward.
Proposal:
My proposal is to use import by name but to allow an annotation that
provides a hint as to where to find the imported ontology. Thus a
good import of the http://purl.org/obo/owl/sequence ontology could be
<owl:Ontology rdf:about="">
<owl:imports>
<owl:Ontology rdf:about="http://purl.org/obo/owl/sequence">
<owl:hasPhysicalLocation>
http://www.berkeleybop.org/ontologies/obo-all/sequence/sequence.owl
</owl:hasPhysicalLocation>
</owl:Ontology>
...
This hint can be ignored in many cases (e.g. use case 2) where the
hint is known to be wrong or it doesn't work.
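To make the intended behaviour concrete, here is one possible lookup order, sketched in Python. This is only an assumption about how a tool might treat the hint, not part of the proposal itself: the function resolve_import, its parameters and the "local, then hint, then name" order are all invented for illustration.

```python
def resolve_import(name, local_index, hint=None, trust_hint=True):
    """Decide where to load the ontology called `name` from.

    name        -- the imported ontology's name (the import-by-name URI)
    local_index -- dict mapping ontology names to local file paths
    hint        -- optional physical location taken from the annotation
    trust_hint  -- set to False when the hint is known to be wrong

    Returns a (source kind, location) pair. The ordering is an assumed
    policy: local copies win, then the hint, then the name itself.
    """
    if name in local_index:
        return ("local", local_index[name])
    if hint and trust_hint:
        # The hint is element text in the example, so trim whitespace.
        return ("hint", hint.strip())
    return ("name", name)
```

In use case 2, every imported name is in local_index, so the hint is never consulted. In use case 1, local_index is empty and the hint tells the tool where to go.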
I think this needs to be a standard because it would need to be
understood by a variety of tools using different APIs and written in
different languages. I think that it is important enough because
there is a significant amount of traffic devoted to problems related
to this sort of import. In particular we are seeing traffic
about this subject on two tool forums where the tools are based on
entirely different OWL APIs with independent ancestry.
-Timothy
Received on Sunday, 2 December 2007 04:07:39 UTC