- From: Timothy Redmond <tredmond@stanford.edu>
- Date: Sat, 1 Dec 2007 15:46:50 -0800
- To: public-owl-dev@w3.org
- Cc: Chris Mungall <cjm@fruitfly.org>, Stephen Larson <slarson@ncmir.ucsd.edu>, Bill Bug <wbug@ncmir.ucsd.edu>
I hope that this is an appropriate and acceptable message. For some time now, tool developers and users have been having increasing difficulty with the owl import declaration. I have a proposal, but I would really like some advice on any misconceptions I might have, whether this is a good approach, and how the OWL 1.1 standard will address this issue.

The Problem:

Somewhere on the web, there is an ontology that contains the following statements:

    <?xml version="1.0"?>
    <rdf:RDF ... xml:base="http://purl.org/obo/owl/sequence">
    ...
    <owl:Ontology rdf:about="">

According to my reading of the specifications, this means that the ontology in question is called http://purl.org/obo/owl/sequence and that the proper way to import this ontology is

    <owl:Ontology rdf:about="">
      <owl:imports rdf:resource="http://purl.org/obo/owl/sequence"/>
    ...

(please forgive faulty rdf...). I will call this method of writing an import "import by name".

However, when I went to the web page given by http://purl.org/obo/owl/sequence, I got a redirect followed by a not found error. So anyone who finds the importing ontology has no way of finding the imported ontology. When a developer or user knows where the ontology is located, they can use various mechanisms to tell their tool where to find the ontology. But these mechanisms are not compatible with one another. In addition, in many cases, ontology authors cannot control the ontologies that they wish to import.

For these reasons, many ontology authors have taken to doing import by location. Tool builders are finding themselves forced to accommodate this. In my opinion, though, import by location (or some hybrid) doesn't solve anything, as I argue below.

Use Cases:

1. The internet is trusted, available and reliable, ontologies are never relocated, and all the ontologies of interest are on the internet. This is the one case where import by location shines and import by name does very badly.
With import by name, a person reading an ontology off the web may not be able to determine where to find the imported ontology.

I will lump the other use cases together, but they may have important differences.

2. I am commuting home from work with no internet access and unzip a collection of owl files.
3. I am developing an application which may not have access to the internet and/or may not be willing to trust the internet even if it had access.
4. I have access to the internet, but I want to edit several ontologies (it must be more than one) that I have downloaded off the web.
5. Web servers, projects and organizations come and go, and ontologies are relocated.

In these cases, to varying degrees, import by name works very well, and this is why I think it is the right choice. (Cases 2 and 3 are close to my heart.)

Consider use case 2, because in some sense it is an extreme. In this case, with import by name, I simply plop the owl files on my disk and my tool can easily determine which ontologies import which. It just needs to parse the import statements and the ontology declarations from the files in question.

Import by location fares much worse in this case. My tool has no way of figuring out which ontology imports which; it must be told. If the zip file only contains owl files, then it is a human who must figure out the imports. Also, import trees can be pretty complicated, as they have been in several recent examples. This is aggravated when, as in one case, the ontologies in question use different methods of importing the same ontology. This means that my zip file must include a file that records *all* the different ways in which the owl files are downloaded. As different tools will use different versions of this file, I will need to convert the file to all the different formats. This seems very awkward.

Proposal:

My proposal is to use import by name but to allow an annotation that provides a hint as to where to find the imported ontology.
Thus a good import of the http://purl.org/obo/owl/sequence ontology could be

    <owl:Ontology rdf:about="">
      <owl:imports>
        <owl:Ontology rdf:about="http://purl.org/obo/owl/sequence">
          <owl:hasPhysicalLocation>
            http://www.berkeleybop.org/ontologies/obo-all/sequence/sequence.owl
          </owl:hasPhysicalLocation>
        </owl:Ontology>
    ...

This hint can be ignored in many cases (e.g. use case 2) where the hint is known to be wrong or doesn't work.

I think this needs to be a standard because it would need to be understood by a variety of tools using different APIs and written in different languages. I think that it is important enough because there is a significant amount of traffic devoted to problems related to this sort of import problem. In particular, we are seeing traffic about this subject on two tool forums where the tools are based on entirely different OWL APIs with an independent ancestry.

-Timothy
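
To make the use case 2 argument and the hint fallback concrete, here is a minimal Python sketch of how a tool might resolve imports by name from a set of local files. It is only an illustration under several assumptions: the function names `describe` and `resolve_imports` are hypothetical, the files are RDF/XML with `owl:Ontology` as a direct child of `rdf:RDF`, relative URIs other than `rdf:about=""` are not resolved, and `owl:hasPhysicalLocation` is the annotation proposed above, not an existing OWL property.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
XML = "http://www.w3.org/XML/1998/namespace"


def describe(rdfxml_text):
    """Return (ontology_name, [(imported_name, location_hint), ...])."""
    root = ET.fromstring(rdfxml_text)
    base = root.get(f"{{{XML}}}base", "")
    name, imports = None, []
    # Only direct children of rdf:RDF are treated as this file's declaration.
    for onto in root.findall(f"{{{OWL}}}Ontology"):
        # rdf:about="" refers to the document itself, i.e. its xml:base.
        name = onto.get(f"{{{RDF}}}about") or base
        for imp in onto.findall(f"{{{OWL}}}imports"):
            target = imp.get(f"{{{RDF}}}resource")
            hint = None
            nested = imp.find(f"{{{OWL}}}Ontology")
            if target is None and nested is not None:
                # Proposed form: a nested ontology carrying a location hint.
                target = nested.get(f"{{{RDF}}}about")
                text = nested.findtext(f"{{{OWL}}}hasPhysicalLocation")
                hint = (text or "").strip() or None
            if target:
                imports.append((target, hint))
    return name, imports


def resolve_imports(files):
    """Map each file path to where its imports can be found.

    `files` maps a path to the RDF/XML text of that file. An import is
    resolved to a local path when some local file declares that name;
    otherwise the hint (possibly None) is returned.
    """
    described = {path: describe(text) for path, text in files.items()}
    by_name = {name: path for path, (name, _) in described.items() if name}
    return {
        path: [by_name.get(target, hint) for target, hint in imports]
        for path, (_, imports) in described.items()
    }
```

In this sketch a local file that declares the imported name always wins over the hint, which is one way to read "this hint can be ignored" in use case 2; a tool could equally choose to try the hint first when the internet is available and trusted.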
Received on Sunday, 2 December 2007 04:07:39 UTC