Ontology locations: OntologyURI vs. xml:base and namespaces (ISSUE-21)

Hi All,

(long email, sorry)

Although the current publishing guidelines and imports sections are
masterpieces of clarity (thanks Boris), I feel that the consequences
of the imports-by-location resolution for ISSUE-21 are still not
entirely covered.

If I understand correctly, the use of the OntologyURI in Boris' and  
Peter's proposal allows us to keep track of where our axioms and  
objects come from, or rather, where they 'belong'. It is, in fact,  
interpreted as a URL. The VersionURI is an additional construct that  
can be used to add an integrity check when the ontology specified by  
the OntologyURI is not physically located at that URI. Nonetheless,  
when an ontology is imported from the VersionURI, it is still to be  
interpreted *as if* it came from the OntologyURI.

However, XML has a built-in way of managing the 'what came from  
where?' question: through the xml:base attribute. This is not  
precisely what RFC2396 [1] says, but it is a very common  
interpretation of the value of the xml:base defined on the root  
element of an XML file, cf. [2] which states that "the base URI is the  
URI used to retrieve the document entity or external entity". This is  
what OntologyURI does in import statements.

The way in which current tools (TopBraid/Jena and Protege4/OWL API)  
deal with imports and the like is primarily through this xml:base  
attribute. In fact, the RDF/XML serialisation of both leaves the  
rdf:about attribute on the owl:Ontology element empty. Also, both give  
a warning and do some repair when the OntologyURI is not the same as  
the xml:base (or empty). Additionally, both tools specify the default  
namespace as equal to the xml:base. In conclusion: the current state  
is that ontologyuri=xml:base=namespace. I think this is quite  
sensible, as it allows all relative URIs of classes etc. to be  
relative to the Ontology URI (see my last point on imports at the end
of this message).
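
To make the ontologyuri=xml:base=namespace chain concrete, here is a
minimal sketch of how relative references resolve against a base URI.
It uses Python's standard urljoin rather than any of the tools
mentioned above, and the example.org URI is a made-up placeholder.

```python
from urllib.parse import urljoin

# Hypothetical base URI, standing in for the xml:base of an RDF/XML file.
XML_BASE = "http://example.org/onto"


def resolve(reference: str, base: str = XML_BASE) -> str:
    """Resolve a (possibly relative) URI reference against the base URI."""
    return urljoin(base, reference)


# An empty rdf:about on owl:Ontology resolves to the base URI itself.
ontology_uri = resolve("")

# A relative rdf:about such as "#Person" ends up in the same namespace.
class_uri = resolve("#Person")
```

With these inputs the ontology URI equals the base URI, and the class
URI is the base URI followed by #Person.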

This use of xml:base interferes with the proposed 'rules' in several  
ways. The rules are publishing guidelines, i.e. they are prescriptions  
of where an ontology creator SHOULD publish his/her ontology, rather  
than a specification for retrieving the proper ontology given some  
OntologyURI, VersionURI or both. But I don't think the two  
perspectives can easily be disentangled.

Rule 1:
If O does not contain an ontology URI (and, consequently, without a  
version URI as well), then O can be physically located anywhere.

There are two scenarios (in the RDF/XML case):

1) An empty rdf:about on the owl:Ontology element: the id/URI of the  
owl:Ontology element will be interpreted as being the same as the  
value of the xml:base attribute of the RDF/XML file. If the xml:base  
is not specified or empty (?), the OntologyURI will be the same as the  
physical location of the file (either online or on a local hard disk),
because the xml:base is considered to be the same as that physical  
location.
2) No rdf:about on the owl:Ontology element: the owl:Ontology is  
nameless. Currently TopBraid adds a new owl:Ontology element with
an empty rdf:about and then complains that the file contains two
owl:Ontology elements. Protege4 currently does the proper thing, and  
interprets the owl:Ontology to be anonymous, i.e. its URI will be  
inferred to be something like 'xml:base'#'generatedID'.
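
The two scenarios can be sketched roughly as follows. This is only an
illustration using Python's standard XML parser, not how TopBraid or
Protege4 actually work; the function name, the sample documents and
the example.org/example.net URIs are all hypothetical. An anonymous
ontology is reported as None here instead of getting a generated ID.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def ontology_uri(rdfxml: str, retrieval_uri: str) -> Optional[str]:
    """Return the URI of the owl:Ontology element, or None if anonymous.

    retrieval_uri is the physical location the file was read from; it
    serves as the base URI when the root element has no xml:base.
    """
    root = ET.fromstring(rdfxml)
    base = root.get(f"{{{XML_NS}}}base") or retrieval_uri
    onto = root.find(f"{{{OWL_NS}}}Ontology")
    if onto is None:
        return None  # no owl:Ontology element at all
    about = onto.get(f"{{{RDF_NS}}}about")
    if about is None:
        return None  # scenario 2: nameless ontology
    return urljoin(base, about)  # scenario 1 when about is empty


HEADER = ('<rdf:RDF xmlns:rdf="' + RDF_NS + '" xmlns:owl="' + OWL_NS + '"')

# Scenario 1 with an explicit xml:base.
WITH_BASE = (HEADER + ' xml:base="http://example.org/onto">'
             '<owl:Ontology rdf:about=""/></rdf:RDF>')
# Scenario 1 without xml:base: falls back to the physical location.
NO_BASE = HEADER + '><owl:Ontology rdf:about=""/></rdf:RDF>'
# Scenario 2: no rdf:about on owl:Ontology.
NAMELESS = (HEADER + ' xml:base="http://example.org/onto">'
            '<owl:Ontology/></rdf:RDF>')

LOCATION = "http://example.net/copy.owl"
```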

This touches on ISSUE-15, which was resolved: ontologies can in fact
be without a name.

Rule 2:
If O contains an ontology URI ou but no version URI, then O should be  
physically located at the location ou.

This serves exactly the same function as the common use of the  
xml:base attribute.

Rule 3:
If O contains an ontology URI ou and a version URI vu, then O should
be physically located at the location ou or vu.

In this case, the combination of ou and vu says 'yes this is the  
ontology ou, but I may also be found at vu'. This evidently adds  
functionality over the use of just xml:base and allows an ontology  
publisher to provide extra information on the whereabouts of the  
ontology on the web for versioning purposes. Nonetheless, when  
retrieving the ontology from vu through an imports statement, it is  
interpreted as coming from ou. This complies with the principle  
ou=xml:base=namespace.
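
Read together, the publishing rules quoted above amount to a small
decision table. The sketch below is my own paraphrase of them, not
text from the proposal; the function name and the use of None for
'anywhere' are assumptions made for illustration.

```python
from typing import Optional, Set


def allowed_locations(ontology_uri: Optional[str],
                      version_uri: Optional[str]) -> Optional[Set[str]]:
    """Return the locations where O should be published.

    None means O can be physically located anywhere.
    """
    if ontology_uri is None:
        if version_uri is not None:
            # The rules assume a version URI never occurs on its own.
            raise ValueError("version URI given without an ontology URI")
        return None  # no ontology URI: anywhere
    if version_uri is None:
        return {ontology_uri}  # ontology URI only: located at ou
    return {ontology_uri, version_uri}  # both: located at ou or vu
```

For example, allowed_locations("http://example.org/onto", None) yields
a set containing only that (made-up) ontology URI.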

RDF/XML, Turtle and XML each support (some form of specifying) an
xml:base; the only syntax that does not support it is the
functional-style syntax.

I might be mistaken, but if imports is by location, then OntologyURI
has no real value over just using xml:base, apart from the fact that
the functional-style syntax does not support xml:base.

As I see it, there are three ways out of it, given that we do have  
imports by location:

1) Drop OntologyURI altogether, and just use xml:base + VersionURI
2) Do not drop OntologyURI, but enforce it to be equal to xml:base (or  
empty). Make the appropriate adjustments to the mapping to ensure that  
a missing OntologyURI (i.e. rdf:about) is not translated to a bnode.
3) Treat OntologyURI similarly to VersionURI, and say that the file
that owl:imports points to should have the same base URI as the
import URI, or the same ontology URI as the import URI, or the same
version URI as the import URI.

A combination of 2 and 3 is possible as well.
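
As a rough sketch of what the check in option 3 could look like: the
function and parameter names are hypothetical, and comparing URIs as
plain strings (without any normalisation) is a simplifying assumption.

```python
from typing import Optional


def import_is_consistent(import_uri: str,
                         base_uri: Optional[str],
                         ontology_uri: Optional[str] = None,
                         version_uri: Optional[str] = None) -> bool:
    """Option 3: the import URI must match the base URI, the ontology
    URI or the version URI of the retrieved file."""
    candidates = {base_uri, ontology_uri, version_uri}
    candidates.discard(None)  # ignore URIs the file does not declare
    return import_uri in candidates
```

Under this reading, a file with ontology URI ou that is retrieved from
vu is accepted as long as it declares vu as its version URI.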

A last issue that I'm still chewing on is what happens when an
ontology imports another ontology from the versionURI location. The
current practice in P4 is imports by name: imported classes are in
the namespace of the imported ontology, and their names are relative  
to its ontology uri. In other words:  
importsuri=ontologyuri=xml:base=namespace. The imports by location  
principle breaks this chain, as the imports URI no longer has to
be the same as the ontology URI. But in the case where some specific
version URI is imported, all local names of imported classes are still
relative to the ontology URI. This means that someone who imports an
ontology no longer has any way of knowing the namespace of the
imported classes. Don't really know whether that's bad... but it's
good to realise this consequence.

-Rinke

[1] http://www.ietf.org/rfc/rfc2396.txt
[2] http://www.w3.org/TR/xmlbase/
-----------------------------------------------
Drs. Rinke Hoekstra

Email: hoekstra@uva.nl    Skype:  rinkehoekstra
Phone: +31-20-5253499     Fax:   +31-20-5253495
Web:   http://www.leibnizcenter.org/users/rinke

Leibniz Center for Law,          Faculty of Law
University of Amsterdam,            PO Box 1030
1000 BA  Amsterdam,             The Netherlands
-----------------------------------------------

Received on Monday, 19 May 2008 13:20:45 UTC