Re: Ontology locations: OntologyURI vs. xml:base and namespaces (ISSUE-21)

Hi Boris,

Thanks for the clarification: you do indeed address my worry. In fact,
Matthew pointed out some misconceptions in my original mail as well,
with respect to namespace handling in Protege4.

As I now understand it, you are basically saying that xml:base does  
not really have anything to do with the imports/versioning scheme.  
However, I still don't think this is entirely true because of the two
scenarios I described in my mail:

>> 1) An empty rdf:about on the owl:Ontology element: the id/URI of the
>> owl:Ontology element will be interpreted as being the same as the
>> value of the xml:base attribute of the RDF/XML file. If the xml:base
>> is not specified or empty (?), the OntologyURI will be the same as the
>> physical location of the file (either online or on a local harddisk),
>> because the xml:base is considered to be the same as that physical
>> location.
>> 2) No rdf:about on the owl:Ontology element: the owl:Ontology is
>> nameless. Currently TopBraid will add a new owl:Ontology element with
>> an empty rdf:about and then complains that the file contains two
>> owl:Ontology elements. Protege4 currently does the proper thing, and
>> interprets the owl:Ontology to be anonymous, i.e. its URI will be
>> inferred to be something like 'xml:base'#'generatedID'.


The xml:base *does* in both cases determine the ontology URI, though  
I'm sure tools won't have a problem loading the proper ontology. The  
FS to RDF mapping only generates a bnode if there's no OntologyURI,  
which is not the same as an empty OntologyURI.
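
To make the two cases concrete, here is a minimal sketch of how I
understand the RDF/XML side. It is not how any particular tool does it:
the function name ontology_uri, the example.org URLs and the template
are all made up for illustration, and it only looks at an xml:base on
the root element (an xml:base on owl:Ontology itself is ignored).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
OWL_ONTOLOGY = "{http://www.w3.org/2002/07/owl#}Ontology"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical RDF/XML skeleton; {base} and {about} are filled in below.
TEMPLATE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#" {base}>
  <owl:Ontology {about}/>
</rdf:RDF>"""


def ontology_uri(rdfxml, retrieval_url):
    """Return the ontology URI, or None if the ontology is nameless."""
    root = ET.fromstring(rdfxml)
    # Assumption: a missing or empty xml:base falls back to the location
    # the document was retrieved from.
    base = root.get(XML_BASE) or retrieval_url
    ontology = root.find(OWL_ONTOLOGY)
    if ontology is None:
        return None
    about = ontology.get(RDF_ABOUT)
    if about is None:
        # Scenario 2: no rdf:about at all, so the node is anonymous.
        return None
    # Scenario 1: an empty rdf:about is a relative reference, and it
    # resolves to the base URI itself.
    return urljoin(base, about)


location = "http://files.example.org/downloads/onto.owl"
with_base = TEMPLATE.format(
    base='xml:base="http://example.org/onto"', about='rdf:about=""')
no_base = TEMPLATE.format(base="", about='rdf:about=""')
nameless = TEMPLATE.format(
    base='xml:base="http://example.org/onto"', about="")

print(ontology_uri(with_base, location))  # the xml:base value
print(ontology_uri(no_base, location))    # the retrieval location
print(ontology_uri(nameless, location))   # None (would be a bnode)
```

So with an empty rdf:about the ontology URI follows whatever the base
is, whereas only a missing rdf:about gives an anonymous ontology.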

A second leftover is that I now don't know what to make of the wording
"the base URI is the URI used to retrieve the document entity or
external entity" [2]. This does seem to suggest that the xml:base
should be used to indicate the preferred location of a document (or
entity). The xml:base-as-abbreviation and xml:base-as-location readings
do not seem to go together well, I guess.
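
For what it's worth, this is how I read the resolution order in [2],
as a rough sketch only: an element's base is its own xml:base (resolved
against the inherited base), otherwise its parent's, and only at the
top does the retrieval URI come in. The helper effective_bases and the
host.example / example.org URLs are invented for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def effective_bases(xml_text, retrieval_url):
    """Map each element tag to the base URI in effect for that element."""
    bases = {}

    def walk(elem, inherited):
        declared = elem.get(XML_BASE)
        # A declared xml:base may itself be relative to the inherited base.
        base = urljoin(inherited, declared) if declared else inherited
        bases[elem.tag] = base
        for child in elem:
            walk(child, base)

    # The retrieval URI is only the starting point of the chain.
    walk(ET.fromstring(xml_text), retrieval_url)
    return bases


retrieved_from = "http://host.example/files/doc.xml"
scoped = ('<doc xml:base="http://example.org/a/">'
          '<inner xml:base="b/"><leaf/></inner><plain/></doc>')
unscoped = '<doc><inner xml:base="b/"><leaf/></inner><plain/></doc>'

for name, text in (("scoped", scoped), ("unscoped", unscoped)):
    print(name, effective_bases(text, retrieved_from))
```

In the first document the retrieval URI never shows up in any resolved
URI; in the second it is the base for everything that has no xml:base
of its own. That is the location reading acting purely as a default.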

-Rinke


On 21 May 2008, at 11:50, Boris Motik wrote:

> Hello Rinke,
>
> If I understood your message correctly, you are worried about possible
> mismatches between xml:base, ontologyURI and versionURI, right?
>
>
> If this is the case, I would just like to point out that xml:base lives
> at a completely different level in the "food chain" than the ontologyURI
> and the versionURI. That is, xml:base is a low-level XML mechanism for
> resolution of relative URIs and is reflected neither in the RDF model
> nor in the OWL 2 structural specification. The resolution of URIs
> relative to xml:base happens at the level of the parsers processing
> various XML-based syntaxes. In the case of OWL/RDF-XML, this happens
> before you even see any triples; in the case of OWL/XML, this happens
> before you see the axioms.
>
> Thus, the presence and/or the content of xml:base is completely
> orthogonal to location and version concerns. xml:base can be equal to
> ontologyURI and/or versionURI, but it does not need to be. Please note
> that xml:base is not a property of an XML document: ANY element in an
> XML document can have an xml:base specification, which is then used
> while parsing this element's children. This just reinforces my belief
> that xml:base is something similar to expansion of named entities: it
> happens at the XML level.
>
> Thus, if xml:base is there, a parser should process it to resolve the
> relative URIs in the document, and if it is not, a parser should resolve
> the relative URIs against the location that the ontology was loaded
> from. It should be clear that the usage of relative URIs without
> xml:base is dangerous; however, this is a low-level syntax issue that
> has nothing to do with the structural specification. Finally, it should
> also be clear that ontologyURI and versionURI play no part whatsoever in
> the resolution of the relative URIs.
>
> If you feel this is necessary, we can make these issues clear in the
> documents describing the respective syntaxes (but not in the structural
> specification document, which doesn't know anything about xml:base at
> all).
>
> Regards,
>
> 	Boris
>
>
>
>> -----Original Message-----
>> From: public-owl-wg-request@w3.org [mailto:public-owl-wg-request@w3.org]
>> On Behalf Of Rinke Hoekstra
>> Sent: 19 May 2008 14:20
>> To: OWL Working Group WG
>> Subject: Ontology locations: OntologyURI vs. xml:base and namespaces (ISSUE-21)
>>
>>
>> Hi All,
>>
>> (long email, sorry)
>>
>> Although the current publishing guidelines and imports sections are
>> masterpieces of clarity (thanks Boris), I feel that the consequences
>> of imports-by-location resolution to ISSUE-21 are still not entirely
>> covered.
>>
>> If I understand correctly, the use of the OntologyURI in Boris' and
>> Peter's proposal allows us to keep track of where our axioms and
>> objects come from, or rather, where they 'belong'. It is, in fact,
>> interpreted as a URL. The VersionURI is an additional construct that
>> can be used to add an integrity check when the ontology specified by
>> the OntologyURI is not physically located at that URI. Nonetheless,
>> when an ontology is imported from the VersionURI, it is still to be
>> interpreted *as if* it came from the OntologyURI.
>>
>> However, XML has a built-in way of managing the 'what came from
>> where?' question: through the xml:base attribute. This is not
>> precisely what RFC2396 [1] says, but it is a very common
>> interpretation of the value of the xml:base defined on the root
>> element of an XML file, cf. [2] which states that "the base URI is
>> the URI used to retrieve the document entity or external entity".
>> This is what OntologyURI does in import statements.
>>
>> The way in which current tools (TopBraid/Jena and Protege4/OWL API)
>> deal with imports and the like is primarily through this xml:base
>> attribute. In fact, the RDF/XML serialisation of both leaves the
>> rdf:about attribute on the owl:Ontology element empty. Also, both
>> give a warning and do some repair when the OntologyURI is not the
>> same as the xml:base (or empty). Additionally, both tools specify the
>> default namespace as equal to the xml:base. In conclusion: the
>> current state is that ontologyuri=xml:base=namespace. I think this is
>> quite sensible, as it allows all relative URIs of classes etc. to be
>> relative to the Ontology URI (see my last point on imports at the end
>> of this message).
>>
>> This use of xml:base interferes with the proposed 'rules' in several
>> ways. The rules are publishing guidelines, i.e. they are
>> prescriptions of where an ontology creator SHOULD publish his/her
>> ontology, rather than a specification for retrieving the proper
>> ontology given some OntologyURI, VersionURI or both. But I don't
>> think the two perspectives can easily be disentangled.
>>
>> Rule 1:
>> If O does not contain an ontology URI (and, consequently, without a
>> version URI as well), then O can be physically located anywhere.
>>
>> There are two scenarios (in the RDF/XML case):
>>
>> 1) An empty rdf:about on the owl:Ontology element: the id/URI of the
>> owl:Ontology element will be interpreted as being the same as the
>> value of the xml:base attribute of the RDF/XML file. If the xml:base
>> is not specified or empty (?), the OntologyURI will be the same as the
>> physical location of the file (either online or on a local harddisk),
>> because the xml:base is considered to be the same as that physical
>> location.
>> 2) No rdf:about on the owl:Ontology element: the owl:Ontology is
>> nameless. Currently TopBraid will add a new owl:Ontology element with
>> an empty rdf:about and then complains that the file contains two
>> owl:Ontology elements. Protege4 currently does the proper thing, and
>> interprets the owl:Ontology to be anonymous, i.e. its URI will be
>> inferred to be something like 'xml:base'#'generatedID'.
>>
>> This touches on ISSUE-15, which was resolved: ontologies can in fact
>> be without name.
>>
>> Rule 2:
>> If O contains an ontology URI ou but no version URI, then O should be
>> physically located at the location ou.
>>
>> This serves exactly the same function as the common use of the
>> xml:base attribute.
>>
>> Rule 3:
>> If O contains an ontology URI ou and a version URI vu, then O should
>> be physically located at the location ou or vu.
>>
>> In this case, the combination of ou and vu says 'yes this is the
>> ontology ou, but I may also be found at vu'. This evidently adds
>> functionality over the use of just xml:base and allows an ontology
>> publisher to provide extra information on the whereabouts of the
>> ontology on the web for versioning purposes. Nonetheless, when
>> retrieving the ontology from vu through an imports statement, it is
>> interpreted as coming from ou. This complies with the principle
>> ou=xml:base=namespace.
>>
>> RDF/XML, Turtle and XML each support (some form of specifying) an
>> xml:base; the only syntax that does not support it is the functional
>> style syntax.
>>
>> I might be mistaken, but if imports is by location, then OntologyURI
>> has no real value over just using xml:base, apart from the fact that
>> the fss does not support it.
>>
>> As I see it, there are three ways out of it, given that we do have
>> imports by location:
>>
>> 1) Drop OntologyURI altogether, and just use xml:base + VersionURI
>> 2) Do not drop OntologyURI, but enforce it to be equal to xml:base
>> (or empty). Make the appropriate adjustments to the mapping to ensure
>> that a missing OntologyURI (i.e. rdf:about) is not translated to a
>> bnode.
>> 3) Treat OntologyURI similar to VersionURI, and say that the file
>> owl:imports points to should have the same base URI as the import
>> URI, or should have the same ontology URI as the import URI, or
>> should have the same version URI as the import URI.
>>
>> A combination of 2 and 3 is possible as well.
>>
>> A last issue that I'm still chewing on is what happens when an
>> ontology imports another ontology from the versionURI location. The
>> current practice in P4 is imports by name: imported classes are in
>> the namespace of the imported ontology, and their names are relative
>> to its ontology URI. In other words:
>> importsuri=ontologyuri=xml:base=namespace. The imports by location
>> principle breaks this chain, as the imports URI no longer has to
>> be the same as the ontology URI. But in the case where some specific
>> version URI is imported, all local names of imported classes are
>> still relative to the ontology URI. This means that someone who
>> imports an ontology no longer has any way of knowing the namespace of
>> the imported classes. Don't really know whether that's bad... but
>> it's good to realise this consequence.
>>
>> -Rinke
>>
>> [1] http://www.ietf.org/rfc/rfc2396.txt
>> [2] http://www.w3.org/TR/xmlbase/
>> -----------------------------------------------
>> Drs. Rinke Hoekstra
>>
>> Email: hoekstra@uva.nl    Skype:  rinkehoekstra
>> Phone: +31-20-5253499     Fax:   +31-20-5253495
>> Web:   http://www.leibnizcenter.org/users/rinke
>>
>> Leibniz Center for Law,          Faculty of Law
>> University of Amsterdam,            PO Box 1030
>> 1000 BA  Amsterdam,             The Netherlands
>> -----------------------------------------------
>>
>>
>>
>

-----------------------------------------------
Drs. Rinke Hoekstra

Email: hoekstra@uva.nl    Skype:  rinkehoekstra
Phone: +31-20-5253499     Fax:   +31-20-5253495
Web:   http://www.leibnizcenter.org/users/rinke

Leibniz Center for Law,          Faculty of Law
University of Amsterdam,            PO Box 1030
1000 BA  Amsterdam,             The Netherlands
-----------------------------------------------

Received on Wednesday, 21 May 2008 14:27:03 UTC