Re: Ontology locations: OntologyURI vs. xml:base and namespaces (ISSUE-21)

Ok, thanks again. That makes perfect sense!

-Rinke

On 21 May 2008, at 16:43, Boris Motik wrote:

> Hello,
>
>> -----Original Message-----
>> From: public-owl-wg-request@w3.org [mailto:public-owl-wg-request@w3.org]
>> On Behalf Of Rinke Hoekstra
>> Sent: 21 May 2008 15:26
>> To: Boris Motik
>> Cc: 'OWL Working Group WG'
>> Subject: Re: Ontology locations: OntologyURI vs. xml:base and  
>> namespaces (ISSUE-21)
>>
>>
>> Hi Boris,
>>
>> Thanks for the clarification: indeed you address my worry. In fact,
>> Matthew pointed at some misconceptions in my original mail as well
>> wrt. namespace handling in Protege4.
>>
>> As I now understand it, you are basically saying that xml:base does
>> not really have anything to do with the imports/versioning scheme.
>> However, I still don't think this is entirely true because of the two
>> scenarios I described in my mail:
>>
>>>> 1) An empty rdf:about on the owl:Ontology element: the id/URI of the
>>>> owl:Ontology element will be interpreted as being the same as the
>>>> value of the xml:base attribute of the RDF/XML file. If the xml:base
>>>> is not specified or empty (?), the OntologyURI will be the same as
>>>> the physical location of the file (either online or on a local
>>>> hard disk), because the xml:base is considered to be the same as that
>>>> physical location.
>>>> 2) No rdf:about on the owl:Ontology element: the owl:Ontology is
>>>> nameless. Currently TopBraid will add a new owl:Ontology element with
>>>> an empty rdf:about and then complain that the file contains two
>>>> owl:Ontology elements. Protege4 currently does the proper thing and
>>>> interprets the owl:Ontology to be anonymous, i.e. its URI will be
>>>> inferred to be something like 'xml:base'#'generatedID'.
>>
>>
>> The xml:base *does* in both cases determine the ontology URI, though
>> I'm sure tools won't have a problem loading the proper ontology. The
>> FS to RDF mapping only generates a bnode if there's no OntologyURI,
>> which is not the same as an empty OntologyURI.
>>
>
> If you do have an empty rdf:about attribute, it is true that it will be
> resolved according to the xml:base. This, however, has nothing to
> do with imports and/or versioning. Think of xml:base as some macro  
> expansion that happens before you even parse the ontology. In
> fact, it is the same as if you used a named entity to specify the  
> ontology URI: the named entity would get expanded by the XML
> parser before you even saw the abbreviated URI. xml:base is  
> completely analogous, with the minor difference that it cannot be
> handled by the XML parser: an XML parser does not know "the value of  
> this element is a URI so let me expand it according to
> xml:base". This expansion therefore has to be performed by "your"
> parser; however, it can be done well before the XML actually gets
> transformed into OWL.
>
> Thus, xml:base is just a macro. If you use empty rdf:about attributes,
> it is up to you to set the appropriate xml:base so that the macro
> gets expanded appropriately. If your macro gets expanded incorrectly
> (i.e., differently from what you'd expect), well, this is your
> problem (as it always is with macro systems). The URI of the
> ontology is nonetheless equal to whatever is produced by the macro
> expansion at the lower level.
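
The macro-expansion view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how TopBraid, Protege4 or any other tool in this thread actually works: the helper names `ontology_uri` and `make_doc`, the example.org URI and the file path are made up, and only an xml:base on the root element is considered.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def ontology_uri(rdfxml, retrieval_uri):
    """Return the resolved URI of the owl:Ontology element, or None.

    Hypothetical helper: only an xml:base on the root element is considered.
    """
    root = ET.fromstring(rdfxml)
    # Fall back to the retrieval location if xml:base is absent or empty.
    base = root.get(f"{{{XML_NS}}}base") or retrieval_uri
    onto = root.find(f"{{{OWL_NS}}}Ontology")
    if onto is None:
        return None
    about = onto.get(f"{{{RDF_NS}}}about")
    if about is None:
        return None  # nameless ontology: no URI to resolve
    # An empty rdf:about resolves to the base URI itself.
    return urljoin(base, about)


TEMPLATE = """<rdf:RDF xmlns:rdf="{rdf}" xmlns:owl="{owl}" {base}>
  <owl:Ontology {about}/>
</rdf:RDF>"""


def make_doc(base_attr, about_attr):
    """Build a tiny RDF/XML document for the three cases below."""
    return TEMPLATE.format(rdf=RDF_NS, owl=OWL_NS, base=base_attr, about=about_attr)


with_base = make_doc('xml:base="http://example.org/onto"', 'rdf:about=""')
without_base = make_doc("", 'rdf:about=""')
nameless = make_doc("", "")

for doc in (with_base, without_base, nameless):
    print(ontology_uri(doc, "file:///tmp/copy.owl"))
```

In this sketch the expansion happens before anything OWL-specific is looked at: the empty rdf:about becomes the xml:base value when one is present, the retrieval location otherwise, and a missing rdf:about yields no URI at all.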
>
>> A second leftover is that I now don't know what to make of the wording
>> "the base URI is the URI used to retrieve the document entity or
>> external entity" [2]. This does seem to suggest that the xml:base
>> should be used to indicate the preferred location of a document (or
>> entity). The xml:base-as-abbreviation and xml:base-as-location do not
>> seem to go together well I guess.
>>
>
> It might be that TopBraid is interpreting the spec incorrectly;
> however, this does not make the spec incorrect. (At least I hope so! :-)
>
> I don't think it is appropriate to think of xml:base-as-location at
> all. xml:base can occur anywhere in an ontology document (not
> only on the outermost element), and it then applies only within the
> element that it occurs on. This strongly suggests that xml:base is
> just an abbreviation mechanism, just like XML namespaces are.
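
The scoping behaviour can be illustrated with a small recursive walk. This is a sketch, not a conforming XML Base implementation: the function name `collect_refs`, the plain `href` attribute and the example.org URIs are invented for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def collect_refs(elem, inherited_base, attr="href"):
    """Yield (tag, resolved URI) for every element carrying `attr`."""
    # An xml:base on this element replaces the inherited base for the
    # element and everything nested inside it; it may itself be relative.
    base = urljoin(inherited_base, elem.get(XML_BASE, ""))
    if attr in elem.attrib:
        yield elem.tag, urljoin(base, elem.get(attr))
    for child in elem:
        yield from collect_refs(child, base, attr)


DOC = """<doc xml:base="http://example.org/a/">
  <item href="x"/>
  <group xml:base="http://example.org/b/">
    <item href="x"/>
  </group>
  <item href="y"/>
</doc>"""

for tag, uri in collect_refs(ET.fromstring(DOC), "http://example.org/doc.xml"):
    print(tag, uri)
```

The same relative reference `x` resolves differently inside and outside `<group>`, and the outer base is back in force for the last `<item>`, which is what one would expect from a scoped abbreviation mechanism rather than from a statement about the document's location.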
>
> I've added an explanation about this independence to the XML
> syntax document. For RDF/XML, I believe that this belongs in the RDF
> spec and not our documents. (After all, we are dealing just with RDF
> graphs.)
>
> Regards,
>
> 	Boris
>
>> -Rinke
>>
>>
>> On 21 May 2008, at 11:50, Boris Motik wrote:
>>
>>> Hello Rinke,
>>>
>>> If I understood your message correctly, you are worried about
>>> possible mismatches between xml:base, ontologyURI and versionURI,
>>> right?
>>>
>>>
>>> If this is the case, I would just like to point out that xml:base
>>> lives at a completely different level in the "food chain" than the
>>> ontologyURI and the versionURI. That is, xml:base is a low-level XML
>>> mechanism for resolution of relative URIs and is reflected
>>> neither in the RDF model nor in the OWL 2 structural specification.
>>> The resolution of URIs relative to xml:base happens at the level
>>> of the parsers processing various XML-based syntaxes. In the case of
>>> OWL/RDF-XML, this happens before you even see any triples; in
>>> case of OWL/XML, this happens before you see the axioms.
>>>
>>> Thus, the presence and/or the content of xml:base is completely
>>> orthogonal to location and version concerns. xml:base can be equal
>>> to ontologyURI and/or versionURI, but it does not need to be. Please
>>> note that xml:base is not a property of an XML document: ANY
>>> element in an XML document can have an xml:base specification, which
>>> is then used while parsing this element's children. This just
>>> reinforces my belief that xml:base is something similar to expansion
>>> of named entities: it happens at the XML level.
>>>
>>> Thus, if xml:base is there, a parser should process it to resolve
>>> the relative URIs in the document, and if it is not, a parser
>>> should resolve the relative URIs against the location that the
>>> ontology was loaded from. It should be clear that the usage of
>>> relative URIs without xml:base is dangerous; however, this is a low-
>>> level syntax issue that has nothing to do with the structural
>>> specification. Finally, it should also be clear that ontologyURI and
>>> versionURI play no part whatsoever in the resolution of the
>>> relative URIs.
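
Why relative URIs without xml:base are dangerous can be seen in a minimal sketch. The `resolve` helper and all URIs below are hypothetical; the only real API used is `urllib.parse.urljoin`.

```python
from urllib.parse import urljoin


def resolve(relative_uri, retrieval_uri, xml_base=None):
    """Resolve against xml:base if given, else against the retrieval location."""
    return urljoin(xml_base or retrieval_uri, relative_uri)


# Two places the same ontology file might be loaded from (made-up URIs).
locations = ["http://example.org/onto.owl", "http://mirror.example.net/copy.owl"]

# Without xml:base, the same "#Person" names a different class per location.
print([resolve("#Person", loc) for loc in locations])

# With xml:base, the result no longer depends on the retrieval location.
print([resolve("#Person", loc, xml_base="http://example.org/onto") for loc in locations])
```

Note that neither an ontology URI nor a version URI appears anywhere in `resolve`, which matches the point that they play no part in resolving relative URIs.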
>>>
>>> If you feel this is necessary, we can make these issues clear in the
>>> documents describing the respective syntaxes (but not in the
>>> structural specification document, which doesn't know anything about
>>> xml:base at all).
>>>
>>> Regards,
>>>
>>> 	Boris
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: public-owl-wg-request@w3.org [mailto:public-owl-wg-request@w3.org]
>>>> On Behalf Of Rinke Hoekstra
>>>> Sent: 19 May 2008 14:20
>>>> To: OWL Working Group WG
>>>> Subject: Ontology locations: OntologyURI vs. xml:base and
>>>> namespaces (ISSUE-21)
>>>>
>>>>
>>>> Hi All,
>>>>
>>>> (long email, sorry)
>>>>
>>>> Although the current publishing guidelines and imports sections are
>>>> masterpieces of clarity (thanks Boris), I feel that the consequences
>>>> of imports-by-location resolution to ISSUE-21 are still not entirely
>>>> covered.
>>>>
>>>> If I understand correctly, the use of the OntologyURI in Boris' and
>>>> Peter's proposal allows us to keep track of where our axioms and
>>>> objects come from, or rather, where they 'belong'. It is, in fact,
>>>> interpreted as a URL. The VersionURI is an additional construct that
>>>> can be used to add an integrity check when the ontology specified by
>>>> the OntologyURI is not physically located at that URI. Nonetheless,
>>>> when an ontology is imported from the VersionURI, it is still to be
>>>> interpreted *as if* it came from the OntologyURI.
>>>>
>>>> However, XML has a built-in way of managing the 'what came from
>>>> where?' question: through the xml:base attribute. This is not
>>>> precisely what RFC2396 [1] says, but it is a very common
>>>> interpretation of the value of the xml:base defined on the root
>>>> element of an XML file, cf. [2] which states that "the base URI is
>>>> the URI used to retrieve the document entity or external entity".
>>>> This is what OntologyURI does in import statements.
>>>>
>>>> The way in which current tools (TopBraid/Jena and Protege4/OWL API)
>>>> deal with imports and the like is primarily through this xml:base
>>>> attribute. In fact, the RDF/XML serialisation of both leaves the
>>>> rdf:about attribute on the owl:Ontology element empty. Also, both
>>>> give a warning and do some repair when the OntologyURI is not the same
>>>> as the xml:base (or empty). Additionally, both tools specify the
>>>> default namespace as equal to the xml:base. In conclusion: the current
>>>> state is that ontologyuri=xml:base=namespace. I think this is quite
>>>> sensible, as it allows all relative URIs of classes etc. to be
>>>> relative to the Ontology URI (see my last point on imports at the end
>>>> of this message).
>>>>
>>>> This use of xml:base interferes with the proposed 'rules' in several
>>>> ways. The rules are publishing guidelines, i.e. they are prescriptions
>>>> of where an ontology creator SHOULD publish his/her ontology, rather
>>>> than a specification for retrieving the proper ontology given some
>>>> OntologyURI, VersionURI or both. But I don't think the two
>>>> perspectives can easily be disentangled.
>>>>
>>>> Rule 1:
>>>> If O does not contain an ontology URI (and, consequently, without a
>>>> version URI as well), then O can be physically located anywhere.
>>>>
>>>> There are two scenarios (in the RDF/XML case):
>>>>
>>>> 1) An empty rdf:about on the owl:Ontology element: the id/URI of the
>>>> owl:Ontology element will be interpreted as being the same as the
>>>> value of the xml:base attribute of the RDF/XML file. If the xml:base
>>>> is not specified or empty (?), the OntologyURI will be the same as
>>>> the physical location of the file (either online or on a local
>>>> hard disk), because the xml:base is considered to be the same as that
>>>> physical location.
>>>> 2) No rdf:about on the owl:Ontology element: the owl:Ontology is
>>>> nameless. Currently TopBraid will add a new owl:Ontology element with
>>>> an empty rdf:about and then complain that the file contains two
>>>> owl:Ontology elements. Protege4 currently does the proper thing and
>>>> interprets the owl:Ontology to be anonymous, i.e. its URI will be
>>>> inferred to be something like 'xml:base'#'generatedID'.
>>>>
>>>> This touches on ISSUE-15, which was resolved: ontologies can in fact
>>>> be without name.
>>>>
>>>> Rule 2:
>>>> If O contains an ontology URI ou but no version URI, then O should be
>>>> physically located at the location ou.
>>>>
>>>> This serves exactly the same function as the common use of the
>>>> xml:base attribute.
>>>>
>>>> Rule 3:
>>>> If O contains an ontology URI ou and a version URI vu, then O should
>>>> be physically located at the location ou or vu.
>>>>
>>>> In this case, the combination of ou and vu says 'yes this is the
>>>> ontology ou, but I may also be found at vu'. This evidently adds
>>>> functionality over the use of just xml:base and allows an ontology
>>>> publisher to provide extra information on the whereabouts of the
>>>> ontology on the web for versioning purposes. Nonetheless, when
>>>> retrieving the ontology from vu through an imports statement, it is
>>>> interpreted as coming from ou. This complies with the principle
>>>> ou=xml:base=namespace.
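
Read purely as publishing guidelines, the three cases quoted above can be written down as a small check. This is a sketch of the rules as they are worded in this message, not text from any specification; the function names and the example.org URIs are made up.

```python
def permitted_locations(ontology_uri=None, version_uri=None):
    """Return the set of locations where O should be published.

    None means "anywhere" (no ontology URI, hence no version URI).
    """
    if ontology_uri is None:
        if version_uri is not None:
            # The first rule implies a version URI needs an ontology URI.
            raise ValueError("a version URI requires an ontology URI")
        return None
    if version_uri is None:
        return {ontology_uri}
    return {ontology_uri, version_uri}


def is_well_published(location, ontology_uri=None, version_uri=None):
    """Check a physical location against the guidelines above."""
    allowed = permitted_locations(ontology_uri, version_uri)
    return allowed is None or location in allowed


OU = "http://example.org/onto"
VU = "http://example.org/onto/1.0"

print(is_well_published("http://elsewhere.example.net/x.owl"))
print(is_well_published(VU, ontology_uri=OU))
print(is_well_published(VU, ontology_uri=OU, version_uri=VU))
```

Nothing in this check consults xml:base; whether it should is exactly the question raised in this message.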
>>>>
>>>> RDF/XML, Turtle and XML each support (some form of specifying) an
>>>> xml:base; the only syntax that does not support it is the functional
>>>> style syntax.
>>>>
>>>> I might be mistaken, but if imports are by location, then OntologyURI
>>>> has no real value over just using xml:base, apart from the fact that
>>>> the functional style syntax does not support xml:base.
>>>>
>>>> As I see it, there are three ways out of it, given that we do have
>>>> imports by location:
>>>>
>>>> 1) Drop OntologyURI altogether, and just use xml:base + VersionURI
>>>> 2) Do not drop OntologyURI, but enforce it to be equal to xml:base
>>>> (or empty). Make the appropriate adjustments to the mapping to ensure
>>>> that a missing OntologyURI (i.e. rdf:about) is not translated to a
>>>> bnode.
>>>> 3) Treat OntologyURI similar to VersionURI, and say that the file
>>>> owl:imports points to should have the same base URI as the import
>>>> URI, or should have the same ontology URI as the import URI, or should
>>>> have the same version URI as the import URI.
>>>>
>>>> A combination of 2 and 3 is possible as well.
>>>>
>>>> A last issue that I'm still chewing on is what happens when an
>>>> ontology imports another ontology from the versionURI location. The
>>>> current practice in P4 is imports by name: imported classes are in
>>>> the namespace of the imported ontology, and their names are relative
>>>> to its ontology URI. In other words:
>>>> importsuri=ontologyuri=xml:base=namespace. The imports by location
>>>> principle breaks this chain, as the imports URI no longer has to
>>>> be the same as the ontology URI. But in the case where some specific
>>>> version URI is imported, all local names of imported classes are
>>>> still relative to the ontology URI. This means that someone who
>>>> imports an ontology no longer has any way of knowing the namespace of
>>>> the imported classes. Don't really know whether that's bad... but it's
>>>> good to realise this consequence.
>>>>
>>>> -Rinke
>>>>
>>>> [1] http://www.ietf.org/rfc/rfc2396.txt
>>>> [2] http://www.w3.org/TR/xmlbase/
>>>> -----------------------------------------------
>>>> Drs. Rinke Hoekstra
>>>>
>>>> Email: hoekstra@uva.nl    Skype:  rinkehoekstra
>>>> Phone: +31-20-5253499     Fax:   +31-20-5253495
>>>> Web:   http://www.leibnizcenter.org/users/rinke
>>>>
>>>> Leibniz Center for Law,          Faculty of Law
>>>> University of Amsterdam,            PO Box 1030
>>>> 1000 BA  Amsterdam,             The Netherlands
>>>> -----------------------------------------------
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>


Received on Wednesday, 21 May 2008 14:58:07 UTC