Re: linking data with ontology

On 7/21/05, mmonroy@usbctg.edu.co <mmonroy@usbctg.edu.co> wrote:

> We are studying ontologies and we want to make an application to interact with
> information regarding social problems in Colombia. We are really confused about how
> to separate ontologies from data. Probably we are confused about the definition of
> the ontology, but our main question is: where does the ontology finish and where
> does the data begin?

I am not an ontologist, but I think I'm on safe ground in saying that,
in general, this depends on the kind of data you're working with and
what kind of processing is required. The answer to your question is
probably contained in the answer to: "why do you want to separate
ontologies from data?".

In the context of RDF/OWL, it seems to me that at one end of the
application spectrum you've got what might loosely be called
data-oriented apps, where what's most important is the data model,
exploiting the graph structure and ability to mix data from diverse
sources. What ontology definitions there are won't really go beyond
simple class membership and property domain/range kinds of things.
What processing there is of the data will largely be done
programmatically, maybe app-specific on top of what can be conceived
of as a general purpose database. At the other end of the spectrum
there are the heavily ontological apps, where the (onto)logical
structure is most important, and most of the processing will be done
using domain-independent tools built to inference across the
class/property structure, with all the constraints OWL can provide.
[Another way of looking at what I'm calling a spectrum is how far up
the Semantic Web (SW) layer cake one is going, but I think in the
present context a spectral analogy may work better - turn the cake on
its side ;-)].

Going back to separating ontology and data, for the second kind of app
you probably will want to maintain a clear distinction between
ontology information and instance data, to enable efficient inference.
There's a direct reflection here of the Description Logic distinction
between T-box (terminological definitions) and A-box (assertions). In
practice this would mean not mixing material between the 'boxes': don't
treat constructs from the T-box as constructs from the A-box, e.g.
don't treat classes as instances. Yep, that's OWL DL. When
building ontologies the terminological material will generally be
created in fairly self-contained models probably contained in schema
files, although when imports and inferred statements are taken into
account there may be a lot more there than initially meets the eye.
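
As a rough sketch of that T-box/A-box split (plain Python, no RDF
library; triples are just tuples, the ex: names are made up for
illustration, and the choice of what counts as "terminological" here is
my own simplification rather than anything the OWL specs define):

```python
# Triples as (subject, predicate, object) tuples. All ex: terms are hypothetical.
RDF_TYPE = "rdf:type"

# Predicates and built-in types that mark a statement as schema-level (T-box).
SCHEMA_PREDICATES = {"rdfs:subClassOf", "rdfs:subPropertyOf",
                     "rdfs:domain", "rdfs:range"}
SCHEMA_TYPES = {"owl:Class", "rdfs:Class", "rdf:Property",
                "owl:ObjectProperty", "owl:DatatypeProperty"}

triples = [
    ("ex:SocialProblem", "rdf:type", "owl:Class"),
    ("ex:Displacement", "rdf:type", "owl:Class"),
    ("ex:Displacement", "rdfs:subClassOf", "ex:SocialProblem"),
    ("ex:affectsRegion", "rdf:type", "owl:ObjectProperty"),
    ("ex:affectsRegion", "rdfs:domain", "ex:SocialProblem"),
    ("ex:case42", "rdf:type", "ex:Displacement"),
    ("ex:case42", "ex:affectsRegion", "ex:Bolivar"),
]


def is_terminological(triple):
    """True if the triple defines vocabulary rather than describing an individual."""
    _, predicate, obj = triple
    if predicate in SCHEMA_PREDICATES:
        return True
    return predicate == RDF_TYPE and obj in SCHEMA_TYPES


def split_boxes(all_triples):
    """Partition triples into (tbox, abox) lists, preserving input order."""
    tbox = [t for t in all_triples if is_terminological(t)]
    abox = [t for t in all_triples if not is_terminological(t)]
    return tbox, abox


tbox, abox = split_boxes(triples)
```

Here the five class/property definitions end up in tbox and the two
statements about ex:case42 end up in abox, which is roughly the
"schema file" versus "instance data" division.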

For the first kind of app, which is more concerned with instance data
and has regular code looking after the logic, general-purpose
ontology-based inference isn't needed, so you can get by without the
same constraints. In those circumstances it is reasonable to treat
classes as instances and so on. So you're looking at OWL Full/RDF Schema.
There's still some utility in keeping the representations of the term
and instance info separate, e.g. having separate RDF/XML files for RDF
Schema and raw data, but there isn't the same separation at the
logical level.
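
To make "treating classes as instances" concrete, here is a minimal
sketch that spots terms playing both roles in a set of triples. Again
this is plain Python over tuples with invented ex: names, and the
detection rule is a simplification of mine, not a real OWL species
checker:

```python
RDF_TYPE = "rdf:type"
BUILTIN_CLASS_TYPES = {"owl:Class", "rdfs:Class"}

# Hypothetical data: ex:Displacement is a class, but is also typed as a
# member of ex:PriorityTopic, i.e. it is used as an instance too.
triples = [
    ("ex:Displacement", "rdf:type", "owl:Class"),
    ("ex:case42", "rdf:type", "ex:Displacement"),
    ("ex:Displacement", "rdf:type", "ex:PriorityTopic"),
]


def terms_used_as_classes(all_triples):
    """Collect every term that is declared or used as a class."""
    classes = set()
    for subject, predicate, obj in all_triples:
        if predicate == "rdfs:subClassOf":
            classes.update((subject, obj))
        elif predicate == RDF_TYPE and obj in BUILTIN_CLASS_TYPES:
            classes.add(subject)   # explicit class declaration
        elif predicate == RDF_TYPE:
            classes.add(obj)       # anything with members is a class
    return classes


def classes_used_as_instances(all_triples):
    """Classes that are themselves typed as members of some other class."""
    classes = terms_used_as_classes(all_triples)
    return sorted({
        subject
        for subject, predicate, obj in all_triples
        if predicate == RDF_TYPE
        and obj not in BUILTIN_CLASS_TYPES
        and subject in classes
    })


mixed = classes_used_as_instances(triples)
```

A data-oriented app can live with ex:Displacement showing up in mixed;
an app that wants to stay within the OWL DL constraints described above
would want that list to be empty.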

But in either case, all the statements, whether ontological or
instance data-oriented, can (and often do) appear side by side in the
same store or serializations. The distinction between what is ontology
and what is instance data really appears when you do things with the
statements, i.e. run an inference engine on them, or make a query.
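
A small sketch of that last point: the store below mixes schema and
instance statements freely, and the difference only shows up when a
query uses the rdfs:subClassOf statements to reach the instances. This
is hand-rolled Python for illustration (the ex: names and the two helper
functions are hypothetical), standing in for what a real inference
engine or query processor would do:

```python
# One flat store holding both ontology and instance statements.
store = [
    ("ex:Displacement", "rdfs:subClassOf", "ex:SocialProblem"),
    ("ex:case42", "rdf:type", "ex:Displacement"),
    ("ex:Unemployment", "rdfs:subClassOf", "ex:SocialProblem"),
    ("ex:case7", "rdf:type", "ex:Unemployment"),
    ("ex:case42", "ex:affectsRegion", "ex:Bolivar"),
]


def superclasses(cls, triples):
    """All classes reachable from cls via rdfs:subClassOf (transitively)."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if (predicate == "rdfs:subClassOf"
                    and subject == current and obj not in found):
                found.add(obj)
                stack.append(obj)
    return found


def instances_of(cls, triples):
    """Subjects typed as cls directly or through any subclass of cls."""
    result = set()
    for subject, predicate, obj in triples:
        if predicate != "rdf:type":
            continue
        if obj == cls or cls in superclasses(obj, triples):
            result.add(subject)
    return result


problems = instances_of("ex:SocialProblem", store)
displaced = instances_of("ex:Displacement", store)
```

Neither ex:case42 nor ex:case7 is stated to be an ex:SocialProblem
anywhere in the store; they only turn up in problems because the query
treats the subClassOf statements as ontology and the rdf:type
statements as data.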

This kind of data- vs. ontology-orientation is a very rough division,
and like I said IANAO so I could well be wrong on the details (I've
ignored ontology annotations for a start). But maybe it'll help
determine what kind of ontology/data separation is needed.

Cheers,
Danny.

-- 

http://dannyayers.com

Received on Friday, 22 July 2005 10:28:53 UTC