RE: Ontology browsing from Butler, Mark on 2003-10-27 (www-rdf-dspace@w3.org from October 2003)

From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
Date: Mon, 27 Oct 2003 12:10:42 -0000
To: "'Karun Bakshi'" <karunb@ai.mit.edu>, SIMILE public list <www-rdf-dspace@w3.org>
Message-ID: <E864E95CB35C1C46B72FEA0626A2E808206210@0-mail-br1.hpl.hp.com>
Hi Karun

> Also, speaking of versioning, I have been wanting to develop an ontology
> server that you can ask for an ontology with a particular version, and
> it ships you the XML representation of the ontology.  I'm not sure how
> this fits into your plans, but just thought I'd throw it out to see if
> you have thought of it or if there exists a standard protocol for it.  I
> have not mentioned it to Prof. Karger yet, but I think we will/should
> need it soon.

I think there are lots of useful enhancements that ontology servers can
offer. John Gilbert, a summer intern who was working with me, was looking at
this. I'll try to outline our thinking here: 

1. When communities develop new schemas, it would be preferable for them to
adopt schema re-use rather than schema creation where possible as this
simplifies interoperability. When a community re-uses a schema in this way,
the result is commonly known as an application profile.

(An aside: for relevant background, see the MEG and CORES work:
http://www.ukoln.ac.uk/metadata/education/regproj/
http://www.cores-eu.net
http://www.ukoln.ac.uk/metadata/education/regproj/scart/
http://www.ariadne.ac.uk/issue25/app-profiles/
Although the MEG project is finished, there is a Sourceforge project
involving the same developers - it hasn't released any files yet, but they
do have code in the CVS
http://sourceforge.net/projects/schemas)

2. In order to facilitate schema re-use, schema authors need two types of
information:
i) English descriptions that ground the formal terms used in the schema.
Such descriptions are not expressible in a machine readable format.
ii) relationships between the formal terms, e.g. property and class
definitions, relationship hierarchies etc.

3. Therefore we hypothesise that schemas should contain human readable
descriptions as well as machine readable information wherever possible.
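
To make the distinction in points 2 and 3 concrete, here is a minimal sketch of one schema term carrying both kinds of information. The RDFS property URIs are the standard ones; the http://example.org/schema# namespace, the "author" term and the split_term helper are hypothetical, made up purely for illustration.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/schema#"  # hypothetical namespace, for illustration only

# One schema term described as (subject, predicate, object) triples.
TRIPLES = [
    # Human readable grounding: free text that only a person can interpret.
    (EX + "author", RDFS + "label", "Author"),
    (EX + "author", RDFS + "comment",
     "The person primarily responsible for writing the work."),
    # Machine readable relationships between formal terms.
    (EX + "author", RDFS + "subPropertyOf", EX + "contributor"),
    (EX + "author", RDFS + "domain", EX + "Document"),
]

# Properties whose values are prose rather than references to other terms.
HUMAN_READABLE = {RDFS + "label", RDFS + "comment"}


def split_term(term, triples):
    """Return (human, machine) lists of (predicate, object) pairs for a term."""
    human, machine = [], []
    for subject, predicate, obj in triples:
        if subject != term:
            continue
        if predicate in HUMAN_READABLE:
            human.append((predicate, obj))
        else:
            machine.append((predicate, obj))
    return human, machine


human, machine = split_term(EX + "author", TRIPLES)
```

Here human holds the label and comment, while machine holds the subPropertyOf and domain statements that a tool can follow as links.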

4. Re-use requires effective searching of the schemas, but different search
techniques are appropriate for the human and machine readable sections. John
G proposed adding additional metadata to the schema (metametametadata) about
grounding. However, I'm conscious that adding metadata has a cost, and getting
schema authors to add information so elements of their schemas are re-used
may be difficult - I'd prefer to see them concentrate on getting their
schemas right. Therefore we propose that faceted browsing and hyperlinking of
related terms are useful search mechanisms for exploring the machine
readable sections, whereas free text search is appropriate for the human
readable sections. It is fairly easy to construct a hybrid architecture that
supports this by combining an RDF store like Jena and a text indexing system
like Lucene - see slide 7 of
http://lists.w3.org/Archives/Public/www-rdf-dspace/2003Sep/att-0055/sw_tool_investigation.pdf
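
As a rough illustration of the hybrid idea in point 4, the sketch below keeps the two search styles separate and then intersects their results. It is not the Jena/Lucene demo: a plain Python list stands in for the RDF store and a small inverted index stands in for the text indexer. The example.org namespace, the sample terms and all function names are assumptions made up for this sketch.

```python
import re
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/schema#"  # hypothetical namespace

# Stand-in for the RDF store: a list of (subject, predicate, object) triples.
TRIPLES = [
    (EX + "author", RDF + "type", RDF + "Property"),
    (EX + "author", RDFS + "comment", "The person who wrote the document."),
    (EX + "author", RDFS + "domain", EX + "Document"),
    (EX + "publisher", RDF + "type", RDF + "Property"),
    (EX + "publisher", RDFS + "comment",
     "The organisation that made the document available."),
    (EX + "publisher", RDFS + "domain", EX + "Document"),
    (EX + "Document", RDF + "type", RDFS + "Class"),
    (EX + "Document", RDFS + "comment", "A written work."),
]

TEXT_PROPS = {RDFS + "label", RDFS + "comment"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_text_index(triples):
    """Stand-in for the text indexer: map each token to the terms it describes."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in TEXT_PROPS:
            for token in tokenize(obj):
                index[token].add(subject)
    return index


def text_search(index, query):
    """Free text search over the human readable part (all tokens must match)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


def facet(triples, prop):
    """Faceted view of the machine readable part: value -> set of terms."""
    groups = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == prop:
            groups[obj].add(subject)
    return groups


def hybrid_search(triples, index, query, prop, value):
    """Terms that match the free text query AND sit under the chosen facet value."""
    return text_search(index, query) & facet(triples, prop).get(value, set())
```

For example, hybrid_search(TRIPLES, build_text_index(TRIPLES), "organisation", RDFS + "domain", EX + "Document") narrows the terms whose domain is Document down to the one whose description mentions an organisation.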

5. John did some work on a demonstration of this hybrid architecture - for
screen shots see slides 8 and 9 in the above presentation.

I'd like to make this demo available, and it's on the to-do list. However,
due to the way the OntologyDocumentManager works in Jena 2, it's quite hard
to make it work in a portable way on J2EE application servers. After some
discussion with Ian Dickinson, who is responsible for this code in Jena, I've
concluded the best way to resolve this is to write a custom OntologyManager,
but I haven't had time to do this yet.

Dr Mark H. Butler
Research Scientist                HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
Received on Monday, 27 October 2003 07:21:22 UTC