Use case: Arkive

As I mentioned in the telecon, this use case is not my own
but that of another project team in HP Labs.
Hence I will be making mistakes in places.
I would value feedback on the appropriateness of this
use case.

The Arkive project is creating a multimedia database consisting of
a record for each endangered species.

The database aims at completeness, with enough appropriate information
for each species.

The database is accessed through a web site and targeted at users at
all levels of expertise, ranging from school children through to
domain experts.

The key functions of ontological knowledge are:
+ to allow consistent organization of each species record
+ to provide a means for ensuring that each species record is
  sufficiently detailed, and includes examples of each important
  behaviour.
+ to help with query across the database

Other functions where ontological knowledge may be useful include
organising annotations and the provenance of knowledge.

We note that:
- despite the relevant science having had about two centuries of
  debate, there is no universal agreement about appropriate
  ontologies for full and adequate species descriptions.
- the number of species suggests that globally a federated solution
  is needed. The British participants have funding to make records
  of all British species, and the top N globally endangered species.
  The long-term plan would be to have people world-wide contributing
  records for their local species. This is likely to exacerbate the 
  lack of agreement about the underlying ontologies.


  TASK:
Organising and commissioning multimedia records of
endangered species.

  EXAMPLE DOMAIN:
multimedia records of endangered species.

  TYPICAL USERS:
1: scientist making a specific record.
2: manager commissioning new records.
3: scientist querying DB through web-site
4: school child querying DB through web-site

  ONTOLOGY SAMPLES:
I will need to get back to my informant for better data.
I rapidly get out of my depth biologically at this point
in the presentations I have seen.

Currently they use about ten master record-templates for the
different top-level categories. 
For example, there is typically no "locomotion" field for
plants, but it is of interest for animals.
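
As a rough illustration of how such templates could also be used to
check that a record is sufficiently detailed, here is a minimal sketch.
It is my own guess, not the project's data model: the category names,
the field names other than "locomotion", and the function
missing_fields are all invented for the example.

```python
# Hypothetical master record-templates. Only "locomotion" comes from the
# discussion above; the other field names are invented for illustration.
TEMPLATES = {
    "animal": ["description", "habitat", "locomotion"],
    "plant": ["description", "habitat"],
}


def missing_fields(category, record):
    """Return the template fields that are absent or empty in a record."""
    return [field for field in TEMPLATES[category] if not record.get(field)]


record = {"description": "text", "habitat": "text"}
print(missing_fields("plant", record))   # []
print(missing_fields("animal", record))  # ['locomotion']
```

A manager commissioning new records could use a check of this kind to
see which fields still need material.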

These top-level categories are necessarily insufficient in
that they cover (only) the general types of behaviour.
Any unique or rare behaviour of a species is:
 + important to include in the record
 + not in the top-level category
Such behaviours are also subject to scientific debate.
(A concrete example concerned birds that pick up
poisonous insects in their beaks and rub them against their
feathers. It is contentious whether they do this:
+ to get high
+ to kill off parasites in their feathers
The name you use for the behaviour depends on your judgement
on its motivation; which may well depend on your political persuasion.)

There are also some behaviours which have multiple
synonymous names.
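
One simple way to handle synonymous names is to map every label to a
single preferred label before storing or querying. The sketch below
assumes exactly that; the Thesaurus class and the example labels are
invented for illustration and are not taken from the project.

```python
class Thesaurus:
    """Maps synonymous labels to one preferred label (case-insensitive)."""

    def __init__(self):
        self._preferred = {}

    def add(self, preferred, *synonyms):
        """Register a preferred label together with its synonyms."""
        for label in (preferred, *synonyms):
            self._preferred[label.lower()] = preferred

    def normalise(self, label):
        """Return the preferred label, or the label itself if unknown."""
        return self._preferred.get(label.lower(), label)


# Illustrative labels only.
thesaurus = Thesaurus()
thesaurus.add("nocturnal", "night-active")
print(thesaurus.normalise("Night-Active"))  # nocturnal
```

This only covers true synonyms. It does not help where the choice of
name encodes a disputed interpretation of the behaviour.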

Default inheritance is important. Consider the well-known penguins issue:
   living things don't fly
   birds         do    fly
   penguins      don't fly

This can be addressed either when a record is first created, by
filling in default values that can be changed if necessary, or more
dynamically.
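
Both options can be sketched with a single-inheritance hierarchy in
which the nearest category that states a value wins. This is only a
minimal sketch under that assumption; the Category class, the "flies"
property name and the new_record helper are invented for illustration.

```python
class Category:
    """A node in a single-inheritance hierarchy carrying default values."""

    def __init__(self, name, parent=None, defaults=None):
        self.name = name
        self.parent = parent
        self.defaults = dict(defaults or {})

    def lookup(self, prop):
        """Dynamic option: walk up the hierarchy until a default is found."""
        node = self
        while node is not None:
            if prop in node.defaults:
                return node.defaults[prop]
            node = node.parent
        return None

    def all_defaults(self):
        """Merge defaults from the root down; nearer categories override."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        merged = {}
        for node in reversed(chain):
            merged.update(node.defaults)
        return merged


def new_record(category, **overrides):
    """Static option: copy the defaults at creation time, then edit."""
    record = category.all_defaults()
    record.update(overrides)
    return record


living_thing = Category("living thing", defaults={"flies": False})
bird = Category("bird", parent=living_thing, defaults={"flies": True})
penguin = Category("penguin", parent=bird, defaults={"flies": False})
sparrow = Category("sparrow", parent=bird)  # inherits from bird

print(penguin.lookup("flies"))  # False
print(sparrow.lookup("flies"))  # True
print(new_record(sparrow))      # {'flies': True}
```

The static option freezes the defaults into the record, so a later
change to a category does not propagate. The dynamic option always
reflects the current hierarchy.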

It is important to relate the category information back to
multiple (partially inconsistent) taxonomies in the field.


  WEBONT REQUIREMENTS:

Hard to say: there is a range of knowledge base requirements,
and which of them actually belong to the ontological subsystem
is unclear.

- Hierarchical classes with inheritance of properties, 
  default values, etc. Probably single inheritance would 
  suffice.
- Provenance: to distinguish facts that are in the
  specific record from later annotations by experts or
  non-experts, from inherited facts, etc.
- Query support. Queries may be guided by category information,
  and possibly by falsehoods (e.g. "whales are fish" may be a
  useful search aid for small children, who might otherwise
  conclude that there are no whales in the DB).
  Mixed-mode query: both free text and category information.
- Multiple synonymous labels for properties and values.
  Thesaurus support.
- Ability to extend the ontology on the fly, in a distributed
  fashion. (Experts adding a framework to describe the special
  behaviour of their species.)
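
To make the provenance and query requirements a little more concrete,
here is a minimal sketch in which every fact carries a source tag, and
a deliberately false "search aid" fact can guide a query without ever
being displayed as true. The Fact class, the source tag values and the
two functions are my own invention for illustration, not a proposed
design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    prop: str
    value: str
    # Hypothetical provenance tags: "record", "expert-annotation",
    # "non-expert-annotation", "inherited" or "search-aid".
    source: str


FACTS = [
    Fact("whale", "category", "mammal", "record"),
    # Deliberately false: only used to guide queries.
    Fact("whale", "category", "fish", "search-aid"),
]


def find(facts, prop, value, include_search_aids=True):
    """Return the subjects matching prop=value, in first-seen order."""
    hits = []
    for fact in facts:
        if fact.prop != prop or fact.value != value:
            continue
        if fact.source == "search-aid" and not include_search_aids:
            continue
        if fact.subject not in hits:
            hits.append(fact.subject)
    return hits


def display_facts(facts, subject):
    """Facts to show for a species: everything except search aids."""
    return [f for f in facts if f.subject == subject and f.source != "search-aid"]


print(find(FACTS, "category", "fish"))                             # ['whale']
print(find(FACTS, "category", "fish", include_search_aids=False))  # []
print([f.value for f in display_facts(FACTS, "whale")])            # ['mammal']
```

A child searching under "fish" then still reaches the whale record,
while the record itself only shows facts with a genuine source.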







Jeremy Carroll

Received on Friday, 30 November 2001 17:45:26 UTC