- From: Guus Schreiber <schreiber@swi.psy.uva.nl>
- Date: Thu, 13 Dec 2001 11:09:10 +0100
- To: WebOnt WG <www-webont-wg@w3.org>
Dear WG members, Encl. is a first sketch of the collection management area. Due to lack of time I have not been able to circulate this full text among the other area people yet (Mike, Stephen and Nick: my apologies). Guus -------------------------------------------------------------- USE-CASE AREA: Collection Management Version: 13 december, 2001 General structure of the use-case area description - definition/scope of the area: tasks, typical domains, users - links to other areas - resulting list of language requirements arising from use cases - 3-4 detailed use-case descriptions --------------------------- SCOPE AND DEFINITION --------------------------- Characteristics: * Large data/text/image/multimedia/website sets with a common theme/context/focus * fixed set of items in archive/collection * can be very large set => scalability issues * typically domain specific, therefore linked to (traditional) work on domain standards * focus is on metadata => link to traditional metadata Collection-management subtasks: * item indexing/annotation/classification * collection updates * collection search * often involves default reasoning --------------------------- LINKS TO OTHER AREAS/ISSUES/TASKS --------------------------- 1. Virtual catalogs Examples: - virtual museum (several projects) - product search/comparison sites (e.g., Lynn Stein's book identification, Mike Dean's hotels) . There is a clear link jere to the "interoperability" area. Virtual catalogs typically requires ontology-mapping stuff. Also, it makes the collection management task different as less assumptions can be made about the collection (e.g., its size). 2. Service catalogs These are mentioned in a number of use cases. With respect to the declarative aspects of service description and search, there is a clear link between "web services" and this area. <to be worked out> 3. Presentation generation Semantically annotated catalogs are an ideal substrate for (context-specific) generation of presentations c.q. web pages. Example: dynamic configuration of a web page for browsers of an art catalog, showing related texts and images. Example: - work Lynda Hardman (CWI, Amsterdam) - open hypermedia?!: generation of links based on ontologies (Nick Gibbins, Southampton) 4. Conceptual search In conceptual search we would like to view the whole web as one indexed catalog. This seems to be a bridge too far at the moment, given the problems we still have a domain-specific catalogs. A realistic scenario for the short-term conceptual search is a two-step process: 1. use an Open-Directory like mechanism to constrain your search to an area which hopefully provides some archives/catalogs 2. use the semantic search engines of the catalogs to find an answer to your query. 5. Content standards Due to the domain specificity of catalogs, many of them require a clear link with domain standards/vocabularies (existing or under development). These domain standards were typically developed to support manual indexing. Also, more general resources such as WordNet are being used. --------------------------- RESULTING LANGUAGE REQUIREMENTS --------------------------- Some preliminary examples (numbers refer to use cases below): - default knowledge / default reasoning (1, 2, 4) - constraints (2, 3, 4) - consistency rules (3) - some notion of aggregation (3, 4) - statements at class and at instance level (2, 3, 4) - (WebOnt representation of thesauri / domain standards: AAT, TGN, WordNet) (1, 4) --------------------------- USE-CASE DESCRIPTIONS --------------------------- 1. Arkive: catalog of endangered species descriptions Contributor: Jeremy Carrol, HP <cut&paste of Jeremy's email> The arkive project is creating a multimedial database consisiting of a record for each endangered species. The database aims at completeness, with enough appropriate information for each species. The database is accessed through a web site and targetted at users at all levels of expertise: ranging from school children through to domain expert. The key functions of ontological knowledge are: + to allow consistent organization of each species record + to provide a means for ensuring that each species record is sufficiently detailed, and includes examples of each important behaviour. + to help with query across the database Other functions where ontological knowledge maybe useful include organising annotations and providence of knowledge. We note that: - despite the relevant science having had about two centuries of debate there is no universal agreement about appropriate ontologies for full and adequate species descriptions. - the number of species suggests that globally a federated solution is needed. The British participants have funding to make records of all British species, and the top N globally endangered species. The long-term plan would be to have people world-wide contributing records for their local species. This is likely to exacerbate the lack of agreement about the underlying ontologies. TASK: Organising, and commisioning multimedia records of endangered species. EXAMPLE DOMAIN: multimedia records of endangered species. TYPICAL USERS: 1: scientist making a specific record. 2: manager commissioning new records. 3: scientist querying DB through web-site 4: school child querying DB through web-site ONTOLOGY SAMPLES: I will need to get back to my informant for better data. I rapidly get out of my depth biologically in this point in the presentations I have seen. Currently they use about ten master record-templates for the different top-level categories. For example, there is typically no "locomotion" field for plants, but it is of interest for animals. These top-level categories are necessarily insufficient in that they cover (only) the general types of behaviour. Any unique or rare behaviour of a species is: + important to include in the record + not in the top-level category also such behaviours are subject to scientific debate. (A concrete example was to do with birds that pick up poisionous insects in their beaks and rub them against their feathers. It is contentious whether they do this: + to get high + to kill off parasites in their feathers The name you use for the behaviour depends on your judgement on its motivation; which may well depend on your political persuasion.) There are also some behaviours whcih have multiple different names that are synonymous. Default inheritance is important. The well known penguins issue: living things don't fly birds do fly penguins don't fly This can be addressed when first creating a record, when default values can be filled in, to be changed if necessary, or more dynamically. It is important to relate the category information back to multiple (partially inconsistent) taxonomies in the field. WEBONT REQUIREMENTS Hard to say - there are a range of knowledge base requirements, which ones actually belong to the ontological subsystem is problematic. - Hierarchical classes with inheritance of properties, default values, etc. Probably single inheritance would suffice. - Providence: to distinguish facts that are in the specific record, from later annotations by experts or non-experts, from inherited facts etc. - Query support. Query may be guided by category information, and possibly by falsehoods (e.g. "whales are fish" may be useful to help small children search, who might otherwise conclude there are no whales in the DB) Mixed mode query - both free text and category information. - Multiple synonymous labels for properties and values. Theasural support. - Ability to extend ontology on the fly, in a distributed fashion. (Experts adding framework to describe the special behaviour of their species). 2. EDS web page landfill Contribitor: Miker Smith, EDS TASK: Organizing a massive web page land-fill into hierarchical categories in support of corporate communication and corporate memory. EXAMPLE DOMAIN: External press releases, product offerings and case studies, corporate procedures, internal product briefings and comparisons, white papers, and offering process descriptions. TYPICAL USER: Salesperson looking for sales collateral relevant to a client's expressed interest. Technical person looking for pockets of specific technical expertise and detailed past experience. ONTOLOGY SAMPLES: Document type hierarchy: Press release <- press release covering financial details <- press release detailing SEC filings .... Solution descriptions that include part-whole relations and constraints covering software, hardware, and communication compatibility. WEBONT REQUIREMENTS: Defaults and constraints. Language neutral representation. Instances distinct from classes. We need a clean interface between Web Ontologies and more mainstream business and manufacturing XML standards. 3. Aerospace Engineering Data Modelling Contributor: Stephen Buswell, Stilo <cut&paste of Stephen's email> We are building an ontology capable of describing the objects involved in an aerospace engineering development project, and associating them with the reference documentation which defines the objects. The objects may be related to each other in several ways: parent-child relationships - a wing assembly has as child elements a wing, a wing spar, an engine ... [NOTE: I have interpreted this as a form of aggregation. Guus] constraints - the length property of the wing spar is constrained by the length property of the wing illustrations - an object in the ontology may be portrayed by a diagram in a source document All aircraft are individually different, so we need to be able to make statements about instances of classes as well as about the classes themselves. We also have consistency rules which enable us to say things like "the existence of a member of class A entails the existence of a member of class B" We are experimenting with semantic web technologies (RDF, DAML and extensions) to model the data in order to support visualisation and navigation based on object relationships (rather than physical or documentation structure) and support reliable transformations between different representations of the data. We are particularly interested in the languages themselves (for modelling and expressing logical relationships) and in graphical and other techniques for visualising and navigating the ontologies. 4. Art-image collections Contributor: Guus Schreiber, Ibrow / University of Amsterdam TASK: searching a digital image collection EXAMPLE DOMAIN: museum collection of images of antique furniture TYPICAL USER: lay person with some basic knowledge of the domain, looking for some piece of antique ONTOLOGY SAMPLES: The basis of our ontology is formed by the Art and Architecture Thesaurus (AAT) [1] constructed by the Getty Foundation, which provides a highly structured hierarchy of some 120.000 terms to describe art objects (art categories, materials, styles, color, ....). We also have a description template for antique furniture based on the VRA 3.0 standard [2], which is basically a refinement of Dublin Core for art-image annotation Let's for the moment assume we can represent AAT and VRA in WebOnt. For effective search support we need to add domain knowledge to this ontology. This knowledge typically takes the form of inter-slot constraints within the image description template. One example: style/period = "Late Georgian" => culture = "British" AND date.created = 1760, 1811 [Style/period, culture and date.created are all VRA data elements defined as slots for our art-object description template.] We could not define this constraint in RDFS and (a little to our surprise) we saw no way of expressing it in DAML+OIL either (we could have misread the spec, we would be glad to be proven wrong). This type of semantical information is essential to show added value of semantic annotations. WEBONT REQUIREMENTS possibility to define inter-slot constraints of a class [1] The Art and Architecture Thesaurus http://shiva.pub.getty.edu. [2] Visual Resources Association~Standards Committee. VRA core categories, version 3.0. Technical report, Visual Resources Association, July 2000. http://www.gsd.harvard.edu/~staffaw3/vra/vracore3.htm. -- A. Th. Schreiber, SWI, University of Amsterdam, Roetersstraat 15 NL-1018 WB Amsterdam, The Netherlands, Tel: +31 20 525 6793 Fax: +31 20 525 6896; E-mail: schreiber@swi.psy.uva.nl WWW: http://www.swi.psy.uva.nl/usr/Schreiber/home.html
Received on Thursday, 13 December 2001 05:11:26 UTC