- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Tue, 16 Sep 2003 18:00:11 +0100
- To: www-rdf-dspace@w3.org
Hi team,

First, thanks Kevin for the suggestions. Yes, more detail would help. I've been trying to brainstorm around these concepts; feedback or other suggestions are welcome.

1: MAP VOCABULARIES USING INFERENCE AT QUERY TIME

Here we use schemas to map between different properties and classes in different vocabularies at query time. This is the proposal that SIMILE has been considering. One of the points of this email is to enumerate some other approaches.

2: MAP VOCABULARIES USING INFERENCE IN ADVANCE

This is similar to 1, only we save the inference information back to the repository to reduce query time. Whether this is practical depends on the time / memory tradeoff, i.e. whether the gain in query time is worth the memory required to cache the inference information.

3: MAPPING VOCABULARIES TO FRBR GROUPS

Based on the FRBR concepts, one way to think of FRBR is as a highly simplified classification system with just 3 types of properties, so we can map other vocabularies on to these three types, e.g.

group 1 = dc:title, dc:identifier
group 2 = dc:publisher, dc:creator, dc:contributor
group 3 = dc:description, dc:relation, dc:subject, dc:coverage

Note we don't have to map all the terms in a vocabulary; some terms (for example technical metadata) won't map at all.

How do we use this, and what advantage does it provide over free text search? Well, we can:

- use authority files to automatically identify group 2 items or clean up manually identified group 2 items
- use thesauri to identify group 3 items and map between synonyms
- I suspect it's harder to come up with automatic ways of dealing with group 1 items, as there is going to be very little repetition in comparison to group 2 or 3.

If we are dealing with digital objects then, as David Karger has noted, we could use hashing to come up with a unique identifier for an object and then merge the various other group 1 descriptors used for it to create a synonym description.
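To make the group mapping and the hash-based merging of group 1 descriptors a little more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the record shape, the function names, the choice of SHA-256 and the sample records are all invented for this example and are not taken from SIMILE or DSpace.

```python
import hashlib
from collections import defaultdict

# Property-to-group table copied from the mapping in approach 3.
# Properties that do not map (e.g. technical metadata) are simply absent.
FRBR_GROUP = {
    "dc:title": 1, "dc:identifier": 1,
    "dc:publisher": 2, "dc:creator": 2, "dc:contributor": 2,
    "dc:description": 3, "dc:relation": 3, "dc:subject": 3, "dc:coverage": 3,
}

def content_key(data: bytes) -> str:
    """Hash the object's bytes to get an identifier (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()

def merge_group1(records):
    """Collect group 1 values per object.

    records: iterable of (object_bytes, {property: value}) pairs
             (a hypothetical shape chosen for this sketch).
    Returns {content hash: set of group 1 values seen for that object}.
    """
    synonyms = defaultdict(set)
    for data, metadata in records:
        key = content_key(data)
        for prop, value in metadata.items():
            if FRBR_GROUP.get(prop) == 1:
                synonyms[key].add(value)
    return dict(synonyms)

# Invented sample records: the first two describe the same bytes.
records = [
    (b"same bytes", {"dc:title": "Frank Lloyd Wright: A Biography",
                     "dc:creator": "A. Author"}),
    (b"same bytes", {"dc:title": "F. L. Wright, a biography",
                     "dc:identifier": "example-id-1"}),
    (b"other bytes", {"dc:title": "Another object"}),
]
merged = merge_group1(records)
```

Note that the sketch only works because the `FRBR_GROUP` table already says which properties carry group 1 information, and that it only merges byte-identical objects; two different encodings of the same work would hash differently.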
However, this merging requires that we either determine which properties supply group 1 information in a given vocabulary, or require the user to identify it manually.

4: EXTRACTING AND TAGGING FRBR GROUPS IN METADATA

Approach 3 assumes that each property maps onto one and only one group. This is probably not true. For example, consider a title such as "Frank Lloyd Wright: A Biography". This is a reference to a group 1 surrogate, but it contains a reference to a group 2 surrogate. Therefore an alternative is to think of the groups as data types rather than properties, e.g.

vra.creator.personal name = <group2>Frank Lloyd Wright</group2>
dc.title = <group1>A biography of <group2>Frank Lloyd Wright</group2></group1>
ims.general.title = <group2>Frank Lloyd Wright</group2>

5: ADDING FRBR GROUP INFORMATION

Rather than mapping between vocabularies, we could try to create FRBR information in a new vocabulary when we ingest a record. We could use some algorithms to try to do this automatically, but in a user-supervised way.

6: SYNTHESISE AND ANNOTATE DC INFORMATION

Here, when we ingest records, we use some automatic mapping rules to do a crosswalk that creates DC information from the metadata. The results of this crosswalk are then presented to a user, either one at a time or in a batch view, for inspection. The user is then given the chance to alter this information, after which it is saved to the repository. (However, this breaks due to the examples in the demo script.)

Comments, feedback please?

Dr Mark H. Butler
Research Scientist
HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
Received on Tuesday, 16 September 2003 13:22:31 UTC