- From: Aida Slavic <aida@acorweb.net>
- Date: Wed, 2 Mar 2005 22:14:26 -0000
- To: <public-esw-thes@w3.org>
> Not sure I understand where this conversation is leading. I absolutely
> agree with all Leonard's points about using different metadata elements
> to describe the "subject" and the "type" of a document. Other important

The issue of redundancy of subject metadata is a bit out of place here, although I personally find it very relevant. In my mind this is not a question of standards but rather an issue of specific IR system implementation and of indexing and cataloguing guidelines. Standards should remain generous and open to redundancy to a reasonable extent. In distributed resource discovery (cross-collection/cross-language) redundancy may be seen as a 'safety margin' and may help reduce information loss, especially with respect to the very fuzzy understanding of the semantics of subject elements in DCMES, LOM and other metadata schemes around.

The reason MARC21 has subject data scattered across several fields is bad data modelling and a bad tradition in library formats of not making proper use of subject data.

Functional Requirements for Bibliographic Records (FRBR) makes the following distinction between the different entities of an information resource (document):

WORK (intellectual creation)
  is realised through EXPRESSION (intellectual realization)
  is embodied in MANIFESTATION (physical embodiment)
  is exemplified by ITEM (a single exemplar of a manifestation)

WORK (M-M relationships)
  hasSubject - concept, event, object, place
  hasSubject - person, corporate body
  hasSubject - work, expression, manifestation, item

Worth considering for SKOS is that what is only a form of Work1 may become the subject of study for Work2, in the same way that the author of Work1 may become the subject of Work2.
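To make the last point concrete, here is a minimal sketch in Python of the entity chain and the hasSubject relationship. It is only an illustration: the class and field names (Concept, Work.creator, Work.form, subject_kinds) are assumptions made for this example and are not taken from FRBR or SKOS.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Concept:
    """A non-bibliographic entity: concept, event, object, place, person, ..."""
    label: str
    kind: str  # hypothetical tag, e.g. "form", "person", "place"


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    identifier: str


@dataclass
class Manifestation:
    """The physical embodiment of an expression."""
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """The intellectual realization of a work."""
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The intellectual creation; only a Work carries hasSubject here."""
    title: str
    creator: Optional[Concept] = None  # assumed field, for illustration
    form: Optional[Concept] = None     # assumed field, for illustration
    expressions: List[Expression] = field(default_factory=list)
    subjects: List["Subject"] = field(default_factory=list)


# A subject may be a plain concept or any of the four bibliographic entities.
Subject = Union[Concept, Work, Expression, Manifestation, Item]


def subject_kinds(work: Work) -> List[str]:
    """Return a short label for the kind of each subject of a work."""
    kinds = []
    for subject in work.subjects:
        if isinstance(subject, Concept):
            kinds.append(subject.kind)
        else:
            kinds.append(type(subject).__name__.lower())
    return kinds


# The form and the author of Work1 are reused as subjects of Work2,
# and Work1 itself is a subject of Work2 as well.
author = Concept("Author of Work1", "person")
map_form = Concept("Map", "form")
work1 = Work("Work1", creator=author, form=map_form)
work2 = Work("Work2", subjects=[work1.form, work1.creator, work1])
print(subject_kinds(work2))
```

The same Concept object serves as the form of Work1 and as a subject of Work2, so nothing has to be duplicated when a form or an author becomes an object of study.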
Classification systems usually make a clear distinction between resource content and resource form/type/format, and make it possible to index all four of them separately without any confusion about what is subject and what is form/type/format.

All classification systems created for information organization and retrieval contain two different kinds of vocabulary. In more sophisticated systems these are kept in separate vocabulary facets and can easily be managed and accessed separately in IR:

1) facets of subject fields/discipline vocabulary (often called main tables or main schedules)
2) facets of common concepts such as form, place, time, persons, properties, processes... (often called auxiliary tables or schedules)

In synthetic classifications common concepts represent up to 1/5 of the whole classification vocabulary and can be freely combined with any subject. In the vocabulary of FORM one can find well-organized hierarchies that can be used to denote different internal forms (e.g. teaching aid), external forms (e.g. text), different formats (e.g. digital) and different carriers (e.g. Web page).

The reason for this is that in information organization classifications are used for the practical purpose of collocating information resources according to their content as well as the form in which the content is expressed. So one can choose to collocate all maps and, within this class, distinguish between history, politics, demography etc., or collocate history documents and, within this subject area, organize maps, textbooks, videos etc. It is a matter of indexing policy, and of the specific needs of a resource collection and IR system, whether the attributes of form, place, language etc. will be added to the main subject concept in the process of classification.

Aida
Received on Wednesday, 2 March 2005 22:15:13 UTC