- From: Leonard Will <L.Will@willpowerinfo.co.uk>
- Date: Thu, 4 Dec 2008 15:52:47 +0000
- To: Alistair Miles <alistair.miles@zoo.ox.ac.uk>
- Cc: "Tudhope D S (AT)" <dstudhope@glam.ac.uk>, public-swd-wg@w3.org, "Binding C (AT)" <cbinding@glam.ac.uk>, public-esw-thes@w3.org
- Message-ID: <3RadOBJPz$NJFARu@mail.willpowerinfo.co.uk>
On Thu, 4 Dec 2008 at 11:56:00, Alistair Miles <alistair.miles@zoo.ox.ac.uk> wrote

>Take the AAT sample data, for example.
>
>You quite nicely describe the metamodel underlying the AAT data, where
>the data is structured as Records of one of four types (Concept, Facet,
>GuideTerm, HierarchyName), and where parent/child relationships can
>exist between Records of any type.
>
>You are still left with an open choice about how to define a
>transformation which will map this metamodel onto the SKOS data model.
>
>For example, your transformation could be as follows: for each AAT
>Record, generate an instance of skos:Concept, regardless of the type of
>the Record; for each parent/child relationship between AAT Records,
>generate a triple X skos:broader Y.
>
>Alternatively, your transformation could be as follows: for each AAT
>Record of type Concept, generate an instance of skos:Concept. For each
>AAT Record of type Concept, walk up the parent/child relationships
>until you find another AAT Record of type Concept, and generate a
>triple X skos:broader Y.
>
>These are not complete descriptions of each transformation, but I hope
>they illustrate the point that the AAT metamodel *does not constrain
>you* with respect to how you represent the same data as SKOS. Just
>because there is a "parent/child" relationship between "records" in the
>AAT data, doesn't mean you must generate a triple X skos:broader Y in
>the SKOS representation.

I agree with Alistair's second option for the transformation required.

Some software or communication formats may, for simplicity of manipulation and display, imply broader/narrower relationships between elements which according to the standards for thesaurus construction should not have such a relationship. So long as the nature of the elements can be distinguished by some other means, a more accurate interpretation should be used when importing the thesaurus into software which supports it, such as SKOS.

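The second option quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a complete or agreed transformation: the `Record` class, the function names and the record identifiers are all hypothetical, and the sketch treats every Record that is not of type Concept as something to skip over, which is a simplification for AAT, where some Guide terms label real concepts with a valid place in the BT/NT hierarchy.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for an AAT record; not an actual AAT schema."""
    id: str
    type: str  # "Concept", "Facet", "GuideTerm" or "HierarchyName"
    parents: list = field(default_factory=list)  # ids of parent records


def nearest_concept_ancestors(record_id, records):
    """Walk up parent links, stopping each path at the first Concept found."""
    found = []
    seen = set()
    stack = list(records[record_id].parents)
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        parent = records[pid]
        if parent.type == "Concept":
            found.append(pid)  # stop here; do not climb past a Concept
        else:
            stack.extend(parent.parents)  # skip non-Concept records
    return found


def broader_triples(records):
    """Generate (X, 'skos:broader', Y) only between Concept records."""
    triples = []
    for rec in records.values():
        if rec.type != "Concept":
            continue
        for ancestor in nearest_concept_ancestors(rec.id, records):
            triples.append((rec.id, "skos:broader", ancestor))
    return triples


# Invented sample hierarchy: F1 > H1 > C1 > G1 > C2, and C1 > C3.
records = {
    r.id: r
    for r in [
        Record("F1", "Facet"),
        Record("H1", "HierarchyName", ["F1"]),
        Record("C1", "Concept", ["H1"]),
        Record("G1", "GuideTerm", ["C1"]),
        Record("C2", "Concept", ["G1"]),
        Record("C3", "Concept", ["C1"]),
    ]
}

triples = broader_triples(records)
for triple in triples:
    print(triple)
```

In this invented sample, C2 sits under a guide term but is linked by skos:broader directly to C1, while C1 itself gets no skos:broader triple because only a hierarchy name and a facet lie above it.
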
AAT is a particularly tricky example, because they use the expression "Guide terms" to include both "node labels specifying a characteristic of division" and "labels for concepts which should not be used for indexing", but which nevertheless occupy a valid place in the BT/NT hierarchy. There is also a risk of confusion between "facets" (groups of concepts of the same inherent category) and "facet labels" (a.k.a. "node labels containing the name of a facet") which specify what facet subsequent concepts belong to, in a classified display.

As far as I can see, the SKOS format does not properly represent the structure in the BS DD8723-5 draft standard, as "collections" in SKOS do not directly correspond to "arrays" in the BS model.

It may be of interest to SKOS people to know that we have continued to develop the UML model while working on the ISO version of the standard, ISO 25964; although based on the BS model this has some additional features and changes. I attach a copy of the model incorporating the latest thinking of the ISO working party, and it would be good if any SKOS development could use this rather than the BS draft. Some of the labels have been changed to make it easier to transform into OWL - thanks to Bernard Vatant for this.

The revised model contains a new element - the "ConceptGroup", which we have explained as follows:

"Many thesauri group concepts using a classification structure which exists in parallel to the hierarchies of thesaurus concepts based on BT/NT relationships. Groups created by the classification are often based on disciplines, subject areas or areas of business activity. They are sometimes called "subject categories", "themes", "domains", "groups" or "microthesauri". The model provides for all of these by providing the classes ConceptGroup and ConceptGroupLabel and the specific type may be indicated by the attribute conceptGroupType.

There is not, in general, a BT/NT relationship between a ConceptGroup and the concepts which it contains. Concepts may be gathered into ConceptGroups from many different facets or hierarchies of the thesaurus, and the notation used for the classification into groups may be quite distinct from any notation that may be used for the concepts themselves. Groups may have subgroups, being nested to any level. Each group should be given one verbal label per language."

[I'm sorry that due to irritating ISO restrictions I cannot make the full draft available at this stage, so I hope I can get away with the above quotation from the notes on the model. I'll explain any other points that are not clear, if asked.]

This provision for a loose grouping of concepts relevant to a subject area in fact seems closer to SKOS "collections" than the more strictly defined "arrays", which are groups of sibling concepts. We would really like SKOS to provide for this distinction.

"Concepts which should not be used for indexing" can be indicated by giving them an appropriate custom attribute or note, such as "Use a more specific concept if possible" (I prefer this less restrictive note, as there are cases where such a term can be useful, especially when searching for it and all its narrower terms).

Leonard Will

-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------

Attachments
- image/jpeg attachment: ISO_model_2008-11-18.jpg
Received on Thursday, 4 December 2008 15:53:46 UTC