- From: Houghton,Andrew <houghtoa@oclc.org>
- Date: Tue, 11 May 2004 13:05:08 -0400
- To: public-esw-thes@w3.org
> From: Leonard Will [mailto:L.Will@willpowerinfo.co.uk]
> Sent: Tuesday, May 11, 2004 10:43 AM
> Subject: Re: Supporting arrays of concepts
>
> 2. As regards node labels, I have tried to show that we need
> to distinguish between
>
> (a) real "node labels", which specify a characteristic of
> division in the form <xxx by yyy> and
>
> (b) broader concepts which act as parent terms to the
> terms in a following array.
>
> DDC centred headings and some of the AAT guide terms fall
> under (b), and should not be called node labels. Structurally
> these are just terms representing concepts which the
> thesaurus editor has decided are unsuitable for use in
> indexing (and may have to be labelled in some way to indicate this).

I think I agree with your idea of separating the two. Maybe what is needed is another element at the same level as skos:Concept, perhaps skos:Summary, that handles (b), while the current proposal handles (a). That said, the current proposal seems odd to me: you might want additional metadata associated with a node label array beyond the list of concepts it contains, for example scope notes or other types of notes.

> 3. The more complex issue that I thought would broaden the
> scope of the project is handling pre-coordinated indexing
> strings. The problem is described in the following extract
> from "FAST : development of simplified headings for metadata
> / by Rebecca J. Dean"
>
> [quote deleted]
>
> DDC and MeSH, similarly, have many provisions for
> synthesising concepts to express compound concepts that may
> or may not be enumerated in the schedules. A classification
> schedule may show these compound concepts in a hierarchical
> display, as I illustrated in the second example in my message
> of 6th May, but the hierarchy is not built on the same BT/NT
> relationships as in a thesaurus.

I guess I didn't quite follow that.
Yes, LCC, DDC, LCSH, MeSH and others allow synthesising concepts, but those concepts are still valid from the vocabulary's perspective; it's just that they have not been *enumerated*, as they would be in a standard thesaurus. So let's say I convert DDC into SKOS. What you would get is all the predefined concepts defined by the Dewey editors. If someone builds a class number, based upon the instructions in the classification, then they can merely create an skos:Concept element and within that element use skos:inScheme to point to the "official" base scheme which defines the list of predefined concepts. Any built concept in LCC, DDC, LCSH, or MeSH participates in the same BT/NT relationships established for the vocabulary. So I seem to be missing something with your analogy.

> It seems to me to be a much more complex job for SKOS to try
> to create a system that would incorporate rules for creating
> these compound strings.

You don't need to incorporate the rules for creating the compound strings. The "whole" compound string *is* the concept, and there isn't necessarily a BT/NT relationship between the predefined part and what was composed. The whole term should be taken as the concept, and its BT/NT relationships are to be taken in the context of all the other predefined or compound strings in the vocabulary.

> The FAST project
> <http://www.oclc.org/research/projects/fast/> from which I
> quoted above recognises this problem by treating each of the
> elements of an LCSH heading separately and grouping them into
> subject, time, place, form, people and organisations facets.
> This makes it much more amenable to storing in a structure
> like SKOS, and seems the best initial approach.

I disagree with this statement on several fronts. FAST is no more amenable to SKOS than any of the other vocabularies I mentioned. As a matter of fact, I forgot to include FAST in the list of vocabularies that we will probably put into SKOS.
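The built-class-number case described above (create an skos:Concept and point skos:inScheme at the base scheme) could be sketched roughly as follows. This is only an illustration in Python: the function name `built_concept`, the example.org URIs, the label, and the choice of RDF/XML output are all assumptions made for the sketch, not anything defined by DDC or by the SKOS proposal under discussion. The broader link is optional, since the composed concept need not have a BT/NT relationship to the predefined part it was built from.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def built_concept(concept_uri, label, scheme_uri, broader_uri=None):
    """Return an rdf:RDF element describing one built (synthesised) concept.

    Hypothetical helper: the whole built string is treated as a single
    concept that belongs to the base scheme via skos:inScheme.
    """
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    concept = ET.SubElement(
        root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": concept_uri}
    )
    ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = label
    # Point at the "official" scheme that defines the predefined concepts.
    ET.SubElement(
        concept, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": scheme_uri}
    )
    # A broader (BT) link is added only if the vocabulary establishes one.
    if broader_uri is not None:
        ET.SubElement(
            concept, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": broader_uri}
        )
    return root


if __name__ == "__main__":
    # All URIs and the label below are placeholders, not real identifiers.
    example = built_concept(
        "http://example.org/scheme/built/1",
        "Hypothetical built class",
        "http://example.org/scheme",
        broader_uri="http://example.org/scheme/predefined/1",
    )
    print(ET.tostring(example, encoding="unicode"))
```

Nothing in the sketch encodes the number-building rules themselves; it only records the result as a concept in the scheme, which is the point being made.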
FAST actually doesn't treat LCSH headings much differently than LCSH already does. LCSH is already faceted! LC may disagree with my statement, but the simple fact, or facet, of the matter is that when we look at LCSH, which is defined by MARC21, the preferred term of the vocabulary is specified as the 1XX field. That XX should be a clue. Looking at the MARC21-A authorities format, you can see the following 1XX definitions:

* 100 - HEADING--PERSONAL NAME (NR)
* 110 - HEADING--CORPORATE NAME (NR)
* 111 - HEADING--MEETING NAME (NR)
* 130 - HEADING--UNIFORM TITLE (NR)
* 148 - HEADING--CHRONOLOGICAL TERM (NR)
* 150 - HEADING--TOPICAL TERM (NR)
* 151 - HEADING--GEOGRAPHIC NAME (NR)
* 155 - HEADING--GENRE/FORM TERM (NR)
* 180 - HEADING--GENERAL SUBDIVISION (NR)
* 181 - HEADING--GEOGRAPHIC SUBDIVISION (NR)
* 182 - HEADING--CHRONOLOGICAL SUBDIVISION (NR)
* 185 - HEADING--FORM SUBDIVISION (NR)

To the naive, those pretty much look like facets. The 18X fields are used in building composed subject headings, so the "real" facets are everything else. FAST doesn't do anything radically different. It basically uses "facets" similar to those LCSH has, but pulls out some sub-facets under the existing LCSH pseudo-facets.

Andy.

Andrew Houghton, OCLC Online Computer Library Center, Inc.
http://www.oclc.org/about/
http://www.oclc.org/research/staff/houghton.htm
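As a rough illustration of reading the 1XX tags listed above as facets, a lookup along these lines would do. The tag numbers and heading names come from the list in the message; the function name `facet_for_tag`, the dictionary names, and the "facet" / "subdivision" labels are assumptions made for this sketch, not part of MARC21 or FAST.

```python
# 1XX heading tags read as "real" facets (names taken from the list above).
HEADING_FACETS = {
    "100": "personal name",
    "110": "corporate name",
    "111": "meeting name",
    "130": "uniform title",
    "148": "chronological term",
    "150": "topical term",
    "151": "geographic name",
    "155": "genre/form term",
}

# 18X tags are used in building composed subject headings.
SUBDIVISION_TAGS = {
    "180": "general subdivision",
    "181": "geographic subdivision",
    "182": "chronological subdivision",
    "185": "form subdivision",
}


def facet_for_tag(tag):
    """Classify a 1XX tag as ("facet", name) or ("subdivision", name).

    Returns None for tags that are not in either table.
    """
    if tag in HEADING_FACETS:
        return ("facet", HEADING_FACETS[tag])
    if tag in SUBDIVISION_TAGS:
        return ("subdivision", SUBDIVISION_TAGS[tag])
    return None


if __name__ == "__main__":
    for tag in ("150", "151", "181"):
        print(tag, facet_for_tag(tag))
```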
Received on Tuesday, 11 May 2004 13:05:45 UTC