- From: Leonard Will <L.Will@willpowerinfo.co.uk>
- Date: Fri, 14 Dec 2007 20:54:46 +0000
- To: "public-swd-wg@w3.org" <public-swd-wg@w3.org>, "public-esw-thes@w3.org" <public-esw-thes@w3.org>
On Fri, 14 Dec 2007 at 13:40:47, Ed Summers <ehs@pobox.com> wrote

>The filenames are reflective of the "seed" concept that I used to walk
>about a certain radius from. Well and it's Friday... I'd be interested
>to hear where you see the broader relationship not working.

Ed - I agree with Joe Tennis that you will have difficulties in avoiding inconsistencies if you take the BT/NT relationships in LCSH as equivalent to the way they are applied in thesauri developed in accordance with standards. (To be fair, I believe that the LCSH people are well aware of this, and are working gradually towards standard conformity, but they have such a load of existing material to deal with that it will take some time.)

You asked for any invalid relationships in your diagrams. I have not checked all the XML, but I take it that the arrows in the diagrams are intended to point from a narrower concept to a broader one. Many of the relationships do indeed appear to be valid, but a few invalid ones I spotted are:

non-alcoholic cocktails > cocktails > alcoholic beverages

malt > beer
  [is malt really a kind of beer? If this is a part-whole relationship, it would be acceptable only if malt occurred only as a part of beer, which is not the case.]

malt-extracts > beer
malt-extracts > malt > beer
  [as above, and the second of these is redundant and would not be accepted in a thesaurus if the first was present]

malt > brewing
  [these concepts are in different facets, "material" and "activities" and so cannot be hierarchically related]

wine and wine making
  [LCSH often combines terms from an activity facet with terms from materials or objects facets like this. They really need to be separated out in order to create valid thesaurus hierarchies.]

WebTV (trademark) > World Wide Web
  [you might get away with WebTV being _part_ of the WWW, and perhaps the qualifier "trademark" is just the legal "TM" symbol spelled out, but if the concept is the trademark, as it appears, it is not a true hierarchical relationship. Part/whole relationships are best kept to a few specific situations, spelt out in the standards.]

semantic web > semantic integration (computer systems)
  [mixed facets - a system and an activity or property]

semantic integration (computer systems) > integrated software
  [mixed facets - an activity or property and an intellectual product or document]

It is not possible to be definite about some of these; I have not checked all the scope notes, where they exist, but there do appear to be prima facie problems.

I suppose we are looking only at the relationships at the moment - many of the terms are not formatted as thesaurus standards would have them, but that is another issue, e.g.

cereals, prepared
  [thesaurus terms are not inverted like this]

motion pictures and rock music
  [a compound concept, presumably meaning "motion pictures in combination with rock music", not just the whole of both concepts viewed separately]

gays and rock music
  [another compound of two concepts]

piano - studies and exercises (rock)
  [a combination of three concepts: instrument, form and style]

I cannot comment on most of the popular music examples, as I would not recognise most of them if I heard them. I wonder if anyone can write clear scope notes to distinguish one from another!

Your examples do show the need for some grouping into arrays with node labels to show characteristics of division. These would organise sibling terms more clearly, e.g. (music by instrument), or (plant products by source plant).

There are also some

Thanks for the interesting Friday exercise!
Leonard

-- 
Willpower Information (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants
Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK.
Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------
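Two of the structural problems described in the message, a redundant broader-term link and a hierarchical link that crosses facets, lend themselves to a mechanical check. The following is a minimal Python sketch under stated assumptions: the links are the malt examples quoted in the message, while the facet assigned to each term, the function names, and the choice to flag the direct link (rather than the longer chain) as the redundant one are all illustrative assumptions, not taken from LCSH or from any thesaurus standard. It does not attempt the harder judgement of whether a link such as malt > beer is a genuine kind-of or part-whole relationship.

```python
from collections import defaultdict

# Broader-term (BT) links written as (narrower, broader),
# taken from the malt examples quoted in the message.
BT_LINKS = [
    ("malt-extracts", "beer"),
    ("malt-extracts", "malt"),
    ("malt", "beer"),
    ("malt", "brewing"),
]

# Hypothetical facet assignments, for illustration only.
# The message names "material" and "activities"; placing beer
# in the materials facet is an assumption of this sketch.
FACETS = {
    "malt-extracts": "materials",
    "malt": "materials",
    "beer": "materials",
    "brewing": "activities",
}


def build_graph(links):
    """Map each narrower term to the set of its direct broader terms."""
    graph = defaultdict(set)
    for narrower, broader in links:
        graph[narrower].add(broader)
    return graph


def ancestors(graph, term, skip_edge=None):
    """Return every term reachable from `term` by following BT links,
    optionally ignoring one direct (narrower, broader) link."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for broader in graph.get(current, ()):
            if (current, broader) == skip_edge:
                continue
            if broader not in seen:
                seen.add(broader)
                stack.append(broader)
    return seen


def redundant_links(links):
    """Direct links whose broader term is still reachable through
    some other chain once the direct link itself is ignored."""
    graph = build_graph(links)
    return sorted(
        (n, b) for n, b in links
        if b in ancestors(graph, n, skip_edge=(n, b))
    )


def cross_facet_links(links, facets):
    """Links whose two terms are assigned to different facets."""
    return sorted(
        (n, b) for n, b in links
        if n in facets and b in facets and facets[n] != facets[b]
    )


if __name__ == "__main__":
    print("redundant:", redundant_links(BT_LINKS))
    print("cross-facet:", cross_facet_links(BT_LINKS, FACETS))
```

With the sample data, the first check reports the direct malt-extracts > beer link, since beer is already reachable through malt, and the second reports malt > brewing, the one link whose terms sit in different assumed facets.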
Received on Friday, 14 December 2007 20:58:35 UTC