- From: Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk>
- Date: Thu, 8 Feb 2007 19:27:36 -0000
- To: "'Miles, AJ \(Alistair\)'" <A.J.Miles@rl.ac.uk>, <public-swd-wg@w3.org>
- Cc: <public-esw-thes@w3.org>
Alistair,

I've just taken a look at your statement of the issue of representing A USE X AND Y as well as A USE X OR Y. The second half of it looks a very fair description of the problem. But I must protest that the Background section does not reflect historical reality well.

The problem comes from your assumptions about what the thesaurus was invented for, i.e. "paper-based card catalogues" and the indexes derived therefrom. It gives the impression of catalogues like library card catalogues, in which you have a card per title (or occasionally more than one), with lots of cataloguing data on the card.

Around the time when IR thesauri were invented, hopes were pinned on "mechanisation" rather than "computerisation", and a lot of experimentation went on with various sorts of cards. It is true that some agencies did try to use thesauri with "item cards" a bit like catalogue cards (only the more sophisticated ones were IBM punched cards or edge-punched cards), but the more successful ones used "feature cards". The key difference between these approaches is whether you assign a card to the document (item) being indexed, or to the thesaurus term (feature) that is used for indexing the documents.

Optical coincidence cards were probably the most satisfactory sort of card for use with the thesaurus. You had the thesaurus itself, truly paper-based, a book that was consulted by indexers and searchers alike. Then you had a bank of strong cards, sometimes almost 2x2 ft in size, with a grid printed on each. Each card had its preferred term inscribed in the top left corner, and the cards were filed alphabetically by that term. The card could have up to 100 rows and 100 columns, which made it capable of indexing a collection of 10,000 documents. Each document had a number. On indexing document 1234 with terms "Cats" and "Fur" you would get out those cards and punch a hole in each of them, in grid position 1234.
Much later, when someone came to search for items about cats' fur, they would pull out those two cards and hold them up to a light. The light would shine through the holes in all the punched positions, including 1234. And then you went to the collection and pulled out document 1234 etc.

The process I have just described is called postcoordinate retrieval and exactly the same principle is used for computer searches using Boolean AND. So the scenario for which thesauri were invented was not so very different from today's computer use. But because the thesaurus itself was paper-based, the issue of how to manage "A USE X AND Y" as well as "A USE X OR Y" did not really arise. These types of entry were simply typed on to a page (with a type-writer, if you were lucky) for use by humans. And even then, we frowned on "A USE X OR Y" as not very good practice.

Despite the length of this message, I repeat that the main part of your statement is not affected by the Background. The problem with "A USE X AND Y" is not in the type of indexing system, but in the representation of the relationships within the thesaurus itself. And there I agree with you - the move from paper-based management to mechanised/computerised management does bring the problem to the fore. I'm just looking forward to the solutions you come up with!
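
For readers who prefer code, the optical coincidence scheme described above can be sketched roughly as follows. This is only an illustration, not a record of any historical system: the names FeatureCard, index_document and hold_up_to_light are invented here, and numbering the documents 0 to 9999 is an assumption made to fit the 100 x 100 grid. Holding the cards up to the light is modelled as a set intersection, i.e. Boolean AND.

```python
ROWS, COLS = 100, 100  # grid from the message: 100 x 100 = 10,000 positions


class FeatureCard:
    """One card per thesaurus term; each punched hole marks a document."""

    def __init__(self, term):
        self.term = term
        self.holes = set()  # punched grid positions, kept as document numbers

    def punch(self, doc_number):
        # Assumption: documents are numbered 0..9999, one grid position each.
        if not 0 <= doc_number < ROWS * COLS:
            raise ValueError("document number is off the grid")
        self.holes.add(doc_number)


def index_document(cards, doc_number, terms):
    """Punch the document's grid position on the card of every indexing term."""
    for term in terms:
        if term not in cards:
            cards[term] = FeatureCard(term)
        cards[term].punch(doc_number)


def hold_up_to_light(cards, terms):
    """Return the document numbers where light passes through every card."""
    if not terms or any(term not in cards for term in terms):
        return []
    stacks = [cards[term].holes for term in terms]
    return sorted(set.intersection(*stacks))


cards = {}
index_document(cards, 1234, ["Cats", "Fur"])
index_document(cards, 2001, ["Cats"])
index_document(cards, 3050, ["Fur", "Dogs"])
print(hold_up_to_light(cards, ["Cats", "Fur"]))  # [1234]
```

Only position 1234 is punched on both the "Cats" and the "Fur" card, so it is the only document retrieved.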
Cheers
Stella

*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
SDClarke@LukeHouse.demon.co.uk
*****************************************************

-----Original Message-----
From: public-esw-thes-request@w3.org [mailto:public-esw-thes-request@w3.org] On Behalf Of Miles, AJ (Alistair)
Sent: 08 February 2007 17:21
To: public-swd-wg@w3.org
Cc: public-esw-thes@w3.org
Subject: [SKOS] thesaurus USE patterns

Hi all,

Please see the following:

http://www.w3.org/2006/07/SWD/wiki/SkosDesign/ThesaurusPatterns?action=recall&rev=4

[DONE] ACTION: Alistair to raise a new issue about USE X + Y and USE X OR Y [recorded in http://www.w3.org/2007/01/23-swd-minutes.html#action07]

Cheers,

Alistair.

--
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Web: http://purl.org/net/aliman
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
Received on Thursday, 8 February 2007 19:27:58 UTC