- From: Alistair Miles <a.j.miles@rl.ac.uk>
- Date: Thu, 08 Feb 2007 19:47:03 +0000
- To: Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk>
- CC: public-swd-wg@w3.org, public-esw-thes@w3.org
I've added a caveat and a link to Stella's entirely superior background information:

http://www.w3.org/2006/07/SWD/wiki/SkosDesign/ThesaurusPatterns?action=recall&rev=6

Thanks Stella :)

Alistair.

Stella Dextre Clarke wrote:
> Alistair,
> I've just taken a look at your statement of the issue of representing A
> USE X AND Y as well as A USE X OR Y. The second half of it looks a very
> fair description of the problem. But I must protest that the Background
> section does not well reflect historical reality. The problem comes from
> your assumptions about what the thesaurus was invented for, i.e.
> "paper-based card catalogues" and the indexes derived therefrom. It
> gives the impression of catalogues like library card catalogues, in
> which you have a card per title (or occasionally more than one), with
> lots of cataloguing data on the card.
>
> Around the time when IR thesauri were invented, hopes were pinned on
> "mechanisation" rather than "computerisation", and a lot of
> experimentation went on with various sorts of cards. It is true that
> some agencies did try to use thesauri with "item cards" a bit like
> catalogue cards (only the more sophisticated ones were IBM punched cards
> or edge-punched cards), but the more successful ones used "feature
> cards". The key difference between these approaches is whether you
> assign a card to the document (item) being indexed, or to the thesaurus
> term (feature) that is used for indexing the documents.
>
> Optical coincidence cards were probably the most satisfactory sort of
> card for use with the thesaurus. You had the thesaurus itself, truly
> paper-based, a book that was consulted by indexers and searchers alike.
> Then you had a bank of strong cards, sometimes almost 2x2 ft in size,
> with a grid printed on each. Each card had the preferred term inscribed
> on the top left corner, which was used to alphabetise the cards. The
> card could have up to 100 rows and 100 columns, which made it capable
> of indexing a collection of 10,000 documents. Each document had a number.
>
> On indexing document 1234 with terms "Cats" and "Fur" you would get out
> those cards and punch a hole in each of them, in grid position 1234.
> Much later, when someone came to search for items about cats' fur, they
> would pull out those two cards and hold them up to a light. The light
> would shine through the holes in all the punched positions, including
> 1234. And then you went to the collection and pulled out document 1234
> etc.
>
> The process I have just described is called postcoordinate retrieval and
> exactly the same principle is used for computer searches using Boolean
> AND. So the scenario for which thesauri were invented was not so very
> different from today's computer use. But because the thesaurus itself
> was paper-based, the issue of how to manage "A USE X AND Y" as well as
> "A USE X OR Y" did not really arise. These types of entry were simply
> typed on to a page (with a type-writer, if you were lucky) for use by
> humans. And even then, we frowned on "A USE X OR Y" as not very good
> practice.
>
> Despite the length of this message, I repeat that the main part of your
> statement is not affected by the Background. The problem with "A USE X
> AND Y" is not in the type of indexing system, but in the representation
> of the relationships within the thesaurus itself. And there I agree with
> you - the move from paper-based management to mechanised/computerised
> management does bring the problem to the fore. I'm just looking forward
> to the solutions you come up with!
>
> Cheers
> Stella
>
>
> *****************************************************
> Stella Dextre Clarke
> Information Consultant
> Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> SDClarke@LukeHouse.demon.co.uk
> *****************************************************
>
>
>
> -----Original Message-----
> From: public-esw-thes-request@w3.org
> [mailto:public-esw-thes-request@w3.org] On Behalf Of Miles, AJ
> (Alistair)
> Sent: 08 February 2007 17:21
> To: public-swd-wg@w3.org
> Cc: public-esw-thes@w3.org
> Subject: [SKOS] thesaurus USE patterns
>
>
>
> Hi all,
>
> Please see the following:
>
> http://www.w3.org/2006/07/SWD/wiki/SkosDesign/ThesaurusPatterns?action=recall&rev=4
>
> [DONE] ACTION: Alistair to raise a new issue about USE X + Y and USE X
> OR Y [recorded in
> http://www.w3.org/2007/01/23-swd-minutes.html#action07]
>
> Cheers,
>
> Alistair.
> --
> Alistair Miles
> Research Associate
> CCLRC - Rutherford Appleton Laboratory
> Building R1 Room 1.60
> Fermi Avenue
> Chilton
> Didcot
> Oxfordshire OX11 0QX
> United Kingdom
> Web: http://purl.org/net/aliman
> Email: a.j.miles@rl.ac.uk
> Tel: +44 (0)1235 445440
>
>
>
>

--
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Web: http://purl.org/net/aliman
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
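
The optical coincidence procedure in Stella's message (punch a hole per document on each term's card, then stack cards against a light) can be sketched roughly as below. This is only an illustration, not anything from the thread: the names `index_document` and `search` are made up, each card is modelled as an integer bitmask, and document numbers are assumed to run from 0 to 9999 to fit the 100 x 100 grid.

```python
GRID_ROWS = 100
GRID_COLS = 100
CAPACITY = GRID_ROWS * GRID_COLS  # 10,000 document positions per feature card


def index_document(cards, doc_number, terms):
    """Punch a hole at doc_number's grid position on each term's card.

    `cards` maps a preferred term to an int whose set bits are punched holes
    (a hypothetical representation chosen for this sketch).
    """
    if not 0 <= doc_number < CAPACITY:
        raise ValueError("document number does not fit on the card grid")
    for term in terms:
        cards[term] = cards.get(term, 0) | (1 << doc_number)


def search(cards, terms):
    """Stack the cards for `terms` and list positions where light passes.

    Light passes only where every stacked card is punched, which is the
    Boolean AND of the cards.
    """
    if not terms:
        return []
    light = (1 << CAPACITY) - 1  # no cards in the way: every position is lit
    for term in terms:
        light &= cards.get(term, 0)  # a term with no card blocks all light
    return [n for n in range(CAPACITY) if (light >> n) & 1]


cards = {}
index_document(cards, 1234, ["Cats", "Fur"])
index_document(cards, 42, ["Cats"])
index_document(cards, 77, ["Fur"])
print(search(cards, ["Cats", "Fur"]))  # [1234]
```

Holding two cards up to the light corresponds to the `&=` step; the sketch deliberately says nothing about how "A USE X AND Y" entries in the thesaurus itself should be represented, which is the open issue in the thread.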
Received on Thursday, 8 February 2007 19:47:26 UTC