RE: pre- and post- coordinate indexing

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Wed, 19 Oct 2005 12:19:33 +0100
Message-ID: <677CE4DD24B12C4B9FA138534E29FB1D0ACDFE@exchange11.fed.cclrc.ac.uk>
To: "Leonard Will" <L.Will@willpowerinfo.co.uk>, <public-esw-thes@w3.org>, "Stella Dextre Clarke (E-mail)" <SDClarke@lukehouse.demon.co.uk>, "Ron Davies (E-mail)" <ron@rondavies.be>

> Secondly, I'm *guessing* that under pre-coordinate indexing, 
> an indexer could make the following two types of indexing 
> assignment (inventing my own syntax):
> doc | subject
> ----------------------------------
> 1   | cut flowers, crop production
> 2   | cut flowers + crop production
> In the first assignment, the indexer wishes to state that the 
> subjects of document 1 are cut flowers, and crop production, 
> although not necessarily the production of cut flowers.  In 
> the second assignment, the indexer explicitly wishes to state 
> that the subject of document 2 is (cut flowers + crop 
> production) i.e. cut flower production.
> How does the searcher then distinguish between these two 
> statements?  I'm guessing that under traditional search 
> systems, a boolean search string such as 'cut flowers AND 
> crop production' will not be able to distinguish between the 
> two statements (because it's implemented via some sort of 
> sub-string comparison), and will return both documents, is 
> that correct?  Is this something like the problem of 'false 
> hits' that you mentioned previously Leonard?  If not, can you 
> describe the problem of 'false hits' that you mentioned?

Or is the problem of 'false hits' that if you have an indexing assignment e.g. ...

doc | subject
----------------------------------
3   | calcimycin + standards, aspirin + administration & dosage

... then a searcher querying for 'calcimycin AND administration & dosage', meaning to find documents about the administration and dosage of calcimycin, would erroneously receive document 3 in the result set?
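
To make the question concrete, here is a rough Python sketch of the two behaviours I have in mind. Everything in it is an assumption for illustration only: the INDEX structure, the function names flat_and_search and coordinated_search, and the use of exact term matching rather than sub-string comparison. It is not how any particular retrieval system works. Each document carries a list of strings, and each string is a tuple of the terms joined by '+' in the invented syntax above.

```python
# Hypothetical index: doc id -> list of pre-coordinated strings.
# A string with one term is a stand-alone assignment (the ',' case);
# a string with several terms is a '+' combination.
INDEX = {
    1: [("cut flowers",), ("crop production",)],
    2: [("cut flowers", "crop production")],
    3: [("calcimycin", "standards"), ("aspirin", "administration & dosage")],
}


def flat_and_search(index, *terms):
    """Boolean AND over every term on the document, ignoring coordination."""
    hits = []
    for doc, strings in index.items():
        # Flatten all strings into one bag of terms for the document.
        all_terms = {term for string in strings for term in string}
        if all(term in all_terms for term in terms):
            hits.append(doc)
    return hits


def coordinated_search(index, *terms):
    """Match only if all query terms sit inside the same '+' string."""
    hits = []
    for doc, strings in index.items():
        if any(all(term in string for term in terms) for string in strings):
            hits.append(doc)
    return hits


if __name__ == "__main__":
    print(flat_and_search(INDEX, "calcimycin", "administration & dosage"))
    print(coordinated_search(INDEX, "calcimycin", "administration & dosage"))
    print(flat_and_search(INDEX, "cut flowers", "crop production"))
    print(coordinated_search(INDEX, "cut flowers", "crop production"))
```

Under these assumptions the flat search returns document 3 for 'calcimycin AND administration & dosage' (the suspected false hit) and cannot tell documents 1 and 2 apart, whereas the coordinated search returns nothing for the calcimycin query and only document 2 for the cut flowers query.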


Received on Wednesday, 19 October 2005 11:19:40 UTC