- From: Mike Taylor <mike@indexdata.com>
- Date: Fri, 20 Feb 2004 12:35:45 GMT
- To: ajk@mds.rmit.edu.au
- Cc: www-zig@w3.org
> Date: Fri, 20 Feb 2004 10:24:44 +1100
> From: Alan Kent <ajk@mds.rmit.edu.au>
>
> On Thu, Feb 19, 2004 at 11:35:53AM +0000, Robert Sanderson wrote:
> > We return the stemmed term, eg 'happi' for happy, happily, happiness.

This seems terribly wrong to me.

> So returned 'term' values may be munged, but are used for searching.

I agree that it should be a non-negotiable property of Scan that what
comes back in the "term" can be re-submitted as a search and get
sensible results.

> This implies you have to guarantee any output of your stemmer can
> be fed back into the stemmer and have the same value output again.
> Otherwise the term from the scan could not be used for searching.
>
> In the case of soundex, this could be achieved by looking at the
> term and saying "ooh, that looks like the output of the soundex
> algorithm - I will just leave that alone".

Such ad-hoc hackery (ad-hackery?) surely can't be right.  I don't
think a scan interface has any business exposing dirty laundry such as
the stemmed term "happi" to the poor, innocent user.

> This is also consistent with what Ashley does - if it has spaces,
> munge it. If it does not have spaces, maybe its a scan term so don't
> do anything to it.

Ugh.

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  <mike@indexdata.com>  http://www.miketaylor.org.uk
)_v__/\  "Personally, I don't think its sexual dimorphism.  I'm all for
         it, but not in this case" - Tracy L. Ford.

--
Listen to my wife's new CD of kids' music, _Child's Play_, at
http://www.pipedreaming.org.uk/childsplay/
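
For concreteness, the property Alan describes (a stemmer whose output can be fed back in and come out unchanged) can be sketched in Python. Everything here is hypothetical and for illustration only: `toy_stem` is a made-up suffix stripper, not the stemmer Robert's server uses, and `naive_plural_stem` and `resubmittable` are invented names. It is a minimal sketch of the check, not a real implementation.

```python
SUFFIXES = ("ness", "ly")  # assumption: a tiny, made-up rule set


def toy_stem(word):
    """Hypothetical stemmer: strip one known suffix, then map final y to i."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Require a few characters to remain so short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[: -len(suffix)]
            break
    if word.endswith("y"):
        word = word[:-1] + "i"
    return word


def naive_plural_stem(word):
    """Hypothetical stemmer that drops one trailing 's' each time it runs."""
    return word[:-1] if word.endswith("s") else word


def resubmittable(stem, term):
    """True if the term a scan would return survives being stemmed again."""
    scanned = stem(term)
    return stem(scanned) == scanned


print([toy_stem(w) for w in ("happy", "happily", "happiness")])
print(resubmittable(toy_stem, "happiness"))          # True
print(resubmittable(naive_plural_stem, "princess"))  # False
```

With the second stemmer, "princess" scans as "princes", and searching for "princes" stems it again to "prince", so the scanned term no longer finds the entry it came from.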
Received on Friday, 20 February 2004 07:36:49 UTC