- From: Mike Taylor <mike@indexdata.com>
- Date: Wed, 25 Feb 2004 13:59:24 GMT
- To: azaroth@liverpool.ac.uk
- Cc: ajk@mds.rmit.edu.au, www-zig@w3.org
> Date: Fri, 20 Feb 2004 12:51:56 +0000 (GMT)
> From: Robert Sanderson <azaroth@liverpool.ac.uk>
>
> > I don't think a scan interface has any business exposing dirty
> > laundry such as the stemmed term "happi" to the poor, innocent
> > user.
>
> There's very little choice, as one stem might be made up of several
> different words (unhappiness, happy, happily)

Where's the problem?  Pick one, and use that: a stemmed search will
find them all anyway, so any one representative of the equivalence
class is as good as any other.  If you like, you can provide:

        term = "happy" (arbitrarily chosen)
        displayTerm = "unhappiness, happy, happily"

> One of these could be selected at random for term, but then the
> termlist might not be sorted. (eg if the stem 'happi' came from
> 'unhappiness')

So sort it!

> If the user didn't want to scan using a stemming algorithm, then
> they shouldn't have asked for it! :)

Ah ... The well-known and much admired Don't Do That Then defence.
Well, yes; but still, we should respect what the elements are
fundamentally _for_.

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  <mike@indexdata.com>  http://www.miketaylor.org.uk
)_v__/\  "In art criticism and literary criticism, it is normal to come
         across long passages which are almost completely lacking in
         meaning" -- George Orwell.

--
Listen to my wife's new CD of kids' music, _Child's Play_, at
        http://www.pipedreaming.org.uk/childsplay/
Received on Wednesday, 25 February 2004 09:00:31 UTC
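
The "pick one, and sort it" suggestion in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `STEMS` table is a hypothetical stand-in for a real stemmer (only the three "happi" words come from the thread), `scan_entries` is a made-up helper name, and the representative is chosen as the alphabetically first member of each class simply to make the arbitrary choice repeatable.

```python
from collections import defaultdict

# Hypothetical stem table standing in for a real stemmer. Only the three
# "happi" words come from the thread; the other entries are invented.
STEMS = {
    "unhappiness": "happi",
    "happy": "happi",
    "happily": "happi",
    "apple": "appl",
    "apples": "appl",
    "zebra": "zebra",
}


def scan_entries(words):
    """Build one scan entry per stem class, sorted by the visible term."""
    classes = defaultdict(list)
    for word in words:
        classes[STEMS[word]].append(word)

    entries = []
    for members in classes.values():
        members = sorted(members)
        entries.append({
            # Any member would do; taking the first keeps the choice stable.
            "term": members[0],
            "displayTerm": ", ".join(members),
        })
    # Sort on the term that is shown, never on the hidden stem.
    return sorted(entries, key=lambda entry: entry["term"])


entries = scan_entries(
    ["unhappiness", "zebra", "happy", "apples", "happily", "apple"]
)
for entry in entries:
    print(entry["term"], "|", entry["displayTerm"])
```

Because the list is sorted on the representative term rather than on the stem, the ordering the user sees stays consistent whichever member of the class is picked, and the stem itself is never exposed.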