W3C home > Mailing lists > Public > public-qt-comments@w3.org > March 2003

Re: FTS comments

From: Jonathan Robie <jonathan.robie@datadirect-technologies.com>
Date: Mon, 24 Mar 2003 12:24:56 -0500
Message-Id: <5.2.0.9.0.20030324121707.0118fd70@ncmail.datadirect-technologies.com>
To: kai.grossjohann@uni-duisburg.de (Kai Gro▀johann ), public-qt-comments@w3.org

At 10:18 PM 3/22/2003 +0100, Kai Gro▀johann wrote:
>* Application of SCORE to non-text conditions.
>
>   I believe that vagueness and uncertainty, the central issues of
>   Information Retrieval, are vital features for systems even outside
>   the domain of full text.  Consider the infamous used-car database
>   example: say the user searches for a white Lincoln Continental from
>   2001 with a given mileage (is that the right word? number of miles
>   run by that car is what I mean) and price.  There are no full text
>   conditions in this example.  Yet, what happens if there are no cars
>   fulfilling the exact condition but only cars that are "close
>   matches"?  One approach would be to interpret the conditions
>   vaguely.  Another approach requires the user to specify another
>   query to find those "close matches".  However, the latter approach
>   requires the user to know which of the query conditions to relax to
>   find that close match, and thus requires knowledge of the contents
>   of the database.  Surely this is not desirable: if the user knew
>   what's in the database, why search?
>
>   Therefore, the "best match" approach is important also for non-text
>   conditions.

I'm currently not very active in full-text, but I wanted to offer 
my  thoughts on this.

This has come up a number of times. Here's my opinion: the semantics of 
"close match" for a structure can be highly application-dependent, and it's 
not clear to me whether there is an approach to approximate structural 
match that will work at all well across a variety of structures from 
different application areas.  For instance, in this case, is a car a close 
match if the price is too high, or is that an absolute criterion for the 
user? To get useful results, it may be the case that the user really does 
need to let us know, in some way, which criteria to relax. There may be 
literature or products that tell us that we actually know how to do this 
well in the general case - do you know of any that we might want to look at?

Jonathan 
Received on Monday, 24 March 2003 12:25:04 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 27 March 2012 18:14:24 GMT