[FT] Scoring expressions from Martin Probst on 2006-08-10 (public-qt-comments@w3.org from August 2006)

From: Martin Probst <martin@x-hive.com>
Date: Thu, 10 Aug 2006 11:10:28 +0200
To: public-qt-comments@w3.org
Message-Id: <93774B82-8DB6-43E1-9089-0BA789BA829A@x-hive.com>

Hi,

I'm new to the full text specification so this might be an old issue  
- I didn't find anything related in the archive though.

The full text specification extends the XQuery processing model to
allow for a second-order aspect of functions, and it appears to me that
score values are somewhat cheating their way around the normal flow of
XDM instances in XQuery using this mechanism. This seems a bit strange,
as it does not fit well with the XQuery spec. Also, there seem to be
some holes, e.g. what is the score here:
 > for $x score $score in //book[title ftcontains "hello"]/para[. ftcontains "world"] return $score
The score of the title, or the score of the para? I think this  
problem occurs because of the score values sneaking around normal  
XQuery evaluation order.

Now I wonder if this couldn't be greatly simplified by providing just
two full text keywords, e.g. "ftmatches" returning an xs:boolean and
"ftscore" returning an xs:double in [0, 1]. "ftmatches" could be used
for boolean conditions:
 > //book[. ftmatches "hello" && "world"]
And "ftscore" if the user needs more control over relevance:
 > for $b in //book
 > let $score := $b ftscore "hello" && "world"
 > where $score > 0.5
 > order by $score descending
 > return $b
The score at which something counts as a "match" could be an option, e.g.
 > declare option fts:match-score := 0.5;
Or completely arbitrary and application defined (as in the current  
spec, I think).
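
To make the intended semantics concrete, here is a minimal sketch in
Python rather than XQuery. Everything in it is an assumption made for
illustration: the toy per-term scoring, taking the minimum for "&&",
and the MATCH_SCORE constant standing in for the hypothetical
fts:match-score option. The only things taken from the proposal are
that "ftscore" yields a plain double in [0, 1] and that "ftmatches" is
a plain boolean derived from it.

```python
# Stand-in for the hypothetical "declare option fts:match-score := 0.5;"
MATCH_SCORE = 0.5


def term_score(text, term):
    """Toy per-term score in [0, 1): n / (n + 1) for n occurrences.

    Assumption: a real engine would use its own relevance model.
    """
    n = text.lower().split().count(term.lower())
    return n / (n + 1)


def ftscore(text, *terms):
    """Score for a conjunction of terms: the minimum per-term score."""
    return min((term_score(text, t) for t in terms), default=0.0)


def ftmatches(text, *terms, threshold=MATCH_SCORE):
    """Boolean match: the score reaches the configured threshold."""
    return ftscore(text, *terms) >= threshold


books = [
    "hello world",
    "hello hello world world",
    "hello hello hello world world world",
    "hello there",
]

# Mirrors the FLWOR query: keep scores above 0.5, order by score descending.
scored = [(ftscore(b, "hello", "world"), b) for b in books]
ranked = [b for s, b in sorted(scored, reverse=True) if s > 0.5]
```

The score is an ordinary value here: it can be bound to a variable,
compared, and sorted on without any special handling by the caller.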

As this only adds completely normal XQuery expressions returning XDM
instances, I think this would greatly simplify the processing model,
the use of the feature for users, and the implementation for
vendors (which is of course why I write this, I'm lazy :-)).

I can't quite come up with a limitation of this concept compared to the
one with the special score keywords, functions etc. Am I missing something?

Regards,
Martin

-- 
Martin Probst
X-Hive Corporation
martin@x-hive.com

Received on Thursday, 10 August 2006 09:12:19 UTC