- From: <bugzilla@wiggum.w3.org>
- Date: Sun, 13 Aug 2006 08:37:32 +0000
- To: public-qt-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=3596
Summary: second order aspect of scoring expressions
Product: XPath / XQuery / XSLT
Version: Working drafts
Platform: Macintosh
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Full Text
AssignedTo: sihem@research.att.com
ReportedBy: martin@x-hive.com
QAContact: public-qt-comments@w3.org
I've posted this to the list but found out later that it might be better to
submit a bug report.
The Full Text specification extends the XQuery processing model to allow for a
second-order aspect of functions, and it appears to me that score values
bypass the normal flow of XDM instances in XQuery through this mechanism. This
seems strange, as it does not fit well with the XQuery spec. There also seem
to be some holes, e.g. what is the score here:
> for $x score $score in //book[title ftcontains "hello"]/para[. ftcontains "world"] return $score
The score of the title, or the score of the para? I think this problem occurs
because the score values sneak around the normal XQuery evaluation order.
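To make the ambiguity concrete, here is a small Python model (not XQuery, and
not the spec's scoring algorithm). The toy_score function, the sample book and
the idea of multiplying the two scores are all assumptions made up for
illustration. It only shows that each selected para has two plausible score
values, so a single $score binding is underdetermined:

```python
def toy_score(text: str, word: str) -> float:
    """Hypothetical score: share of tokens in text equal to word (0.0 to 1.0)."""
    tokens = text.lower().split()
    return tokens.count(word.lower()) / len(tokens) if tokens else 0.0


# Made-up stand-in for one <book> with a <title> and two <para> children.
book = {"title": "hello", "paras": ["hello world again", "the world"]}

# Score of the first predicate: title ftcontains "hello"
title_score = toy_score(book["title"], "hello")

candidates = []
for para in book["paras"]:
    # Score of the second predicate: . ftcontains "world"
    para_score = toy_score(para, "world")
    if title_score > 0 and para_score > 0:
        candidates.append({
            "para": para,
            "title_score": title_score,
            "para_score": para_score,
            # One of several conceivable combinations, chosen arbitrarily here.
            "product": title_score * para_score,
        })

for c in candidates:
    print(c)
```

Every selected para carries a title score, a para score and any number of
possible combinations of the two; the query as written does not say which of
these ends up in $score.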
Now I wonder if this couldn't be greatly simplified by providing just two full
text keywords, e.g. "ftmatches" returning an xs:boolean and "ftscore" returning
an xs:double in [0, 1]. "ftmatches" could be used for boolean conditions:
> //book[. ftmatches "hello" && "world"]
And "ftscore" if the user needs more control over relevance:
> for $b in //book
> let $score := $b ftscore "hello" && "world"
> where $score > 0.5
> order by $score descending
> return $b
What score counts as a "match" could be set by an option, e.g.
> declare option fts:match-score "0.5";
Or it could be completely arbitrary and application-defined (as in the current
spec, I think).
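As a rough model of the proposal (in Python, since no XQuery processor knows
these keywords), the sketch below treats "ftscore" as an ordinary function
returning a number in [0, 1] and "ftmatches" as a threshold test on that
number. The scoring rule (fraction of query words present), the MATCH_SCORE
constant standing in for the match-score option, and the sample texts are all
assumptions for illustration only:

```python
import re

MATCH_SCORE = 0.5  # hypothetical counterpart of the match-score option


def ftscore(text: str, words: list[str]) -> float:
    """Toy relevance score in [0, 1]: fraction of the query words found in text."""
    if not words:
        return 0.0
    tokens = set(re.findall(r"\w+", text.lower()))
    hits = sum(1 for w in words if w.lower() in tokens)
    return hits / len(words)


def ftmatches(text: str, words: list[str], threshold: float = MATCH_SCORE) -> bool:
    """Boolean view of the same value: a match is a score above the threshold."""
    return ftscore(text, words) > threshold


# Made-up stand-ins for the string values of //book
books = ["hello world, hello again", "hello there", "nothing relevant"]
query = ["hello", "world"]

# for $b in //book  let $score := $b ftscore "hello" && "world"
scored = [(b, ftscore(b, query)) for b in books]

# where $score > 0.5  order by $score descending  return $b
result = [b for b, s in sorted(scored, key=lambda p: p[1], reverse=True) if s > 0.5]

print(result)
```

The point of the model is that the score is just a value: it can be bound,
compared, sorted on and returned like any other number, with no side channel
next to the normal evaluation.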
As this only adds completely normal XQuery expressions returning XDM instances,
I think this would greatly simplify the processing model, the usage for the
user, and the implementation for vendors (which is of course why I write
this, I'm lazy :-)).
I can't quite come up with a limitation of this concept compared with the one
using the special score keywords, functions etc. Am I missing something?
Received on Sunday, 13 August 2006 08:37:39 UTC