- From: Jochen Doerre <DOERRE@de.ibm.com>
- Date: Mon, 2 May 2005 02:17:53 +0200
- To: andrew.cao@cisra.canon.com.au
- Cc: public-qt-comments@w3.org, member-query-fttf@w3.org
Andrew,

thanks again for pointing out this error in the semantics of the distance functions. Sorry for the late response.

Here is how the function fts:ApplyFTWordDistanceExactly should read. Please note the change in the return clause. As a result, your query[2] will evaluate to False, because SE-3 will not be eliminated.

    declare function fts:ApplyFTWordDistanceExactly(
        $matchOptions as element(matchOptions, fts:FTMatchOptions),
        $allMatches as element(allMatches, fts:AllMatches),
        $n as xs:integer
    ) as element(allMatches, fts:AllMatches)
    {
      <allMatches>
      {
        for $match in $allMatches/match
        let $sorted := for $si in $match/stringInclude
                       order by $si/tokenInfo/@pos ascending
                       return $si
        where every $idx in (1 to fn:count($sorted) - 1)
              satisfies fts:wordDistance(
                            $sorted[$idx]/tokenInfo,
                            $sorted[$idx+1]/tokenInfo,
                            $matchOptions) = $n
        return
          <match>
          {$match/stringInclude}
          {
            for $stringExcl in $match/stringExclude
            where some $stringIncl in $match/stringInclude
                  satisfies fts:wordDistance(
                                $stringIncl/tokenInfo,
                                $stringExcl/tokenInfo,
                                $matchOptions) = $n
            return $stringExcl
          }
          </match>
      }
      </allMatches>
    }

So, yes, as you pointed out, it is sufficient for a StringExclude to be at the required distance from one of the remaining StringIncludes in order to be kept. The same correction has to be applied to the other distance functions (replacing "where every $stringIncl" with "where some $stringIncl" in the return clause). The corrections will be included in the next Working Draft.

Here are some more examples showing how distance and negation are intended to interact.

    query[2] = . ftcontains ("word1" && "word2" && ! "word3") with distance exactly 0 words

The query matches, for example:

    <node> ... word0 word1 word2 word4 ... </node>

and also

    <node> ... word0 word2 word1 word4 ... </node>

provided none of the words shown is matched by "word3".
Loosely speaking, the query returns true for a node if it contains word1 and word2 adjacent to each other, in either order, and the pair is neither immediately preceded nor immediately followed by an occurrence of word3. Hence, the following do not match:

    <node> word1 word2 word3 </node>
    <node> word2 word1 word3 </node>
    <node> word3 word2 word1 </node>
    <node> word3 word1 word2 </node>
    <node> word1 word4 word2 </node> <!-- word1 and word2 need to be adjacent -->
    <node> word13 word2 </node> <!-- where word13 is matched by both word1 and word3 -->

Yours sincerely / Mit freundlichen Grüßen,
Jochen Dörre
__________________________________________
IBM Germany Böblingen Laboratory
DB2 Information Management Software
Phone: +49-7031-16-2992, Fax: -4891, Email: doerre@de.ibm.com

> Dear editors,
>
> When I have a node: <Node>word1 word2 word3</Node>
>
> I apply the query[1]:
> /Node ftcontains ("word1" && "word2" && "word3") with distance exactly 0
> words
> I will get the AllMatches[1] as:
> --- AllMatches
>     --- Match
>         --- StringInclude (pos = 1)
>         --- StringInclude (pos = 2)
>         --- StringInclude (pos = 3)
> The final result is True.
>
> I apply the query[2]:
> /Node ftcontains ("word1" && "word2" && ! "word3") with distance exactly
> 0 words
> I seem to get the AllMatches[2] as:
> --- AllMatches
>     --- Match
>         --- StringInclude (pos = 1)
>         --- StringInclude (pos = 2)
> The final result is also True.
>
> The reason for AllMatches[2] is that the StringExclude (pos = 3) which
> is generated by ! "word3" has been dropped, according to semantics of
> ApplyFTWordDistanceExactly, because SE-3 does not have a word distance 0
> with both SI-1 and SI-2.
>
> Are my two results correct? If they are correct, would this be
> inconsistent? Or what is the intuition when "word3" is a don't-care?
> Can I compare SE-3 to any one of SI-1 and SI-2, not to both of them?
>
> Thanks,
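
For readers who want to experiment with the behaviour discussed in this thread, the following is a minimal Python sketch of how the "every" and "some" variants of the StringExclude filter differ for query[2]. It is only a model, not the XQuery from the Working Draft. Its assumptions: tokens are whitespace-separated and matched by exact string equality, match options are ignored, the word distance is taken to be the number of words strictly between two positions, and a match counts as satisfied only if no StringExclude survives the filter. The names word_distance, apply_distance_exactly and query2 are hypothetical helpers introduced for illustration.

```python
from itertools import product


def word_distance(pos_a, pos_b):
    # Assumption: the distance is the number of words strictly between
    # the two positions, so adjacent words have distance 0.
    return abs(pos_a - pos_b) - 1


def apply_distance_exactly(all_matches, n, quantifier=any):
    """Filter (includes, excludes) position lists by 'distance exactly n'.

    quantifier=any models the corrected "where some $stringIncl";
    quantifier=all models the earlier "where every $stringIncl".
    """
    result = []
    for includes, excludes in all_matches:
        ordered = sorted(includes)
        # Keep the match only if all neighbouring includes are n words apart.
        if all(word_distance(a, b) == n for a, b in zip(ordered, ordered[1:])):
            kept = [e for e in excludes
                    if quantifier(word_distance(i, e) == n for i in includes)]
            result.append((includes, kept))
    return result


def query2(text, n=0, quantifier=any):
    """Model of: ("word1" && "word2" && ! "word3") with distance exactly n words."""
    tokens = text.split()

    def positions(word):
        return [i for i, tok in enumerate(tokens, start=1) if tok == word]

    excludes = positions("word3")
    all_matches = [([p1, p2], excludes)
                   for p1, p2 in product(positions("word1"), positions("word2"))]
    filtered = apply_distance_exactly(all_matches, n, quantifier)
    # Simplification: a match is satisfied only if no exclude survived.
    return any(not kept for _, kept in filtered)


print(query2("word1 word2 word3"))                  # False (corrected, "some")
print(query2("word1 word2 word3", quantifier=all))  # True  (earlier, "every")
print(query2("word0 word2 word1 word4"))            # True
```

Passing the quantifier as a parameter keeps the two variants side by side, so the only difference between the earlier and the corrected behaviour is the single any/all choice in the exclude filter.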
Received on Monday, 2 May 2005 00:18:01 UTC