- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 11 Dec 2008 21:07:50 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6303

Petr Pleshachkov <peter.pleshachkov@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter.pleshachkov@gmail.com

--- Comment #3 from Petr Pleshachkov <peter.pleshachkov@gmail.com>  2008-12-11 21:07:49 ---
(In reply to comment #2)

But according to the spec: "the distance between the two is M2's starting position minus M1's ending position, minus 1." So, for the first match, we should get a distance of 5 - 2 - 1 = 2. Is that right?

By the way, section 3.6.3 contains this example:

    /books/book ftcontains "web" ftand "site" ftand "usability" distance at most 2 words

with the following explanation:

    "The following expression returns false: The search context does contain the phrase "The usability of a Web site", in which the tokens "usability" and "Web" have a distance of 2 words, and the tokens "Web" and "site" have a distance of 0 words, both of which satisfy the constraint distance at most 2 words. However, the problem is that "usability" and "site" have a distance of 3 words, which does not satisfy the constraint, and so the distance selection yields no matches, and the expression as a whole yields false.
    (The phrase "Improving Web Site Usability" would satisfy the given full-text selection, but it occurs in an attribute value, and so is not subject to tokenization.)"

But the spec says that we have to check the distance between each "successive pair of matches". So we have to check the distance constraint for the pairs ("usability", "Web") and ("Web", "site"), but not for the pair ("usability", "site"). This follows from the formal function as well:

    declare function fts:ApplyFTWordDistanceAtMost (
          $allMatches as element(fts:allMatches),
          $n as xs:integer )
       as element(fts:allMatches)
    {
       <fts:allMatches stokenNum="{$allMatches/@stokenNum}">
       {
          for $match in $allMatches/fts:match
          let $sorted := for $si in $match/fts:stringInclude
                         order by $si/fts:tokenInfo/@startPos ascending,
                                  $si/fts:tokenInfo/@endPos ascending
                         return $si
          where
             if (fn:count($sorted) le 1) then fn:true() else
                every $index in (1 to fn:count($sorted) - 1)
                satisfies fts:wordDistance(
                             $sorted[$index]/fts:tokenInfo,
                             $sorted[$index+1]/fts:tokenInfo
                          ) <= $n
          return
             <fts:match>
             {
                fts:joinIncludes($match/fts:stringInclude),
                for $stringExcl in $match/fts:stringExclude
                where some $stringIncl in $match/fts:stringInclude
                      satisfies fts:wordDistance(
                                   $stringIncl/fts:tokenInfo,
                                   $stringExcl/fts:tokenInfo
                                ) <= $n
                return $stringExcl
             }
             </fts:match>
       }
       </fts:allMatches>
    };

So, is the example correct?

> [personal response:]
>
> Re your point #2: Yes, I think that's a mistake in the specification.
> Where we say:
>     It is 1 for the first pair and 3 for the second in the first case,
>     and 2 and 1 in the second.
> We should instead say something like:
>     For the first Match, the word distance between
>     the two TokenInfos is 3 (startPos 5 - endPos 2),
>     and for the fifth Match, it's 2 (startPos 27 - endPos 25).
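To make the two readings of the constraint concrete, here is a rough Python sketch (not the spec's XQuery). It assumes the tokens of "The usability of a Web site" are numbered from 1 within the phrase, so "usability" is at position 2, "Web" at 5 and "site" at 6; those positions, and the names TokenInfo, word_distance, successive_pairs_within and all_pairs_within, are illustrative assumptions, not taken from the specification. The distance formula is the one quoted above: M2's starting position minus M1's ending position, minus 1.

```python
from itertools import combinations
from typing import NamedTuple


class TokenInfo(NamedTuple):
    """Hypothetical stand-in for a token match: the word and its positions."""
    word: str
    start_pos: int
    end_pos: int


def word_distance(m1, m2):
    # Quoted rule: M2's starting position minus M1's ending position, minus 1.
    return m2.start_pos - m1.end_pos - 1


def _ordered(tokens):
    # Sort by start position, then end position, as the formal function does.
    return sorted(tokens, key=lambda t: (t.start_pos, t.end_pos))


def successive_pairs_within(tokens, n):
    # Reading 1: only neighbouring matches in sorted order are compared.
    ordered = _ordered(tokens)
    return all(word_distance(a, b) <= n for a, b in zip(ordered, ordered[1:]))


def all_pairs_within(tokens, n):
    # Reading 2: every pair of matches is compared (what the example's
    # explanation appears to assume).
    return all(word_distance(a, b) <= n for a, b in combinations(_ordered(tokens), 2))


# Assumed positions, counting words in "The usability of a Web site" from 1.
usability = TokenInfo("usability", 2, 2)
web = TokenInfo("Web", 5, 5)
site = TokenInfo("site", 6, 6)
tokens = [web, site, usability]

print(word_distance(usability, web))       # usability -> Web
print(word_distance(web, site))            # Web -> site
print(word_distance(usability, site))      # usability -> site
print(successive_pairs_within(tokens, 2))  # successive-pair reading
print(all_pairs_within(tokens, 2))         # every-pair reading
```

Under these assumed positions the successive-pair reading accepts the match for "distance at most 2 words", while the every-pair reading rejects it, which is exactly the discrepancy between the formal function and the example's explanation.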
Received on Thursday, 11 December 2008 21:07:58 UTC