[Bug 6469] [FT] TestSuite issues

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6469


Jim Melton <jim.melton@acm.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED




--- Comment #3 from Jim Melton <jim.melton@acm.org>  2009-02-06 18:46:15 ---
Christian, I think I disagree with your argument in comment 2
(http://www.w3.org/Bugs/Public/show_bug.cgi?id=6469#c2).  There, you said: 

>This is an interesting point for discussion. I had another look into the XQFT
>Tokenization section (4.1). If I get it right, the tokenizer won't care about
>characters which are not part of tokens; so I would expect the two following
>queries to return true:

>  'a b' ftcontains 'a.b' 
>  'a.b' ftcontains 'a b' 

I have just finished re-reading all of section 4.1 very carefully, and I found
nothing to suggest that punctuation can never be (part of) a token.  In the
first of your two examples (quoted above), the search context 'a b' has, I
believe, two tokens, 'a' and 'b'.  There, I think we are in agreement.  However,
the search pattern 'a.b' has, I think, three tokens: 'a', '.', and 'b'.
(Caution: tokenization is implementation-defined, and I'm presuming a tokenizer
with certain behaviors that I believe are common in products for Western
languages.)  Because there is no token corresponding to '.' in the search
context, I believe that query must return false.

Similarly, in the second example, the search context has three tokens and the
search pattern has two.  Because the two tokens 'a' and 'b' are not adjacent in
the search context (the '.' token sits between them), I believe that this query
must also return false.  However, if the query had been written:
   'a.b' ftcontains 'a' ftand 'b'
then it would of course return true, because each operand of ftand is matched
on its own and no adjacency is required.
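
To make the token counting above concrete, here is a minimal sketch in Python
of the kind of tokenizer being presumed.  It is an illustration only, not the
XQFT semantics: tokenization is implementation-defined, and the names tokenize,
contains_phrase, and contains_all are hypothetical helpers invented for this
example.  The sketch assumes that each run of word characters is one token,
that each punctuation character is a token of its own, and that a phrase
matches only when its tokens appear adjacent and in order.

```python
import re


def tokenize(text):
    # Assumed tokenizer: a run of word characters is one token, every
    # punctuation character is a separate token, whitespace is dropped.
    return re.findall(r"\w+|[^\w\s]", text)


def contains_phrase(context, pattern):
    # Rough model of "context ftcontains pattern": the pattern's tokens
    # must occur as adjacent tokens, in order, within the context.
    ctx, pat = tokenize(context), tokenize(pattern)
    n = len(pat)
    return any(ctx[i:i + n] == pat for i in range(len(ctx) - n + 1))


def contains_all(context, *patterns):
    # Rough model of ftand: every pattern must match somewhere in the
    # context, independently of the others.
    return all(contains_phrase(context, p) for p in patterns)


print(tokenize("a b"), tokenize("a.b"))
print(contains_phrase("a b", "a.b"))
print(contains_phrase("a.b", "a b"))
print(contains_all("a.b", "a", "b"))
```

Under these assumptions the two phrase queries come out false and the ftand
query comes out true.  A tokenizer that discards punctuation instead would
make both phrase queries true, which is exactly the point in dispute.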

If I'm wrong, please correct my analysis!

P.S. I'm working today on the remaining issues in your bug report, to resolve
the ones I can and assign the others.



Received on Friday, 6 February 2009 18:46:25 UTC