[Bug 6469] [FT] TestSuite issues

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6469

--- Comment #4 from Christian Gruen <christian.gruen@gmail.com>  2009-02-06 19:07:23 ---
Hi Jim,

thank you for commenting on the tokenization issue. If I understand correctly,
tokenization of punctuation is ultimately implementation-dependent as well, so
- as long as it is not specified whether punctuation is treated as a token of
its own - the two queries under discussion..

  'a b' ftcontains 'a.b' 
  'a.b' ftcontains 'a b'

..can return either true or false. Concerning the examples in 4.1.1,
punctuation is ignored in the tokenization process - or, to put it differently,
spaces and punctuation are treated the same way here:

  "Ford Mustang 2000, 65K, excellent [...]"
  -> Ford(1) Mustang(2) 2000(3), 65K(4), excellent(5)

So the following queries..

  "Ford Mustang 2000, 65K" ftcontains "2000 65K"
  "Ford Mustang 2000 65K" ftcontains "2000, 65K"

..should return true for these examples. What do you think?
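
To make the two readings concrete, here is a minimal Python sketch. It is not
taken from the specification or from any implementation: the names `tokenize`
and `ftcontains` and the flag `punct_as_token` are made up for illustration.
It assumes that tokens are runs of letters and digits, and that a phrase
matches when its tokens occur as a contiguous run in the source tokens.

```python
import re

def tokenize(text, punct_as_token=False):
    """Split text into tokens (illustrative only).

    punct_as_token=False: punctuation is skipped like whitespace.
    punct_as_token=True:  each punctuation character is a token of its own.
    """
    pattern = r"\w+|[^\w\s]" if punct_as_token else r"\w+"
    return re.findall(pattern, text)

def ftcontains(source, query, punct_as_token=False):
    """Return True if the query tokens form a contiguous run in the source."""
    src = tokenize(source, punct_as_token)
    qry = tokenize(query, punct_as_token)
    if not qry:
        return False
    return any(src[i:i + len(qry)] == qry
               for i in range(len(src) - len(qry) + 1))

# Punctuation treated like whitespace: both pairs of queries match.
print(ftcontains("a b", "a.b"))
print(ftcontains("Ford Mustang 2000, 65K", "2000 65K"))

# Punctuation as its own token: the same queries no longer match.
print(ftcontains("a b", "a.b", punct_as_token=True))
print(ftcontains("Ford Mustang 2000, 65K", "2000 65K", punct_as_token=True))
```

Under the first reading all four queries from this comment succeed; under the
second reading they fail, because the "." or "," token is present on one side
only. That is the implementation-dependent difference in question.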

Christian, BaseX Team 
http://www.basex.org



Received on Friday, 6 February 2009 19:07:34 UTC