- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 16 Apr 2009 19:30:32 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6830

           Summary: [FT] Thesaurus vs other Match Options
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text 1.0
        AssignedTo: jim.melton@acm.org
        ReportedBy: christian.gruen@gmail.com
         QAContact: public-qt-comments@w3.org

Hi again,

I noticed that combining several match options with the Thesaurus option can be interpreted in different ways. My main question is whether the other match options influence the way the thesaurus works. An example:

  "improving" ftcontains "improve" with stemming

This query should return true. If we add a thesaurus here...

  "improving" ftcontains "optimizing" with stemming with thesaurus..

...and if the thesaurus resolves "optimize" to "improve", I am wondering whether this query will return true, as the thesaurus entries would have to be stemmed as well.

The same question arises with the default match options. E.g.: are diacritics to be removed in the thesaurus?

As a thesaurus can get pretty large, similar to index structures, I would recommend applying all match options while building the thesaurus and BEFORE querying it. Otherwise, thesaurus requests could get pretty expensive. This is why I would propose to extend section 3.4 of the specification:

1. The Language Option must be applied first
2. The Stemming Option must be applied before the Case Option and the Diacritics Option
-> 3. The Thesaurus Option must be applied after all other options

This also makes sense because the thesaurus might not need to be accessed at all if the query term and the document term are equal anyway:

  "A" ftcontains "A" with thesaurus...
  -> should yield true without even checking the thesaurus

I just discovered the following sentence in the first section of the spec:

  "The WGs particularly solicit feedback regarding how thesauri are to be used in combination."
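The ordering proposed above can be sketched roughly as follows. This is only an illustration in Python, not an XQuery Full Text implementation: toy_stem, normalize, build_thesaurus and matches are hypothetical names, the stemmer is a crude stand-in for a real one, and the Language Option (which would be applied first) is left out. It assumes the thesaurus entries are normalized once at build time and that the thesaurus is consulted only after all other options fail to produce a match.

```python
import unicodedata


def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a few English suffixes.
    lowered = word.lower()
    for suffix in ("ing", "ed", "es", "e", "s"):
        if lowered.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def strip_diacritics(word):
    # Decompose characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(word):
    # Proposed order: stemming first, then case and diacritics.
    return strip_diacritics(toy_stem(word).lower())


def build_thesaurus(entries):
    # Apply the match options once, while building the thesaurus.
    table = {}
    for term, synonyms in entries.items():
        key = normalize(term)
        table.setdefault(key, set()).update(normalize(s) for s in synonyms)
    return table


def matches(doc_term, query_term, thesaurus):
    doc, query = normalize(doc_term), normalize(query_term)
    if doc == query:
        # Equal after the other options: the thesaurus is never consulted.
        return True
    # Thesaurus applied last, against already-normalized entries.
    return doc in thesaurus.get(query, set())


thesaurus = build_thesaurus({"optimize": ["improve"]})
print(matches("improving", "improve", {}))
print(matches("improving", "optimizing", thesaurus))
print(matches("A", "A", thesaurus))
```

Because the entries are normalized when the thesaurus is built, each lookup at query time is a single dictionary access on the normalized query term.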
So I hope that my discussion here contributes a little to this issue.

Christian

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Thursday, 16 April 2009 19:30:45 UTC