- From: <bugzilla@wiggum.w3.org>
- Date: Tue, 14 Apr 2009 01:43:32 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6809

Summary: [FT] Test Suite - Thesaurus Queries
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Full Text 1.0
AssignedTo: jim.melton@acm.org
ReportedBy: christian.gruen@gmail.com
QAContact: public-qt-comments@w3.org

Dear task force,

I decided to add a basic Thesaurus implementation to BaseX to support and test the remaining queries. I frankly admit that I'm no Thesaurus expert at all, so I mainly focused on the hints in the specification and the existing tests. As I'm not sure whether I completely understood what's going on in the test examples, here are some more questions/bug indications:

[1] ft-3.4.3-examples-q1
The usability.xml thesaurus file returns the synonym "tasks" for the query input "duties", but the queried document node contains the word only in the singular ("task" instead of "tasks"). Is this intended?

[2] ft-3.4.3-examples-q2
The thesaurus offers the terms "navigation", "layout" and "terminology" for the query phrase "web site components", but none of these terms occurs in the tested document node.

[3] ft-3.4.3-examples-q3.xq
In this query, words similar to "Merrygould" are to be found. As "case insensitive" is the default option, the term is converted to "merrygould" in my tests, so the thesaurus doesn't return any result.

[4] Probably a naïve question: do all thesaurus entries work in a "bidirectional" way? I.e., if "A" is a synonym for "B", do I get "A" if I look for "B", and "B" if I look for "A"? Beyond that, are all relationships bidirectional? One could argue that "Marigold" sounds like "Merrygould", but "Merrygould" doesn't sound like "Marigold". In the latter case, query [3] above would only return results in the direction opposite to the current one.

[5] ft-3.4.3-expressions-q3
The thesaurus returns "software" for the term "program"; this term seems to be included in two books (numbers 1 and 3), but the current result contains only book 1.

[6] ft-3.4.3-expressions-q5
References the missing file "TechnicalThesaurus.xml".

[7] ft-3.4.3-expressions-q6
Parentheses are missing before "default" and after "NT". I guess that the Thesaurus should also accept the original query terms and not only synonyms; is this correct? If yes, then book number 3 should be added to the result, as it contains the term "Computers".

[8] thesaurus-queries-results-q2 / q2b
The relationship used here is "narrower terms" (instead of "NT" or "narrower term"). Do you expect implementations to recognize all such spellings, or ...?

[9] thesaurus-queries-results-q5 / q5b / q6 / q6b
"spellcheck.xml" and "OurTaxonomy.xml" don't exist yet.

[10] full-text-composability-queries-results-q2b
Parsing issue: "]" is missing after "stemming".

[11] full-text-composability-queries-results-q3 / q3b
Parsing issue: some opening and closing parentheses are missing.

I'm currently running the Thesaurus as the last match option, as I saw that the execution order of match options seems to be implementation-defined. It may well be that different orders lead to different results, but I haven't really thought this through.

To conclude, as I indicated in the beginning, my knowledge of thesauri is very limited. So it may be helpful to talk directly to one of you in the near future to get more insight into some of the open issues.

Christian
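
Questions [3] and [4] can be made concrete with a small sketch. It is not BaseX code and not the behaviour required by the specification; it is a toy in-memory thesaurus, written in Python, that only shows how case folding and lookup direction change the result. The class `Thesaurus`, its flags `case_insensitive` and `bidirectional`, and the direction in which the single "sounds like" entry is stored are all assumptions made for illustration.

```python
from collections import defaultdict


class Thesaurus:
    """Toy thesaurus (hypothetical): maps (term, relationship) to related terms."""

    def __init__(self, case_insensitive=False, bidirectional=False):
        self.case_insensitive = case_insensitive
        self.bidirectional = bidirectional
        self._entries = defaultdict(set)

    def _key(self, term):
        # Fold the lookup key only if the thesaurus itself is case-insensitive.
        return term.lower() if self.case_insensitive else term

    def add(self, term, relationship, related):
        self._entries[(self._key(term), relationship)].add(related)
        if self.bidirectional:
            # Also store the reverse direction of the relationship.
            self._entries[(self._key(related), relationship)].add(term)

    def lookup(self, term, relationship):
        # Return a copy so callers cannot modify the stored entries.
        return set(self._entries.get((self._key(term), relationship), set()))


# Case from [3]: the query term has already been folded to lower case,
# but the thesaurus compares terms exactly.
exact = Thesaurus()
exact.add("Merrygould", "sounds like", "Marigold")  # assumed entry direction
no_hit = exact.lookup("merrygould", "sounds like")
one_way = exact.lookup("Marigold", "sounds like")

# Alternative reading: fold thesaurus keys as well and treat entries as bidirectional.
folded = Thesaurus(case_insensitive=True, bidirectional=True)
folded.add("Merrygould", "sounds like", "Marigold")
hit = folded.lookup("merrygould", "sounds like")
reverse_hit = folded.lookup("MARIGOLD", "sounds like")

print(no_hit, one_way, hit, reverse_hit)
```

With exact comparison and one-directional entries, both the folded query term and the reverse lookup come back empty; folding the keys and storing both directions makes both succeed. Which of these readings the test suite expects is exactly what [3] and [4] ask.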
Received on Tuesday, 14 April 2009 01:43:41 UTC