- From: <bugzilla@wiggum.w3.org>
- Date: Mon, 24 Nov 2008 20:51:18 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6195

Mary Holstege <holstege@mathling.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |holstege@mathling.com
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #1 from Mary Holstege <holstege@mathling.com>  2008-11-24 20:51:17 ---

The WG discussed this issue and agreed we need to augment the test suite.
Please note that we have not yet completely implemented the use of this new
system throughout the test suite. If you are satisfied with this resolution,
please mark the bug as closed.

Please note the following addition to the instructions:

<quote>
Special Sources: Stop Word List, Thesaurus, and Stemming Dictionary

The stopwords, thesaurus, and stemming-dictionary sources are not intended to
be used directly in the form in which they are given, but to provide
information to those running the test suite about the expectations a
particular test has about various implementation-specific aspects of the
execution context. Implementations are expected to provide equivalent
information to the query, but in whatever form is appropriate in their
context.

A stopwords source is a plain text file containing a list of stop words, one
per line. When a query references this stop word list, the implementation is
expected to provide that list of stop words to the query.

A thesaurus source is an XML document defined against the thesaurus.xsd XML
Schema. When a query references this thesaurus, the implementation is
expected to provide equivalent thesaurus information to the query.

The stemming-dictionary is a plain text file containing lines of
whitespace-separated tokens. Each token on the line should stem to the first
token on the line. When the catalog entry for a query references a stemming
dictionary, the implementation is expected to provide stemming equivalent to
the rules given in the stemming dictionary.
</quote>

The basic idea is that there are three new kinds of sources: a stop word
list, which is just a text file with one stop word per line; a thesaurus,
which is an XML file as per the schema; and a stemming dictionary, which is a
text file with one stem per line, each followed by the words that stem to it.

The catalog descriptions for stop word lists and thesauri include a URI that
matches up with the one in the query. This is similar to the handling of
schemas. The stemming dictionary has no URI: it is the resource ID that
matters, and it is used to define the relevant stem equivalents when it makes
a difference for stemmed search.

** Changes to XQFTTSCatalog.xsd/xml:

Add three new kinds of source roles: stopwords, thesaurus, and
stemming-dictionary, and corresponding elements in the sources part of the
catalog.

Add an aux-URI element to the test-case itself. Queries that use a URI for a
stop word list should have an aux-URI with role="stopwords"; queries that use
a URI for a thesaurus should have an aux-URI with role="thesaurus". Queries
that rely on particular stemming behaviour should have an aux-URI with
role="stemming-dictionary".
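
As an illustration of the two plain-text formats described above, here is a
minimal Python sketch of how a test harness might read them. The function
names are hypothetical and are not part of the test suite, and the sample
contents are illustrative only. It also assumes tokens are compared exactly as
written; the instructions do not say whether case folding is expected.

```python
import io


def load_stopwords(lines):
    """Return the set of stop words, one per non-blank line."""
    return {line.strip() for line in lines if line.strip()}


def load_stemming_dictionary(lines):
    """Map every token on a line to the first token on that line."""
    stems = {}
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        stem = tokens[0]
        for token in tokens:
            # The first token maps to itself; the rest map to it.
            stems[token] = stem
    return stems


# Illustrative inputs in the two formats (not the real test-suite files).
stopwords = load_stopwords(io.StringIO("and\nthe\nthen\n"))
stems = load_stemming_dictionary(
    io.StringIO("improve improves improving improved\ndog dogs\n")
)

print(stems["improving"])   # improve
print("the" in stopwords)   # True
```

An implementation would then hand this information to its own stop word and
stemming machinery in whatever form is appropriate for it.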

** Examples:

* Stop words:

TestSources/stopwords.txt:

and
the
then
it
of
in

Catalog description:

<stopwords ID="stopwords1"
           uri="http://bstore1.example.com/StopWordList.xml"
           FileName="stopwords.txt"
           Creator="Full-Text Task Force">
   <description last-mod="2008-11-10">Stop word list for use cases</description>
</stopwords>

Query description using stopwords
(with stop words at "http://bstore1.example.com/StopWordList.xml"):

<test-case is-XPath2="true" name="stopwords-1"
           FilePath="Expressions/Operators/CompExpr/FTContainsExpr/FTSelection/MatchOptions/FTStopWord/"
           scenario="standard" Creator="Full-Text Task Force">
   <description>Example using stop words</description>
   <spec-citation spec="XQueryFullText" section-number="3.4.7"
                  section-title="Stop Word Option"
                  section-pointer="ftstopwordoption"/>
   <query name="stopword-1" date="2008-11-10"/>
   <aux-URI role="stopwords">stopwords1</aux-URI>
   <input-file role="principal-data" variable="input-context">ftusecases</input-file>
   <output-file role="principal" compare="XML">stopwords-1.xml</output-file>
</test-case>

* Thesaurus:

(Schema is TestSources/thesaurus.xsd)

TestSources/soundex.xml:

<thesaurus xmlns="http://www.w3.org/xqftts/thesarus">
   <entry>
      <term>Marigold</term>
      <synonym>
         <term>Merrygould</term>
         <relationship>sounds like</relationship>
      </synonym>
   </entry>
</thesaurus>

Catalog description:

<thesaurus ID="soundex"
           uri="http://bstore1.example.com/UsabilitySoundex.xml"
           FileName="soundex.xml"
           Creator="Full-Text Task Force">
   <description last-mod="2008-11-10">Soundex thesaurus for examples</description>
</thesaurus>

Query using thesaurus
(with thesaurus at "http://bstore1.example.com/UsabilitySoundex.xml"):

<test-case is-XPath2="true" name="thesaurus-1"
           FilePath="Expressions/Operators/CompExpr/FTContainsExpr/FTSelection/MatchOptions/FTThesaurus/"
           scenario="standard" Creator="Full-Text Task Force">
   <description>Example using a thesaurus</description>
   <spec-citation spec="XQueryFullText" section-number="3.4.3"
                  section-title="Thesaurus Option"
                  section-pointer="ftthesaurusoption"/>
   <query name="thesaurus-1" date="2008-11-10"/>
   <aux-URI role="thesaurus">soundex</aux-URI>
   <input-file role="principal-data" variable="input-context">ftusecases</input-file>
   <output-file role="principal" compare="XML">thesaurus-1.xml</output-file>
</test-case>

* Stemming:

TestSources/english-stems.txt:

improve improves improving improved
dog dogs
cat cats
train trains training trained
error errors

Catalog description:

<stemming-dictionary ID="english-stems"
                     FileName="english-stems.txt"
                     Creator="Full-Text Task Force">
   <description last-mod="2008-11-10">English stems</description>
</stemming-dictionary>

Query using stemming:

<test-case is-XPath2="true" name="stemming-1"
           FilePath="Expressions/Operators/CompExpr/FTContainsExpr/FTSelection/MatchOptions/FTStemming/"
           scenario="standard" Creator="Full-Text Task Force">
   <description>Example using stemming</description>
   <spec-citation spec="XQueryFullText" section-number="3.4.4"
                  section-title="Stemming Option"
                  section-pointer="ftstemoption"/>
   <query name="stemming-1" date="2008-11-10"/>
   <aux-URI role="stemming-dictionary">english-stems</aux-URI>
   <input-file role="principal-data" variable="input-context">ftusecases</input-file>
   <output-file role="principal" compare="XML">stemming-1.xml</output-file>
</test-case>

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Monday, 24 November 2008 20:51:29 UTC