
[Bug 3747] [FT] FTMatchOptions

From: <bugzilla@wiggum.w3.org>
Date: Mon, 18 Sep 2006 20:07:22 +0000
CC:
To: public-qt-comments@w3.org
Message-Id: <E1GPPOk-0006TN-QY@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3747

           Summary: [FT] FTMatchOptions
           Product: XPath / XQuery / XSLT
           Version: Working drafts
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text
        AssignedTo: jim.melton@acm.org
        ReportedBy: holstege@mathling.com
         QAContact: public-qt-comments@w3.org


Section 3.2 (FTMatchOptions)

Second sentence after EBNF. (Technical)
Most of the match options do not, in fact, modify the sets of tokens and
phrases in the query. As the semantics are now defined, they affect how the
tokens and phrases in the query are matched against the tokens and phrases
in the search items. True, an implementation may process, say, a stemming
option by constructing a different set of search items. That said, we also
need to be crisper about whether match options apply to how the query string
is processed or to how the documents are processed. Should the language of
the document really affect how search strings are stemmed? And what of a
search against a collection of documents? Here I think we are saying that
you cannot tokenize your documents until you have seen the query, which is
wrong. What we should be saying is that these options may affect which query
tokens are produced and, either thereby or separately, how matching is done.
We should not say that they affect how documents are tokenized, because that
precludes all but small-scale implementations. With up-front document
indexing, tokenization may produce several different results in different
indices, and the match options then become a matter of selecting the
appropriate index.
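
To illustrate the up-front indexing point, here is a minimal sketch in
Python. It is not taken from the specification or from any real
implementation. All names (toy_stem, build_indices, search) are hypothetical,
and the stemmer is a crude suffix stripper standing in for a real one. The
documents are tokenized once into several indices before any query is seen;
the match options only select an index and control how the query tokens are
normalized.

```python
import re
from collections import defaultdict


def toy_stem(token):
    # Hypothetical stand-in for a real stemmer: strips a few English suffixes.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    # Assumption: tokens are simply runs of word characters.
    return re.findall(r"\w+", text)


def normalize(token, stemming, case_sensitive):
    if not case_sensitive:
        token = token.lower()
    if stemming:
        token = toy_stem(token)
    return token


def build_indices(docs):
    # One inverted index per (stemming, case_sensitive) combination,
    # all built before any query has been seen.
    indices = {}
    for stemming in (False, True):
        for case_sensitive in (False, True):
            index = defaultdict(set)
            for doc_id, text in docs.items():
                for tok in tokenize(text):
                    index[normalize(tok, stemming, case_sensitive)].add(doc_id)
            indices[(stemming, case_sensitive)] = index
    return indices


def search(indices, query, stemming=False, case_sensitive=False):
    # The match options select an index and shape the query tokens;
    # the documents themselves are never re-tokenized.
    index = indices[(stemming, case_sensitive)]
    result = None
    for tok in tokenize(query):
        hits = index.get(normalize(tok, stemming, case_sensitive), set())
        result = set(hits) if result is None else result & hits
    return result or set()


docs = {1: "Indexing documents", 2: "An index of Documents"}
indices = build_indices(docs)
print(search(indices, "documents", case_sensitive=True))
print(search(indices, "index", stemming=True))
```

In this sketch a case-sensitive search and a stemmed search over the same
documents differ only in which prebuilt index they consult, which is the
behaviour the wording of the specification should leave room for.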
Received on Monday, 18 September 2006 20:07:37 UTC
