- From: <bugzilla@jessica.w3.org>
- Date: Tue, 09 Nov 2010 11:32:20 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11272
Summary: [FT] Tokenization and wildcards
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Platform: PC
OS/Version: Windows NT
Status: NEW
Severity: normal
Priority: P2
Component: Full Text 1.0
AssignedTo: jim.melton@acm.org
ReportedBy: tim@cbcl.co.uk
QAContact: public-qt-comments@w3.org
It is also unclear whether query tokenization and search context tokenization
are necessarily performed by the same function, and how matching and
implementation-defined tokenization interact.
Section 4.1 Tokenization seems to address only the requirements of search
context tokenization (identification of tokens with position, sentence and
paragraph), and suggests a function of the form:
    declare function fts:tokenize( $searchContext as item(),
                                   $language as xs:string? )
        as element(fts:tokenInfo)* external;
$language is an argument because Section 3.4.1 Language Option states that the
language option can affect tokenization.
Section 3.2 states:
"Otherwise, each of those strings is tokenized into a sequence of tokens as
described in Section 4.1 Tokenization."
However, tokenization of the search tokens must use a different process: it
must vary depending on the wildcard option, it doesn't attempt to identify
sentence and paragraph boundaries, and it returns fts:queryToken values rather
than fts:tokenInfo values. This suggests a function of the form:
    declare function fts:tokenizeQuery( $ftWordsValue as xs:string*,
                                        $language as xs:string?,
                                        $wildcardOptionEnabled as xs:boolean )
        as element(fts:queryToken)* external;
The $wildcardOptionEnabled argument specifies how the query tokenizer should
handle wildcard indicators.
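As an illustration of why the query tokenizer needs to know about the wildcard
option, here is a minimal sketch. The sample document and the search string
are made up for this example, and the outcome of the second comparison depends
on implementation-defined tokenization:

```xquery
(: Hypothetical sample data, not taken from the specification. :)
let $doc :=
  <books>
    <book><title>Improving Web Site Usability</title></book>
  </books>
return (
  (: Wildcards enabled: ".*" should be read as "zero or more characters",
     so the query tokenizer must keep it attached to "improv". :)
  $doc//title contains text "improv.*" using wildcards,

  (: Wildcards disabled: ".*" is ordinary punctuation, and what the
     tokenizer does with it is implementation-defined. :)
  $doc//title contains text "improv.*" using no wildcards
)
```

The same string literal has to yield different query tokens in the two cases,
which is the distinction the $wildcardOptionEnabled argument is meant to
capture.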
Is my understanding correct?
Received on Tuesday, 9 November 2010 11:32:24 UTC