
[Bug 6946] New: [FT] Test Suite - Wildcards

From: <bugzilla@wiggum.w3.org>
Date: Sat, 23 May 2009 17:06:07 +0000
To: public-qt-comments@w3.org
Message-ID: <bug-6946-523@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6946

           Summary: [FT] Test Suite - Wildcards
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text 1.0
        AssignedTo: jim.melton@acm.org
        ReportedBy: christian.gruen@gmail.com
         QAContact: public-qt-comments@w3.org


Dear Pat, dear all,

I eventually decided to implement my own wildcard evaluator, so that the more
sophisticated wildcard expressions accepted by a regular expression matcher
are no longer supported.

As a consequence, I would now expect empty results for the following three
queries:

[1] ftwildcard-q4.xq
    .//content ftcontains "task?" with wildcards

As the question mark is not preceded by a period, the literal term "task?" will
be matched against the text.


[2] ftwildcard-q13.xq
    $cont ftcontains "specialist\."

As the period is preceded by a backslash, the resulting literal term is
"specialist."


[3] ftwildcard-q14.xq
    $cont ftcontains "nex.\?"

The period will be treated as a placeholder for one arbitrary character, and
the escaped question mark will be searched for as a literal character.
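
The reading of the three queries above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
evaluator from any actual implementation: the names parse_wildcards and
wildcard_match are hypothetical, matching is case-sensitive, and the sketch
assumes that a period may be followed by one of the qualifiers ?, *, + or
{n,m}, that a backslash makes the next character literal, and that every other
character (including a bare question mark) stands for itself.

```python
def parse_wildcards(pattern):
    """Split a wildcard term into literal items and '.'-based wildcard items."""
    items, i, n = [], 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Escaped character: always literal.
            items.append(("lit", pattern[i + 1]))
            i += 2
        elif ch == ".":
            i += 1
            lo, hi = 1, 1  # a bare period: exactly one arbitrary character
            if i < n and pattern[i] in "?*+":
                lo, hi = {"?": (0, 1), "*": (0, None), "+": (1, None)}[pattern[i]]
                i += 1
            elif i < n and pattern[i] == "{":
                close = pattern.find("}", i)
                if close == -1:
                    raise ValueError("unterminated {n,m} qualifier")
                lo_text, _, hi_text = pattern[i + 1:close].partition(",")
                lo, hi = int(lo_text), int(hi_text)
                i = close + 1
            items.append(("any", lo, hi))
        else:
            # Everything else, including a '?' without a preceding period.
            items.append(("lit", ch))
            i += 1
    return items


def _match(items, token, i, j):
    """Backtracking match of items[i:] against token[j:]."""
    if i == len(items):
        return j == len(token)
    item = items[i]
    if item[0] == "lit":
        return (j < len(token) and token[j] == item[1]
                and _match(items, token, i + 1, j + 1))
    _, lo, hi = item
    remaining = len(token) - j
    hi = remaining if hi is None else min(hi, remaining)
    return any(_match(items, token, i + 1, j + k) for k in range(lo, hi + 1))


def wildcard_match(pattern, token):
    """Return True if the whole token matches the wildcard term."""
    return _match(parse_wildcards(pattern), token, 0, 0)


print(wildcard_match("task?", "task"))                 # False
print(wildcard_match("specialist\\.", "specialist"))   # False
print(wildcard_match("nex.\\?", "next"))               # False
print(wildcard_match("nex.\\?", "next?"))              # True
```

With a tokenizer that strips punctuation, the text only supplies tokens such as
"task", "specialist" and "next", so all three queries come back empty under
this reading.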


Indeed, the "ftusecases.xml" document contains the substrings "task?",
"specialist.", and "next?". However, if the full-text tokenizer interprets
periods as sentence delimiters, and not as parts of tokens, the periods will
not be passed on to the wildcard evaluator. This is why both of the following
queries

  "task?" ftcontains "task?",
  "task?" ftcontains "task?" with wildcards

will lead to the internal comparison "task" <-> "task?", and will both return
false.
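
A minimal Python sketch of this effect, assuming a tokenizer that keeps only
runs of word characters and a search term that is compared as written. Both
tokenize_words and contains_term are hypothetical names chosen for
illustration; real tokenizers are implementation-defined and may behave
differently.

```python
import re


def tokenize_words(text):
    # Assumed tokenizer: keeps runs of word characters, drops punctuation.
    return re.findall(r"\w+", text.lower())


def contains_term(text, term):
    # The search term is not tokenized here; it is compared character by
    # character, which is also what happens to "task?" with wildcards when
    # the question mark is read as a literal.
    return any(token == term.lower() for token in tokenize_words(text))


print(tokenize_words("task?"))          # ['task']
print(contains_term("task?", "task?"))  # False: "task" is compared to "task?"
print(contains_term("task?", "task"))   # True
```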

The tokenizer could add periods and other special characters to the returned
tokens. This, however, would lead to surprising results with normal ftcontains
operations:

  "is this a task?" ftcontains "task"  ->  false
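
The same result can be reproduced with a small Python sketch, assuming a
tokenizer that splits on whitespace only and therefore leaves punctuation
attached to the tokens. The names tokenize_keep_punctuation and contains_token
are hypothetical and only serve the illustration.

```python
def tokenize_keep_punctuation(text):
    # Assumed alternative tokenizer: split on whitespace, keep punctuation.
    return text.lower().split()


def contains_token(text, term):
    # Plain token equality, without any wildcard handling.
    return term.lower() in tokenize_keep_punctuation(text)


print(tokenize_keep_punctuation("is this a task?"))  # ['is', 'this', 'a', 'task?']
print(contains_token("is this a task?", "task"))     # False
print(contains_token("is this a task?", "task?"))    # True
```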


Looking forward to your reply - thanks,
Christian


Received on Saturday, 23 May 2009 17:18:00 GMT
