[Bug 6303] New: FT: TokenInfo and StringInclude definition

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6303

           Summary: FT: TokenInfo and StringInclude definition
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: PC
               URL: http://www.w3.org/TR/xpath-full-text-10/
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text 1.0
        AssignedTo: jim.melton@acm.org
        ReportedBy: peter.pleshachkov@gmail.com
         QAContact: public-qt-comments@w3.org


Dear authors of the XQuery Full-Text Specification,

Please clarify the following issues:

1. I am a bit confused by the definitions of TokenInfo and StringInclude.

[Definition: A TokenInfo represents a contiguous collection of tokens
from an XML document. ]

[Definition: A StringInclude is a StringMatch that describes a
TokenInfo that must be contained in the document.]

The UML Static Class diagram of AllMatches shows a one-to-one
correspondence between StringMatch and TokenInfo.

However, from the XML Schema definition:

 <xs:element name="stringInclude"
             type="fts:stringMatch" />


 <xs:complexType name="stringMatch">
   <xs:sequence>
     <xs:element ref="fts:tokenInfo"/>
   </xs:sequence>
   <xs:attribute name="queryPos"
                 type="xs:integer"
                 use="required"/>
   <xs:attribute name="isContiguous"
                 type="xs:boolean"
                 use="required"/>
 </xs:complexType>

 <xs:complexType name="tokenInfo">
   <xs:attribute name="startPos"
                 type="xs:integer"
                 use="required"/>
   <xs:attribute name="endPos"
                 type="xs:integer"
                 use="required"/>
   <xs:attribute name="startSent"
                 type="xs:integer"
                 use="required"/>
   <xs:attribute name="endSent"
                 type="xs:integer"
                 use="required"/>
   <xs:attribute name="startPara"
                 type="xs:integer"
                 use="required"/>
   <xs:attribute name="endPara"
                 type="xs:integer"
                 use="required"/>
 </xs:complexType>

 <xs:element name="tokenInfo" type="fts:tokenInfo"/>

it follows that a StringMatch can contain a SEQUENCE of tokenInfo
elements, so we have a one-to-many relationship.

Please clarify the correct relationship between StringMatch and tokenInfo.
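
To make the question concrete, here is a minimal sketch that counts the
tokenInfo children of a stringInclude element. The instance fragment, its
attribute values, and the namespace URI are all made up for illustration and
are not taken from the specification. Under the one-to-one reading the count
would always be 1; under the one-to-many reading it could be larger.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
FTS_NS = "urn:example:fts"

# Hand-written instance fragment shaped like the stringMatch type quoted above.
# All attribute values are invented for the example.
SAMPLE = f"""
<fts:stringInclude xmlns:fts="{FTS_NS}" queryPos="1" isContiguous="true">
  <fts:tokenInfo startPos="1" endPos="2"
                 startSent="1" endSent="1"
                 startPara="1" endPara="1"/>
</fts:stringInclude>
"""


def count_token_infos(xml_text: str) -> int:
    """Return how many tokenInfo children the root element carries."""
    root = ET.fromstring(xml_text)
    return len(root.findall(f"{{{FTS_NS}}}tokenInfo"))


token_info_count = count_token_infos(SAMPLE)
print(token_info_count)
```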


2. In section 4.2.7.9 FTDistance you give an example: ("Ford Mustang"
ftand "excellent") distance at most 3 words

At the end you say: "The result for the FTDistance selection
consists of only the first Match (with positions 1, 2, and 5) and the
fifth Match (with positions 25, 27, and 28), because only for these
Matches the word distance between consecutive TokenInfos is always
less than or equal to 3. It is 1 for the first pair and 3 for the
second in the first case, and 2 and 1 in the second."

Here, for the first Match, you have two StringIncludes (shown in the diagram):
1) first StringInclude with startPos = 1 and endPos=2
2) second StringInclude with startPos = 5 (endPos = 5)

But what are the consecutive pairs? It looks like we have two
StringIncludes and therefore only ONE pair, with distance = 5 - 2 - 1 = 2,
but you say "It is 1 for the first pair and 3 for the second in the first
case", which implies a different definition.

Please clarify how you define the consecutive pairs.
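
The arithmetic behind the question can be sketched as follows. The helper
names and the two "readings" are my own illustration, not definitions from
the specification. The first reading treats each StringInclude as one span
and counts the tokens strictly between spans; the second takes every token
position separately and uses the plain difference of positions. The second
reading happens to reproduce the numbers in the quoted text, but I do not
know whether it is the intended definition.

```python
from typing import List, Tuple

# (startPos, endPos): a simplified stand-in for a TokenInfo.
Span = Tuple[int, int]


def tokens_between(spans: List[Span]) -> List[int]:
    """Number of tokens strictly between each pair of consecutive spans."""
    ordered = sorted(spans)
    return [b[0] - a[1] - 1 for a, b in zip(ordered, ordered[1:])]


def position_differences(positions: List[int]) -> List[int]:
    """Plain difference between each pair of consecutive token positions."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Reading 1: one span per StringInclude, as in the diagram for the first Match.
by_string_include = tokens_between([(1, 2), (5, 5)])

# Reading 2: one entry per token position, using plain differences.
first_match = position_differences([1, 2, 5])
fifth_match = position_differences([25, 27, 28])

print(by_string_include)  # [2]
print(first_match)        # [1, 3]
print(fifth_match)        # [2, 1]
```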


Thank you in advance,
Peter Pleshachkov



Received on Thursday, 11 December 2008 15:30:59 UTC