[Bug 5122] [FT] Section 4 Tokenization constraint

From: <bugzilla@wiggum.w3.org>
Date: Mon, 01 Oct 2007 19:40:35 +0000
CC:
To: public-qt-comments@w3.org
Message-Id: <E1IcR87-0001Xy-Pg@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5122

           Summary: [FT] Section 4 Tokenization constraint
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text
        AssignedTo: jim.melton@acm.org
        ReportedBy: holstege@mathling.com
         QAContact: public-qt-comments@w3.org


The definition of tokenization in section 4 includes the rule:

"The tokenizer MUST, when tokenizing two equal items, identify the same tokens
in each."

This is too strong.  The context in which an item occurs may affect how that
item is tokenized.  As a simple example: two equal items may have parent
elements that specify different xml:lang values.  Other implementation-specific
configuration information may also apply to ancestors of an item and affect
how the item itself is tokenized.
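
To make the xml:lang case concrete, here is a minimal sketch in Python. It is
not taken from the specification or from any implementation: the sample
document, the tokenize() rules, and the tokens_by_context() helper are all
assumptions chosen for illustration. Two <p> elements with identical content
sit under parents with different xml:lang values, and a toy language-aware
tokenizer yields different tokens for each.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: two <p> items with equal content, whose parents
# carry different xml:lang values.
DOC = """<doc>
  <sec xml:lang="en"><p>l'heure d'or</p></sec>
  <sec xml:lang="fr"><p>l'heure d'or</p></sec>
</doc>"""


def tokenize(text, lang):
    """Toy tokenizer (an assumption, not a real full-text implementation).

    For French, split elided forms at the apostrophe; for any other
    language, keep apostrophes inside a token.
    """
    pattern = r"\w+" if lang == "fr" else r"\w+(?:'\w+)*"
    return re.findall(pattern, text)


def tokens_by_context(elem, inherited_lang=None):
    """Walk the tree, passing the in-scope xml:lang down to descendants."""
    lang = elem.get(XML_LANG, inherited_lang)
    results = []
    if elem.tag == "p":
        results.append((lang, tokenize(elem.text or "", lang)))
    for child in elem:
        results.extend(tokens_by_context(child, lang))
    return results


results = tokens_by_context(ET.fromstring(DOC))
for lang, tokens in results:
    print(lang, tokens)
```

Under these assumed rules the item in the "en" context yields two tokens and
the equal item in the "fr" context yields four, so a tokenizer that honors
inherited xml:lang cannot also satisfy the quoted MUST.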
Received on Monday, 1 October 2007 19:40:44 GMT