[Bug 1394] Improvement to fn:tokenize function from bugzilla@wiggum.w3.org on 2005-05-12 (public-qt-comments@w3.org from May 2005)

From: <bugzilla@wiggum.w3.org>
Date: Thu, 12 May 2005 19:49:45 +0000
To: public-qt-comments@w3.org
Cc:
Message-Id: <E1DWJgn-0005qR-OF@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1394

------- Additional Comments From mike@saxonica.com  2005-05-12 19:49 -------
Thanks for the comment, Mukul. We did try to design a function that provided
this capability but found that it was too difficult to do as a pure function
because of the complexity of the result. Providing access to a secondary result
using an ancillary function delim() might seem natural in an XSLT context, but
it doesn't fit the stricter functional style of XPath and XQuery. XQuery avoids
such context-dependent functions because they make the job of the optimizer much
harder.

So instead we provided this functionality in XSLT through the xsl:analyze-string
instruction, which has two sub-instructions, matching-substring and
non-matching-substring. This is similar to tokenize() except that both the
tokens and the separators are returned (and you can also get access to the
matched subgroups within the matched pattern using the ancillary regex-group()
function).
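
As a rough illustration of what "both the tokens and the separators are
returned" means, here is a minimal Python sketch. It is only an analogue of the
idea, not XSLT and not how any XSLT processor implements xsl:analyze-string.
The function name analyze_string and the "match" / "non-match" labels are
hypothetical, and skipping zero-length matches is a simplification made for
this sketch.

```python
import re

def analyze_string(text, pattern):
    """Split `text` into an ordered list of (kind, substring) pairs.

    kind is "match" for substrings matched by `pattern` (the separators,
    in tokenize() terms) and "non-match" for the text between them
    (the tokens). Hypothetical helper, for illustration only.
    """
    parts = []
    pos = 0
    for m in re.finditer(pattern, text):
        if m.start() == m.end():
            continue  # simplification: ignore zero-length matches
        if m.start() > pos:
            parts.append(("non-match", text[pos:m.start()]))
        parts.append(("match", m.group(0)))
        pos = m.end()
    if pos < len(text):
        parts.append(("non-match", text[pos:]))
    return parts

result = analyze_string("a, b;c", r"[,;]\s*")
```

With this input the tokens "a", "b" and "c" come back labelled "non-match",
interleaved with the separators ", " and ";" labelled "match", so no
context-dependent delim() call is needed to recover the delimiters.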

You might also be interested to know that in Saxon I have provided the functionality of
xsl:analyze-string as an extension function saxon:analyze-string so that it is
available to XQuery users. This exploits Saxon's support for higher-order
functions: once XQuery supports higher-order functions in some future release it
will be much easier to design functions that do this kind of job.
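
To show why higher-order functions help here, the following Python sketch
passes the two "branches" in as function arguments. It does not reproduce the
actual signature of saxon:analyze-string. The name analyze_string_with and the
parameters on_match and on_non_match are assumptions made for illustration.

```python
import re

def analyze_string_with(text, pattern, on_match, on_non_match):
    """Apply on_match to each regex match object and on_non_match to each
    intervening substring, returning the results in document order.
    Hypothetical helper, for illustration only.
    """
    out = []
    pos = 0
    for m in re.finditer(pattern, text):
        if m.start() == m.end():
            continue  # simplification: ignore zero-length matches
        if m.start() > pos:
            out.append(on_non_match(text[pos:m.start()]))
        out.append(on_match(m))  # the callback can read captured groups
        pos = m.end()
    if pos < len(text):
        out.append(on_non_match(text[pos:]))
    return out

parts = analyze_string_with(
    "2005-05-12",
    r"(\d+)",
    lambda m: int(m.group(1)),  # matched substring: convert group 1 to int
    lambda s: s,                # non-matching substring: keep as is
)
```

Because the match is handed to the callback as an argument, the captured
groups are available without any function that depends on hidden context,
which is the property the pure functional style requires.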

This is a personal response; you will get an official response from the WGs in
due course.

Michael Kay

Received on Thursday, 12 May 2005 19:49:49 UTC