
[Bug 29752] [XSLT30]two accumulator examples using count(tokenize(., '\s+')) respectively count(tokenize(., '\W+')) to count words give odd results

From: <bugzilla@jessica.w3.org>
Date: Fri, 30 Sep 2016 10:00:38 +0000
To: public-qt-comments@w3.org
Message-ID: <bug-29752-523-H5Q9xlqJ3s@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=29752

--- Comment #1 from Michael Kay <mike@saxonica.com> ---
Thanks for pointing this out and sorry for the delay in responding.

There seem to be two things wrong with count(tokenize(., '\W+')):

(a) it counts 1 for a whitespace text node

(b) for other text nodes, it gives a count that is 1 too high.

One solution would simply be to subtract 1 from the count.
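
To see where the extra token comes from, here is a rough Python analogue (not XPath itself). It assumes that Python's re.split treats a separator at the start or end of the input the way fn:tokenize does for a non-empty string, namely by emitting a zero-length string there. The function name and the sample sentence are made up for illustration.

```python
import re

def count_by_nonword_split(text: str) -> int:
    # Rough analogue of count(tokenize(., '\W+')) for a non-empty string:
    # split on runs of non-word characters and count every piece,
    # including zero-length pieces at the edges.
    return len(re.split(r"\W+", text))

sentence = "The quick brown fox."          # hypothetical sample, 4 words
print(re.split(r"\W+", sentence))          # ['The', 'quick', 'brown', 'fox', '']
print(count_by_nonword_split(sentence))    # 5, one more than the word count
print(count_by_nonword_split("\n  ") > 0)  # True: whitespace-only text still counts
```

The trailing full stop is a separator at the end of the input, so the split yields a final zero-length string that inflates the count.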

I'm inclined, though, to use the new XPath 3.1 tokenize#1:

<xsl:accumulator-rule match="text()" 
         select="$value + count(tokenize(.))"/>

which gives the correct answer 14 (whether or not we strip whitespace text
nodes) without making the example a lot more complicated.
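
As a sketch of why the one-argument form behaves better, the Python analogue below uses str.split() with no arguments. Like tokenize#1, it ignores leading and trailing whitespace and splits on runs of whitespace. This is only an approximation: Python recognises more whitespace characters than XPath does, and the function name and sample strings are made up for illustration.

```python
def count_whitespace_tokens(text: str) -> int:
    # str.split() with no arguments drops leading/trailing whitespace and
    # splits on runs of whitespace, so no zero-length tokens are produced.
    return len(text.split())

print(count_whitespace_tokens("The quick brown fox."))      # 4
print(count_whitespace_tokens("\n   "))                     # 0
print(count_whitespace_tokens("  leading and trailing  "))  # 3
```

A whitespace-only string contributes nothing to the total, which is consistent with the result not depending on whether whitespace text nodes are stripped.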

Added as test accumulator-058.

Received on Friday, 30 September 2016 10:00:56 UTC
