
[Bug 29752] [XSLT30] two accumulator examples using count(tokenize(., '\s+')) respectively count(tokenize(., '\W+')) to count words give odd results

From: <bugzilla@jessica.w3.org>
Date: Fri, 30 Sep 2016 11:47:31 +0000
To: public-qt-comments@w3.org
Message-ID: <bug-29752-523-Ej9cIWuvqr@http.www.w3.org/Bugs/Public/>

--- Comment #2 from Martin Honnen <martin.honnen@gmx.de> ---
(In reply to Michael Kay from comment #1)

> I'm inclined though to use the new XPath 3.1 tokenize#1
> <xsl:accumulator-rule match="text()" 
>          select="$value + count(tokenize(.))"/>
> which gives the correct answer 14 (whether or not we strip whitespace text
> nodes) without making the example a lot more complicated.

That looks much better than the previous approaches using tokenize(., '\W+'). 

It is still easy to construct input samples such as

<p>He asked:"Does it work?"</p>

where counting the \w+ matches found by analyze-string would work better, but I
agree that the accumulator examples need to stay short: they should demonstrate
the use of accumulators, not present more elaborate attempts at word counting.
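
To make the difference concrete for that sample, here is a rough sketch in
Python rather than XPath. Python's re module and str.split() only approximate
the XPath regex flavour and tokenize#1 (for example, str.split() recognises
more whitespace characters), and the function names are made up for
illustration, so treat the numbers as indicative of the three approaches
rather than as output from an XSLT processor.

```python
import re

# Sample text node content from the <p> element above.
sample = 'He asked:"Does it work?"'


def count_whitespace_tokens(text: str) -> int:
    """Approximates count(tokenize(.)): split on whitespace, no empty tokens."""
    return len(text.split())


def count_word_runs(text: str) -> int:
    """Approximates count(analyze-string(., '\\w+')/fn:match)."""
    return len(re.findall(r"\w+", text))


def count_nonword_split(text: str) -> int:
    """Approximates count(tokenize(., '\\W+')) for non-empty input.

    A separator at the very end of the string yields a trailing empty token,
    which is counted as well.
    """
    return len(re.split(r"\W+", text))


print(count_whitespace_tokens(sample))  # 4: 'asked:"Does' stays one token
print(count_word_runs(sample))          # 5: He, asked, Does, it, work
print(count_nonword_split(sample))      # 6: the five words plus a trailing ''
```

Only the \w+ match count arrives at five words here; the whitespace split
merges asked:"Does into one token, and the \W+ split adds an empty token for
the trailing ?" separator.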

Received on Friday, 30 September 2016 11:47:44 UTC
