
Re: [xslt2 func/op] tokenizing "abba" to ("a","b","b","a")

From: David Carlisle <davidc@nag.co.uk>
Date: Tue, 19 Aug 2003 09:56:48 +0100
Message-Id: <200308190856.JAA17487@penguin.nag.co.uk>
To: tobiasreif@pinkjuice.com
CC: public-qt-comments@w3.org, ashokma@microsoft.com

My reading of the F&O spec agrees with Tobi's: .? should not use the
empty string as a separator. Rather, each character should be taken
as a separator, and so the result should be the empty sequence.

The sentence 
> If the supplied $pattern matches a zero length string...

should in any case (as this thread shows) be clarified, but as it stands
I think the natural interpretation of "matches" here is the
interpretation of matches used in replace(). In this case that is a
greedy match, because .? is greedy; since it is the entire regexp, it is
equivalent to . and will match each character.
While .? _could_ match an empty string, it doesn't here, so the empty
string should not be used as a separator.
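
To make the greedy-match point concrete, here is a small sketch in
Python. It assumes Python's re module as a stand-in for the F&O regexp
flavour (the two are not identical, so this only illustrates greedy
versus non-greedy behaviour, not what the spec mandates). The helper
name match_lengths is made up for this example.

```python
import re

def match_lengths(pattern, text):
    """Return the length of the match found when `pattern` is tried
    at each offset of `text`, including the end-of-string offset."""
    compiled = re.compile(pattern)
    # Both patterns used below can always match (at worst the empty
    # string), so match() never returns None here.
    return [len(compiled.match(text, pos).group())
            for pos in range(len(text) + 1)]

# Greedy .? takes one character wherever one is available; only at
# the very end of the input does it fall back to the empty string.
greedy = match_lengths(r".?", "abba")

# The non-greedy form .?? prefers the empty match at every offset.
lazy = match_lengths(r".??", "abba")

print(greedy)
print(lazy)
```

With the greedy form, the match at each of the four character positions
has length 1, which is the sense in which .? behaves like . on "abba".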

This sentence should not be changing the matching rule of the regexp,
just specifying that if the effective separator is "", the behaviour
is to split at each character rather than to raise an error or take the
whole string as one token.


Received on Tuesday, 19 August 2003 04:57:12 UTC
