Re: [xslt2 func/op] tokenizing "abba" to ("a","b","b","a")

Kay, Michael wrote:
> We decided that fn:tokenize("abba", "") should be an error; more 
> specifically, fn:tokenize($in, $regex) is an error if fn:matches("", 
> $regex) is true.

I do not agree that this is a good decision. Many programmers will be 
annoyed to find out that this common and useful functionality is 
missing; I certainly will be.

> This means we are removing the functionality for fn:tokenize to split a 
> string into its individual characters.

I see no reason to do so, and I advise keeping it.

> There are other ways of doing 
> this. We looked at the specs (and actual behavior) for Perl and Java, 
> with different settings of the "limit" parameter, and decided that 
> choosing any one of the available behaviors was likely to be confusing 
> to a significant number of our users.

Removing it completely is *much* more of a problem. Logically it makes 
sense to do tokenize("abba", ""), so you should be able to specify 
this functionality, no matter what other languages do. Other 
specifications will always differ; that shouldn't keep you from 
completing yours.
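
To make concrete what I am asking for, here is a minimal sketch in Java 
of the result I would expect from tokenize("abba", ""). The class and 
method names (TokenizeChars, tokenizeEmpty) are made up for this 
illustration; they are not part of any spec or library, and the 
one-token-per-code-point rule is my assumption about the most natural 
definition, not something the working group has agreed on.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizeChars {
    // Hypothetical helper: one possible result for tokenize($in, ""),
    // assuming an empty separator means "split between every character".
    // Walks the string by Unicode code point so that surrogate pairs
    // stay together as a single token.
    static List<String> tokenizeEmpty(String in) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            int cp = in.codePointAt(i);
            tokens.add(new String(Character.toChars(cp)));
            i += Character.charCount(cp);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints the list of single-character tokens for "abba".
        System.out.println(tokenizeEmpty("abba"));
    }
}
```

Under this definition an empty input simply yields an empty sequence, 
and no leading or trailing empty tokens are produced, so there is 
nothing left for the user to be confused about.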

> Making it an error

Specifying a separator by supplying a regex which matches a zero-length 
string *is not an error*. If the spec or an XSLT processor tells me that 
it is, that is not helpful at all.

> keeps our 
> options open for the future, whereas if we get it wrong we are stuck 
> with it for ever.

I invested weeks of discussion to improve the one "abba" example in the 
spec and to potentially improve the functionality of this function, only 
to be informed that instead of improving it you decided to remove it.

Please offer this basic, common, and helpful functionality to your users.

I don't think the spec will improve when more and more stuff gets 
removed; there are better ways to improve it.

Tobi

Received on Tuesday, 23 September 2003 06:28:46 UTC