Re: [xslt2 func/op] tokenizing "abba" to ("a","b","b","a")

Ashok Malhotra wrote:
 > Yes, but the spec says that if reluctant quantifiers are used, i.e.
 > those with ?, then the regex "matches the shortest possible substring
 > consistent with the match as a whole succeeding."

This doesn't seem to apply to the regex in the spec's example, ".?", 
since its quantifier is the plain (greedy) ?, not a reluctant one.

It's up to you, and I don't want to be a nuisance, but a few points 
still seem unclear:

http://www.w3.org/TR/xquery-operators/#regex-syntax
"Reluctant quantifiers are supported. Specifically:

     * X?? matches X, once or not at all
     * X*? matches X, zero or more times
     * X+? matches X, one or more times"
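
As a rough illustration of the greedy/reluctant difference, here is a 
minimal sketch using Python's re module. This is an assumption made for 
illustration only: Python's regex flavour is not the one the spec 
defines, although it also reads X??, X*? and X+? as reluctant. The 
helper name first_match is made up for this example.

```python
import re
from typing import Optional


def first_match(pattern: str, text: str) -> Optional[str]:
    """Hypothetical helper: return what `pattern` matches at the start of `text`."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


# Greedy quantifiers take as much as they can.
greedy_optional = first_match(".?", "abba")      # 'a'
greedy_star = first_match("b*", "bbb")           # 'bbb'
greedy_plus = first_match("b+", "bbb")           # 'bbb'

# Reluctant quantifiers take as little as they can.
reluctant_optional = first_match(".??", "abba")  # '' (zero-length match)
reluctant_star = first_match("b*?", "bbb")       # '' (zero-length match)
reluctant_plus = first_match("b+?", "bbb")       # 'b'
```

In this engine at least, only ".??" gives a zero-length match at the 
start of "abba"; ".?" consumes the "a".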

So should the example
  fn:tokenize("abba", ".?") returns ("a", "b", "b", "a")
be changed to
  fn:tokenize("abba", ".??") returns ("a", "b", "b", "a")
?

What does tokenize return with ".?" then? The empty sequence or a 
sequence of zero-length strings? And with "."?
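
To make the "." case concrete, here is a small sketch with Python's 
re.split. Again this is only an assumption for illustration: re.split 
is not fn:tokenize, and the spec may well define different behaviour.

```python
import re

# Every character of "abba" is matched by "." and so acts as a separator.
# Four separators leave five zero-length strings around them.
tokens = re.split(".", "abba")
print(tokens)  # ['', '', '', '', '']
```

Whether fn:tokenize should return such zero-length strings or drop them 
is exactly what I'm asking.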

I'd still add
  fn:tokenize("abba", "") returns ("a", "b", "b", "a")
since it's the clearest. I can't see why someone would write anything 
other than "" or ".{0}" to match the empty string.

Tobi

-- 
http://www.pinkjuice.com/

Received on Tuesday, 19 August 2003 04:15:35 UTC