W3C home > Mailing lists > Public > public-qt-comments@w3.org > August 2003

Re: [xslt2 func/op] tokenizing "abba" to ("a","b","b","a")

From: Tobias Reif <tobiasreif@pinkjuice.com>
Date: Tue, 19 Aug 2003 10:13:58 +0200
Message-ID: <3F41DC46.7070305@pinkjuice.com>
To: public-qt-comments@w3.org
CC: Ashok Malhotra <ashokma@microsoft.com>

Ashok Malhotra wrote:
 > Yes, but the spec says that if reluctant quantifiers are used, i.e.
 > those with ?, then the regex "matches the shortest possible substring
 > consistent with the match as a whole succeeding."

This doesn't seem to apply to the regex in the example in the spec, 
namely ".?".

It's up to you and I don't want to annoy you, but there still might be 
some unclear areas:

http://www.w3.org/TR/xquery-operators/#regex-syntax
"Reluctant quantifiers are supported. Specifically:

     * X?? matches X, once or not at all
     * X*? matches X, zero or more times
     * X+? matches X, one or more times"

So the example
  fn:tokenize("abba", ".?") returns ("a", "b", "b", "a")
should be changed to
  fn:tokenize("abba", ".??") returns ("a", "b", "b", "a")
?

What does tokenize return with ".?" then? The empty sequence or a 
sequence of zero-length strings? And with "."?

I'd still add
  fn:tokenize("abba", "") returns ("a", "b", "b", "a")
since it's most clear. I can't see why someone would write something 
other than "" or ".{0}" to match the empty string.

Tobi

-- 
http://www.pinkjuice.com/
Received on Tuesday, 19 August 2003 04:15:35 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 15:45:13 UTC