- From: Priscilla Walmsley <priscilla@walmsley.com>
- Date: Mon, 18 Aug 2003 19:21:06 -0400
- To: "'Jim Melton'" <jim.melton@acm.org>, "'Tobias Reif'" <tobiasreif@pinkjuice.com>
- Cc: <public-qt-comments@w3.org>, "'Jeni Tennison'" <jeni@jenitennison.com>
Hi,

Just to pick nits, Jim Melton wrote:

> I disagree. We state early in the F&O specification that the rules are
> to be applied *in the order in which they are written*. If you do that,
> and read the rule in question properly (that is, without adding the
> incorrect "...and nothing else" in your mind), then the spec is
> unambiguous (in this respect, that is).

The rule about matching a zero-length string appears *after* the sentence:

"This function breaks the $input string into a sequence of strings, treating any substring that matches $pattern as a separator. The separators themselves are not returned."

Perhaps this sentence is not an official "rule", just a general description of the function. However, the sentence is false in the case we are talking about. The letters "a" and "b" _do_ match the pattern .* , and therefore should be treated as separators according to this particular sentence. Maybe the sentence should start with "If $pattern does not match a zero-length string, ..."

Or did I misunderstand what you mean by applying the rules in the order in which they are written?

Anyway, back to the real issue, I think this behavior is particularly confusing in the case of:

    fn:tokenize("abba", "b?")

which apparently would also return ("a", "b", "b", "a"), since the pattern matches a zero-length string. I think the user would expect "b" to be treated like a separator in this case. I know they could just use the pattern "b" if that's what they want, but it still seems like the function violates the principle of least surprise, particularly since the ? in this case should be greedy.

Thanks,
Priscilla
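
For readers following the thread, here is a minimal Python sketch of the behavior being discussed. It is not the F&O definition: the function name `tokenize_draft` is hypothetical, Python's `re` syntax stands in for the XML Schema regex dialect, and the zero-length rule is modeled only as this message describes it (a pattern that can match the zero-length string causes the input to be broken into its individual characters).

```python
import re

def tokenize_draft(input_str, pattern):
    """Hypothetical model of the draft fn:tokenize behavior discussed above.

    Assumption: if the pattern matches the zero-length string, the input
    is returned as its individual characters and nothing is treated as a
    separator. Otherwise every match of the pattern is a separator.
    """
    if re.fullmatch(pattern, "") is not None:
        # Zero-length-match case: the "separator" sentence no longer applies.
        return list(input_str)
    # Ordinary case: split on each match; separators are not returned.
    # (Patterns with capturing groups would behave differently in re.split.)
    return re.split(pattern, input_str)

print(tokenize_draft("abba", "b?"))  # ['a', 'b', 'b', 'a']
print(tokenize_draft("abba", "b"))   # ['a', '', 'a']
```

Under this reading, adding `?` to the pattern changes "b" from a separator into returned content, which is the surprise the message points out.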
Received on Monday, 18 August 2003 19:21:16 UTC