- From: Jim Melton <jim.melton@acm.org>
- Date: Mon, 18 Aug 2003 16:28:22 -0600
- To: Tobias Reif <tobiasreif@pinkjuice.com>
- CC: public-qt-comments@w3.org, Jeni Tennison <jeni@jenitennison.com>
Tobias, Tobias Reif wrote: > > Hi Jeni > > > The definition of the fn:tokenize() function says: > > > > "If the supplied $pattern matches a zero-length string, the > > fn:tokenize() function breaks the string into its component > > characters. The nth character in the $input string becomes the nth > > string in the result sequence; each string in the result sequence > > has a string length of one." > > Exactly. > > > In the example above, the pattern ".?" is a pattern that matches a > > zero-length string; > > But it matches more than a zero-length string AFAICS. It is not > explicitly specified what happens when the pattern matches more than > the zero-length string. IMHO returning an empty sequence is the only > consistent behaviour; anything else is hard to specify unambiguously. My reading of the sentence in question is that it is irrelevant whether or not the pattern matches anything other than a zero-length string. The only important aspect of the pattern in *this* *particular* rule is whether it matches the zero-length string. If it does, then that's the end of the story, and there's no reason to check whether it matches anything else...the fact that it matches the zero-length string clearly specifies the semantics: break the string into its component characters. If, and only if, the pattern does not match the zero-length string does it become meaningful and necessary to find out what it does match and perform accordingly. That is why the rule was not written to read "If the supplied $pattern matches a zero-length string and nothing else,...". While the chosen semantics might be different from your (apparent) favorite language, ruby, they are consistent with XML Schema (which uses similar wording) and they are the semantics chosen for this language. > As I described it is not specific enough, leaving room for different > interpretations and thus differing implementations. I disagree. We state early in the F&O specification that the rules are to be applied *in the order in which they are written*. If you do that, and read the rule in question properly (that is, without adding the incorrect "...and nothing else" in your mind), then the spec is unambiguous (in this respect, that is). Frankly, if we start adding "...and nothing else..." to rules like this, the spec will become even more difficult to read than it already is! Nonetheless, we appreciate comments and suggested improvements! Hope this helps, Jim
Received on Monday, 18 August 2003 18:25:19 UTC