- From: Erik Bruchez <erik@bruchez.org>
- Date: Tue, 14 Oct 2003 12:46:39 -0700
- To: "Kay, Michael" <Michael.Kay@softwareag.com>, public-qt-comments@w3.org
> Obviously there are hundreds (thousands!) of functions that we could > have included but have not. We have consciously tried to keep the > library as small as possible, and in fact some people complain that > it is already much too large compared with XPath 1.0. I guess the approach I would have chosen would have been to look at existing libraries, for example the Java String and Pattern/Matcher classes, and try to match the functionality available there (which is there most of the time for a reason). > With string handling functions, we have been guided by experience > with XPath 1.0. There are a few things that XPath 1.0 users find > difficult to achieve, and we have plugged most of these gaps. In my > experience most of the use cases for indexOf() in Java are so that > you can then use substring() to do the equivalent of XPath's > substring-before() and substring-after(). While substring-before() and substring-after() are useful and mirror each other, they both start searching at the beginning of the string. Java's lastIndexOf() starts searching at the end of the string. I am not particularly looking for functions returning the position of the searched substring, and I would be happy to have a substring-before() and substring-after() that can also start from the end of the string. In my experience cases where you need to search from the end are almost as frequent as cases where you need to search from the beginning. >>2) A regexp function that simply returns a sequence of matching >>groups. I know that XSLT 2.0 provides xsl:analyze-string which >>provides advanced matching functionality, but given the presence of >>fn:matches, fn:replace and fn:tokenize in XPath, there is room for >>such a function in XPath itself. > > Can you provide a use case? Sure. Let's say my regexp is: "/some-path/([0-9]*)/xyz/([0-9]*)" I want to extract the first group and the second group. Java's Matcher class gives you access to the matched groups by index, e.g. if my input string is "/some-path/1234/xyz/12", the first group will return "1234" and the second group "12". In Java you access them with matcher.group(0) and matcher.group(1). The function would looke like this: fn:matching-groups($input as xs:string?, $pattern as xs:string) as xs:string* >>3) There is fn:escape-uri, but no unescape-uri. Java for example >>provides an URLDecoder and an URLEncoder. > > This has been requested before, but I'm not aware of any context > where unescaped URIs (which are not URIs at all, of course) are > needed. Do you know of such a context? The idea is to be able to decode an already encoded ("escaped" in the current terminology) URI. If you receive a query string from a Web browser, you may need to decode its parameters. Typically this task is done by your Web server, but your application may happen to have to handle non-decoded URI parameters. In addition to this, there is a reason of symmetry: if you provide a function to encode, you may as well provide a function to decode. -Erik
Received on Tuesday, 14 October 2003 15:46:49 UTC