Re: Missing functions in XQuery 1.0 and XPath 2.0 Functions and O perators from Erik Bruchez on 2003-10-14 (public-qt-comments@w3.org from October 2003)

From: Erik Bruchez <erik@bruchez.org>
Date: Tue, 14 Oct 2003 12:46:39 -0700
To: "Kay, Michael" <Michael.Kay@softwareag.com>, public-qt-comments@w3.org
Message-ID: <3F8C529F.2020201@bruchez.org>

 > Obviously there are hundreds (thousands!) of functions that we could
 > have included but have not. We have consciously tried to keep the
 > library as small as possible, and in fact some people complain that
 > it is already much too large compared with XPath 1.0.

I guess the approach I would have chosen would have been to look at
existing libraries, for example the Java String and Pattern/Matcher
classes, and try to match the functionality available there (which is
there most of the time for a reason).

 > With string handling functions, we have been guided by experience
 > with XPath 1.0. There are a few things that XPath 1.0 users find
 > difficult to achieve, and we have plugged most of these gaps. In my
 > experience most of the use cases for indexOf() in Java are so that
 > you can then use substring() to do the equivalent of XPath's
 > substring-before() and substring-after().

While substring-before() and substring-after() are useful and mirror
each other, they both start searching at the beginning of the
string. Java's lastIndexOf() starts searching at the end of the
string. I am not particularly looking for functions returning the
position of the searched substring, and I would be happy to have a
substring-before() and substring-after() that can also start from the
end of the string. In my experience cases where you need to search
from the end are almost as frequent as cases where you need to search
from the beginning.

 >>2) A regexp function that simply returns a sequence of matching
 >>groups. I know that XSLT 2.0 provides xsl:analyze-string which
 >>provides advanced matching functionality, but given the presence of
 >>fn:matches, fn:replace and fn:tokenize in XPath, there is room for
 >>such a function in XPath itself.
 >
 > Can you provide a use case?

Sure. Let's say my regexp is:

   "/some-path/([0-9]*)/xyz/([0-9]*)"

I want to extract the first group and the second group. Java's Matcher
class gives you access to the matched groups by index, e.g. if my
input string is "/some-path/1234/xyz/12", the first group will return
"1234" and the second group "12". In Java you access them with
matcher.group(0) and matcher.group(1).

The function would looke like this:

fn:matching-groups($input as xs:string?,
                    $pattern as xs:string) as xs:string*

 >>3) There is fn:escape-uri, but no unescape-uri. Java for example
 >>provides an URLDecoder and an URLEncoder.
 >
 > This has been requested before, but I'm not aware of any context
 > where unescaped URIs (which are not URIs at all, of course) are
 > needed. Do you know of such a context?

The idea is to be able to decode an already encoded ("escaped" in the
current terminology) URI. If you receive a query string from a Web
browser, you may need to decode its parameters. Typically this task is
done by your Web server, but your application may happen to have to
handle non-decoded URI parameters. In addition to this, there is a
reason of symmetry: if you provide a function to encode, you may as
well provide a function to decode.

-Erik

Received on Tuesday, 14 October 2003 15:46:49 UTC