RE: Higher-Order Functions in XPath 2.0

Hi Mike,

> Thanks for submitting this, Dimitre. I'm personally very keen on
> getting higher-order functions into the spec, because in the long run
> I think it would save us work, but I can't say I'm confident about it

> getting in at this round - mainly because someone is going to have to

> work out the ramifications on the type system.
> 

I'd be happy if I can help in any way.

> Syntactically, I think it can be done quite neatly.
> 
> To declare that an argument or result of a function is itself a
> function, we need to extend the syntax for types to describe a
> function. The syntax might be something like
> 
> function ( xs:string, xs:string* ) returning xs:integer

This seems to be the function definition in XSLT. What about other 
clients of XPath? Maybe it would be good to have function definitions
in XPath?

> 
> And one can then declare a higher-order function as
> 
> function (function(xsl:integer)returning xs:integer) returning
integer 


This is nice. Maybe it would be good to try to find less
verbose syntax.

A function's type is fully defined by its signature, therefore we could
write:

[function] f xs:string -> xs:string* -> xs:integer

(where '[' and ']' are used to enclose optional keywords that may be
ommited.)

and

[function] g (xsl:integer -> xs:integer) -> xs:integer 


> We then need a way to refer to a function as an object, without
> invoking it: let's say we can refer to it as ?f, so a higher-order 
> function call might look like
> 
> distinct($nodes, ?my:func)


Is there any reason why we can't simply write:

distinct($nodes, my:func)

At this moment the XPath engine will have the type definition of
"distinct" and of "my:func", so it will know enough not to look for a
node named "my:func" and enough to raise an error, should my:func be
anything else than the function having the expected signature (type).


> For usability, we probably also need some way of defining inline
> function definitions (let's call them "anonymous functions", because
>"lambda expressions" will put people off). I actually suspect we can 
> get by with the only anonymous functions being functions that take
the
> context item as an implicit argument: this would account for 90% of 
> usage, and people could use named functions in the other cases. We 
> could write such functions as ?(expr). (An alternative I've 
> previously proposed is [expr]). This would then allow:
> 
> distinct(//employee, ?(@ssn))
> 


Once again, it should be possible to write something like:

 distinct(//employee, . -> @ssn) 

I find this more readable, or even:

 distinct(//employee, @ssn) 

When an expression is enetered in place of a function-argument that has
exactly one argument, which must be a node-set, then the
processor could cast it to the anonymous function . -> @ssn  

> (One issue that has to be resolved is the binding of any variable
> references within an anonymous function. No doubt the functional
> programming languages have plenty of answers to this problem).
> 

Yes, someting like:

distinct(//employee, (x, y -> x * y))  

or maybe you mean 'partial application'?  

> I think that defining some of the higher-order functionality we need
> (in areas such as distinct, group-by, sorting, regular expressions)
> using a common mechanism of higher-order function calls would be a 
> much sounder approach than defining custom syntax for each one, quite
> apart from the benefits obtained by allowing users to define their
own
> higher-order functions. It would also make it easier for us to define

> the formal semantics of those constructs where we do want to provide 
> custom syntax, such as "for", "some", and "every".
> 

Completely agree. I think this will help the WG complete their work on
the language in half that time, have half that documentation,
make the language many times smaller, simpler and elegant.    

> There's also a link into the requirement for dynamic evaluation: the
> evaluate() function can be naturally modelled as a function that
> takes a string as input and returns an anonymous function as its
> result.

This is very similar to partial application, but while it will be lazy,
I think the evaluate function will be strict (will be evaluated
immediately)

Actually, it would be important to have support for partial
application.

So one will be able to write:

f($num, (* $x))

for f defined as:

function f xs:integer -> (xs:integer -> xs:integer ) -> xs:integer

This will accept (* $x) as a function of one argument that computes its
product with a fixed (stored at run time) value of $x.

Cheers,
Dimitre Novatchev.


__________________________________________________
Do You Yahoo!?
Send FREE video emails in Yahoo! Mail!
http://promo.yahoo.com/videomail/

Received on Friday, 18 January 2002 16:04:40 UTC