[Bug 8662] [XQuery 1.1] Need an easy way to define a recursive inline function

http://www.w3.org/Bugs/Public/show_bug.cgi?id=8662


Pavel Minaev <int19h@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |int19h@gmail.com




--- Comment #2 from Pavel Minaev <int19h@gmail.com>  2010-01-06 20:11:01 ---
(In reply to comment #1)
> There are two related possibilities here:
> 
> a. Allow definition of named, scoped, local functions, ie:
> 
> declare function outer($a)
> {
>   declare function f1($b)
>   {
>     $a, f2($b)
>   };
> 
>   declare function f2($b)
>   {
>     $b, f1($b)
>   };
> 
>   f1#1;
> };

What if I want a local function declaration at a deeper expression nesting
level? The above syntax doesn't seem to allow for it, and "local function ...
return ..." is effectively the above syntax adapted to allow arbitrary nesting.

> b. Many functional languages have a recursive "let" whose variables are all in
> scope for any of the bound expressions:
> 
> letrec $f1 := function($b) { $a, $f2($b) },
>        $f2 := function($b) { $b, $f1($b) }
> return $f1
> 
> I think I'm personally in favour of a solution like this.

This has the problem of also seemingly allowing (at least syntactically):

   letrec $x := $y, $y := $x

On the first glance this problem is already solved for globals, but that
solution, if applied to locals, would prohibit the mutually recursive function
declarations above. Furthermore, no simple rule can be devised, such as
"recursive reference is legal within function body only", as that can still be
abused:

   letrec $x := function(){$y}(), $y := function(){$x}()

The only case that can be easily statically guaranteed to be safe is the one
where variables are directly bound to inline functions, and all recursive
references are within bodies of said functions. At which point we get precisely
the same semantics as "local function ... return ..." with a different syntax
(which arguably seems to be more general than it actually is).

A more complicated version of the above is R5RS approach, where "letrec"
requires that "it must be possible to evaluate each <init> without assigning or
referring to the value of any <variable>". This, of course, also requires a
strict definition of what it takes to "refer" to the value of variable (as
opposed to merely referring to the variable itself), which is possible to do in
Scheme because it explicitly defines variables as identifiers bound to
locations holding values, not as identifiers bound directly to values. Coming
up with a similar rigorous definition for XQuery may be tricky.

An alternative is to require implementation to detect any infinite recursion in
initializers dynamically if it cannot prove it statically - the approach taken
by ML with "let rec". That would be an additional burden on implementation with
an unavoidable runtime penalty, however, and will be inconsistent with rules
for global variable bindings.


> 2) Reference to self
>    -----------------
> 
> I dislike the idea of a special named function which is only in scope
> sometimes, and has a different type signature depending on where it is in
> scope. If we went this route, I would prefer a special variable like $this or
> $self of type function(*), which contains a reference to the current function.

It could also be a keyword that just looks like a function call. It would be
desirable to have it properly typed, however, if only to require a static error
if it is called with incorrect arity.

Also, any such "$this" would effectively be part of the static context, and, so
far as I know, the existing practice is for accessors to static context parts
to be functions, rather than implicitly-defined variables (in fact, are there
any of the latter in the current XQuery 1.1?).

On the other hand, if type signature is made to be strictly corresponding to
the function, then perhaps a keyword would best convey the idea that this isn't
quite a "normal" function/variable, but has some magic associated with it.

> Another (possibly tangential but related) issue is the rules about not allowing
> a global variable that references itself. It seems odd to allow this:
> 
> declare function f() { 1, f() };
> 
> but not this:
> 
> declare variable $f := 1, $f;
> 
> Both the function f() and the variable $f evaluate perfectly well on an
> implementation using lazy evaluation.

But doesn't XQuery spec explicitly permit non-lazily-evaluating implementations
by design? If so, the above construct, if allowed, would effectively have
implementation-defined behavior with an extremely pronounced difference, with
that not being obvious.

Also, even on a lazy implementation that would allow it, it would result in an
infinite-length sequence. Such sequences don't seem to be explicitly prohibited
by XDM (one could argue that the definition of "a sequence of zero or more
items" includes "infinite"), but their non-existence is implied elsewhere. E.g.
the definition of "context size" in XQuery (and XPath) defines it as an
"integer greater than zero" - seemingly excluding "infinity" from the range of
valid values. As well, the definition of count() - nor any other standard
function - does not specify the effect should it be called on an infinite
sequence. And any reasonable and useful definition (i.e. anything more specific
than "entirely implementation-defined") would require lazy evaluation.


-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.

Received on Wednesday, 6 January 2010 20:11:02 UTC