Re: Variadic functions from Reece Dunn on 2022-09-27 (public-xslt-40@w3.org from September 2022)

From: Reece Dunn <msclrhd@googlemail.com>
Date: Tue, 27 Sep 2022 09:13:32 +0100
To: Michael Kay <mike@saxonica.com>
Cc: public-xslt-40@w3.org
Message-ID: <CAGdtn25uB=3pOmyGO+Jdt=ydxQ6LGYmkTLfHUOFjU9ViKxggqA@mail.gmail.com>
On Tue, 27 Sept 2022 at 07:51, Michael Kay <mike@saxonica.com> wrote:

> I've been trying to think of an alternative way of formulating the rules
> for variadic functions in a way that is less complex and confusing, but
> retains all the functionality. Here's my attempt:
>
> <proposal>
>
> A `function specification` (for example, a function declaration in XQuery,
> an xsl:function declaration in XSLT, or a specification entry in the F&O
> spec) declares a `function family`. A function family is a set of functions
> that share the same name, and that has a range of permitted arities:
> specifically, a minimum arity which is a finite non-negative integer, and a
> maximum arity which is an integer in the range 0 to infinity. The static
> context contains a set of function families; if two function families in
> the static context have the same name, then their arity ranges must not
> overlap.
>
> A function specification contains a list of zero or more parameter
> declarations. Each parameter declaration has a property called its
> plurality, which takes one of the values `single`, `multiple`, `optional`,
> or `mapped`. The list of parameter declarations in a function specification
> must comprise zero or more single parameters, followed by zero or one
> multiple parameters, followed by zero or more optional parameters, followed
> by zero or one mapped parameters.
>
> The minimum arity of a function specification is the number of parameter
> declarations with plurality = single.
>
> The maximum arity of a function specification is infinity if there is a
> parameter declaration whose plurality is `multiple` or `mapped`; in other
> cases the maximum arity is the number of parameter declarations.
>
> A static function call is bound to a function family in the static context
> by considering the function name and the number of supplied arguments. The
> function call supplies a number of positional arguments followed by a
> number of keyword arguments. The keywords are NCNames and must be distinct.
> The function name must match the name of a function family in the static
> context, and the total number of supplied arguments in the static function
> call (counting both positional arguments and keyword arguments) must fall
> within the arity range of one of those function families.
>
> The arguments supplied in a static function call are matched to the
> declared parameters of the function family as follows:
>
> * Starting from the first positional argument, arguments are matched in
> order to the parameter declarations with plurality=single.
>
> * If there is a parameter declaration with plurality=multiple, then any
> remaining positional arguments are aggregated as a sequence and matched to
> that parameter declaration.
>
> * If there is no parameter declaration with plurality=multiple, then any
> remaining positional arguments are matched individually, in order, to the
> remaining parameter declarations, regardless of their plurality.
>
> * Starting from the first keyword argument, arguments are matched by name
> to the first parameter declaration (of any plurality) that has not already
> been matched. In this matching process, the supplied keyword (an NCName) is
> considered to represent a no-namespace QName value whose local part is the
> supplied keyword.
>
> * If there are keyword arguments that do not match the name of any
> parameter declaration, and if there is a parameter declaration with
> plurality=mapped, and if the parameter declaration with plurality=mapped
> has not been matched by one of the foregoing rules, then these keyword
> arguments are aggregated to form a map, and the resulting map is matched to
> the mapped parameter declaration. In this process the keywords are treated
> as xs:NCName values (not as no-namespace QNames).
>
> * It is an error if there is a parameter declaration with plurality=single
> that remains unmatched [?is this possible?]. If a parameter with
> plurality=optional is unmatched, then it takes its default value from the
> parameter declaration. If a parameter with plurality=multiple is unmatched,
> its value is an empty sequence. If a parameter with plurality=mapped is
> unmatched, its value is an empty map.
>
> A function reference (`name#arity`) is matched to a function family in the
> static context whose name matches the supplied name and whose arity range
> includes the supplied arity.
>
> In dynamic function calls, all arguments are supplied positionally. The
> supplied arguments are matched in turn to the declared parameters of the
> function, irrespective of their plurality.
>
> </proposal>
>

Thanks for this. I've had an initial read through it. It looks
interesting, but I need to think it through to get a better handle on how
it will work.
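
As a reading aid, the ordering and arity rules in the proposal can be
modelled roughly as follows. This is only a sketch in Python, not part of
the proposal: the names ORDER, check_order and arity_range are made up for
illustration, and a parameter list is reduced to a list of plurality
strings.

```python
import math

# Position of each plurality in the required ordering:
# single*, multiple?, optional*, mapped?
ORDER = {"single": 0, "multiple": 1, "optional": 2, "mapped": 3}


def check_order(pluralities):
    """Return True if the list follows the ordering rule of the proposal."""
    ranks = [ORDER[p] for p in pluralities]
    if ranks != sorted(ranks):
        return False
    # At most one multiple parameter and at most one mapped parameter.
    return pluralities.count("multiple") <= 1 and pluralities.count("mapped") <= 1


def arity_range(pluralities):
    """Return (minimum arity, maximum arity) for a parameter list."""
    if not check_order(pluralities):
        raise ValueError("parameter declarations are in an invalid order")
    # Minimum arity: the number of plurality=single declarations.
    minimum = pluralities.count("single")
    # Maximum arity: infinity if a multiple or mapped parameter is present,
    # otherwise the number of parameter declarations.
    if "multiple" in pluralities or "mapped" in pluralities:
        maximum = math.inf
    else:
        maximum = len(pluralities)
    return minimum, maximum


# Two single parameters followed by one optional parameter.
print(arity_range(["single", "single", "optional"]))
```

Under this reading, a list of two single parameters and one optional
parameter accepts calls with two or three arguments.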

The main thing that jumps out at me is how these new rules work when there
is a mix of parameters. For example, MarkLogic has mixed functions that
look like:

declare function cts:correlation(
    $value1 as cts:reference,
    $value2 as cts:reference,
    $options as xs:string* := (),
    $query as cts:query? := (),
    $forest-ids as xs:unsignedLong* := ()
) as xs:double? external;

IIUC, I don't think this will work with the proposed rules -- specifically:
> * If there is a parameter declaration with plurality=multiple, then any
> remaining positional arguments are aggregated as a sequence and matched to
> that parameter declaration.
gives no way of specifying the parameters for $query and $forest-ids in the
above function example.
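
To make the concern concrete, here is a rough Python model of the three
positional matching rules quoted from the proposal. It is a sketch under
assumptions: match_positional is a hypothetical helper, keyword arguments
are not modelled at all, and treating $options as plurality=multiple is my
assumption (the proposal does not say how `xs:string* := ()` would be
classified).

```python
def match_positional(params, args):
    """Bind positional arguments to parameters.

    params: list of (name, plurality) pairs in declaration order.
    args:   list of positional argument values.
    Returns a dict mapping parameter name to the bound value.
    Arity checking is assumed to have happened already.
    """
    bound = {}
    remaining = list(args)

    # Rule 1: single parameters take positional arguments in order.
    for name, plurality in params:
        if plurality == "single" and remaining:
            bound[name] = remaining.pop(0)

    multiple = [name for name, plurality in params if plurality == "multiple"]
    if multiple:
        # Rule 2: the multiple parameter absorbs every remaining argument.
        bound[multiple[0]] = tuple(remaining)
    else:
        # Rule 3: remaining arguments go one by one to the remaining
        # parameters, regardless of their plurality.
        rest = [name for name, plurality in params if plurality != "single"]
        for name, value in zip(rest, remaining):
            bound[name] = value
    return bound


# cts:correlation with $options assumed to be plurality=multiple.
params = [
    ("value1", "single"),
    ("value2", "single"),
    ("options", "multiple"),
    ("query", "optional"),
    ("forest-ids", "optional"),
]
print(match_positional(params, ["ref1", "ref2", "opt", "a-query", "ids"]))
```

In this model the last three values all end up in "options", and "query"
and "forest-ids" are never bound positionally. If $options is classified as
plurality=optional instead, rule 3 applies and each value reaches its own
parameter.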

> <commentary>
>
> First, the proposal is a little bit less rigorous than I would like, but I
> think it can be tightened up to remove any gaps and ambiguities.
>
> The main change from the current draft is that instead of categorizing
> functions (e.g. as sequence-variadic or map-variadic), we classify
> parameter declarations by their "plurality" (if anyone can think of a
> better name, you're welcome). This increases flexibility, for example we
> can now declare fn:total() (replacing fn:sum) to take a plurality=multiple
> input sequence followed by an optional argument with keyword "zero", and we
> can write total(1, 5, 6, zero="0"). (We can't do this with fn:sum, because
> the existing call sum(3, 0) wouldn't work as intended).
>
> I have omitted from this proposal the examples and explanations that are
> in the current text, but I think all of the examples remain valid, with
> minor changes to the terminology of the explanations.
>
> I haven't adopted Reece's idea to allow the keywords in a static function
> call to be something other than an NCName. Allowing a string literal would
> not be a problem, but I'm uncomfortable about QNames. The problem here is
> that unprefixed names mean one thing if matching an optional parameter, and
> something else if matching a mapped parameter. But the problem isn't
> insuperable.
>

The rationale for allowing QNames (EQNames in the grammar) is that the
Param in a function declaration supports QNames. If the keyword in a
function call is limited to NCNames, then it cannot be used with function
parameters whose names are in a namespace. This leads to an asymmetry in
the behaviour, and to an error for a call that a user would expect to
work. (Note: In my proposal in issue #54 I've made specifying a QName
value for a map key an error.)

The rationale for allowing StringLiteral values is that map keys can
contain spaces but the identifier (NCName) construction does not support
spaces. Therefore, you cannot currently supply a value for these using the
proposed named parameter functionality.

The approach I've taken in issue #54 that defines this for NCNames
(unprefixed QNames) is that for parameter names it matches the expanded
QName (following the normal rules for QName matching), and for mapped keys
it uses the local name of the QName.
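
A rough Python model of that split, for unprefixed keywords only. This is
a sketch, not the text of issue #54: bind_keywords is a hypothetical
helper, QNames are modelled as (namespace URI, local name) tuples, and an
unprefixed keyword is assumed to expand to a no-namespace QName.

```python
def bind_keywords(param_names, has_mapped, keywords):
    """Split keyword arguments into named parameters and mapped entries.

    param_names: set of (namespace_uri, local_name) tuples for the
                 declared parameters; "" means no namespace.
    has_mapped:  True if the function has a plurality=mapped parameter.
    keywords:    dict of unprefixed keyword (NCName) -> value.
    Returns (bound, mapped): parameters keyed by expanded QName, and map
    entries keyed by local name.
    """
    bound = {}
    mapped = {}
    for keyword, value in keywords.items():
        # Assumption: an unprefixed keyword is a no-namespace QName.
        qname = ("", keyword)
        if qname in param_names:
            # Parameter names are compared as expanded QNames.
            bound[qname] = value
        elif has_mapped:
            # Map keys use only the local name.
            mapped[keyword] = value
        else:
            raise ValueError(f"no parameter named {keyword!r}")
    return bound, mapped


# One declared parameter "zero" in no namespace, plus a mapped parameter.
print(bind_keywords({("", "zero")}, True, {"zero": 0, "mode": "fast"}))
```

Here "zero" binds to the declared parameter and "mode" falls through to
the map. A parameter declared in a namespace is not matched by the
unprefixed keyword in this model, which is the asymmetry described above.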

Note: I cannot find the rules for expanding NCNames of function declaration
parameters, but my understanding is that an NCName for parameters is in no
namespace. As such, the keyword name can use the same logic and thus they
will always be the same. Therefore, there should not be an issue relating
to QName expansion.

Kind regards,
Reece


> </commentary>
>
> Michael Kay
>
Received on Tuesday, 27 September 2022 08:13:57 UTC