Re: Re-Proposal for variadic functions - the good the bad and the unacceptable from Reece Dunn on 2020-12-10 (public-xslt-40@w3.org from December 2020)

From: Reece Dunn <msclrhd@googlemail.com>
Date: Thu, 10 Dec 2020 09:27:52 +0000
To: Dimitre Novatchev <dnovatchev@gmail.com>
Cc: public-xslt-40@w3.org, Michael Kay <mike@saxonica.com>, "Liam R. E. Quin" <liam@fromoldbooks.org>
Message-ID: <CAGdtn24d3=spH+c6Z=zRsPJagG=dVF4YTJ48korK349gqSnW_g@mail.gmail.com>
On Thu, 10 Dec 2020 at 02:24, Dimitre Novatchev <dnovatchev@gmail.com>
wrote:

> Here is a summary feedback for the recent submission by Dr. Kay that, in
> his words, is “an attempt to consolidate a proposal for variadic
> functions that combines all the ideas and requirements that have been
> expressed”.
>
>
> *The not so good*
>
> The obvious things that are not good in the proposal:
>
> ·         The arity of a variadic function is undefined and it cannot
> have overloads – only a single function with that name can exist. There is
> no obvious, specific reason for such design decision, the practice of other
> programming languages shows that functions allowing calls with keyword
> arguments, can have additional overloads. See for example Jon Skeet’s “C#
> in Depth” / Overloading (section: Optional parameters (
> https://csharpindepth.com/articles/Overloading)). Here one can see at
> least two ways of disambiguating seemingly ambiguous method-calls –
> disambiguation by explicitly specifying the optional parameters, and
> disambiguation by argument names.
>

I think it would make sense to make it such that as long as the minimum and
maximum bounds of a function do not overlap (where the bounds are equal for
non-variadic functions), then they can be overloaded, otherwise an XQST0034
error is raised (like currently when trying to define a function with the
same arity). It gets tricky for map-variadic functions, but I think that can
be solved by treating them like array-variadic functions -- that is, the
maximum bound is infinite/unbounded (within architectural limits).
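
As a rough illustration of that rule, here is a minimal sketch in Python
(not XQuery, since the proposed syntax is not settled). ArityRange, overlaps
and declare are names invented for this example, and ValueError merely stands
in for the XQST0034 static error:

```python
import math
from typing import NamedTuple


class ArityRange(NamedTuple):
    """Inclusive arity bounds; max is math.inf for array-/map-variadic functions."""
    min: int
    max: float


def overlaps(a: ArityRange, b: ArityRange) -> bool:
    # Two inclusive ranges overlap when each starts no later than the other ends.
    return a.min <= b.max and b.min <= a.max


def declare(registry: dict, name: str, new: ArityRange) -> None:
    """Register an overload, rejecting it if its bounds overlap an existing one."""
    for existing in registry.get(name, []):
        if overlaps(existing, new):
            # Stand-in for the XQST0034 static error.
            raise ValueError(f"XQST0034: {name} {new} overlaps {existing}")
    registry.setdefault(name, []).append(new)


registry = {}
declare(registry, "f", ArityRange(1, 1))         # non-variadic: min == max
declare(registry, "f", ArityRange(2, 4))         # bounded-variadic
declare(registry, "f", ArityRange(5, math.inf))  # array- or map-variadic
```

With these three declarations in place, declaring another f with bounds (3, 3)
would be rejected, because it falls inside the existing (2, 4) range.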


> ·         The function reference of a variadic function. Obviously
> someFunName#*  tells us nothing about the possible number of arguments,
> and the value of the arity is passed to other functions as nothing (the
> empty sequence ()), even though in all three types of proposed
> variadic functions, the exact number of positional arguments is known, and
> in the case of a “bounded variadic function” the maximum number of
> optional arguments in a function-call is also exactly known
>

If you want to bind to a specific arity you would use name#arity as you
currently do, and that would bind to the variadic function at that specific
arity.

The question is whether it makes sense to bind to the actual function
definition. I think it does for an array-variadic or map-variadic function
-- where you may want access to a function that takes an array or map as
its last argument, but not for the bound-variadic function unless you are
referencing the version of the bound-variadic function where all parameters
are required. Using name#arity with arity set to the number of parameters
will work for the bound-variadic function, but for the array- and
map-variadic functions the last parameter would be a different type.

So how do you then reference the non-variadic versions of the array- or
map-bound functions? Using #* in this context would make sense provided
there can only be a single function of the given QName when defining
variadic functions. The question then is whether the resulting reference is
itself variadic (as #* implies) or non-variadic.

---

An alternative would be that, when a name#arity reference is used with the
same arity as the function declaration:
a) for bound-variadic functions it is no longer variadic (as all arguments
are now required due to specifying the arity);
b) for array-variadic functions it binds to the version accepting an array
or an item type matching the array's item type (i.e. a single array value)
-- i.e. a union-of(array(T), T) parameter, which is needed as both types
are valid at the given arity;
c) for map-variadic functions it binds to the version accepting a map or
record type -- it does not make sense to have the last parameter support a
named argument as the function is no longer variadic.

In this case the resulting function reference would not be variadic, so
function-arity would not need to support an empty-sequence return type in
the case where the function is variadic. This would make function call
dispatch trickier, as you would need to know if the function is variadic or
not, instead of just checking the last parameter is an array or map and
then applying the variadic rules there. However, I suspect that this would
be similar to a situation where a user uses argument placeholders -- that
is, if a function is array- or map-variadic and has 2 parameters, then f#2
and f(?, ?) would be equivalent, and f(2, ?) would result in a similar
situation.
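
To make case (b) concrete, here is a small Python model of how I read it. It
is only a sketch under that reading: total and same_arity_ref are hypothetical
names, a Python list stands in for an XDM array, and the union type is checked
at call time rather than statically:

```python
from functools import partial
from typing import Callable, Union


def total(first: int, *rest: int) -> int:
    """Hypothetical array-variadic function: one required argument, then any more."""
    return first + sum(rest)


def same_arity_ref(fn: Callable) -> Callable:
    """Model f#2 for a two-parameter array-variadic f: the result is not variadic."""
    def ref(first: int, last: Union[list, int]) -> int:
        # union-of(array(T), T): a list supplies all trailing values,
        # a bare item supplies exactly one.
        trailing = last if isinstance(last, list) else [last]
        return fn(first, *trailing)
    return ref


total2 = same_arity_ref(total)   # plays the role of f#2, or f(?, ?)
from_two = partial(total2, 2)    # plays the role of f(2, ?)
```

Here total2(1, [2, 3]) and total2(1, 2) are both valid calls at arity 2, which
is why the union type is needed on the last parameter.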


> ·         “References to virtual fixed-arity functions that map directly
> to the variadic function”. The author had to do this in order to overcome
> the two bad things above – thus doing a 3rd bad thing…
>

I don't think this is a bad thing (as you can bind to a function at a given
fixed arity currently), I just think that it is loosely specified using
informal terminology (as per my attempt at providing a more formal
definition in terms of equivalence to creating an inline function
expression in the other thread). Not being able to do this would break
backwards compatibility, and would prevent a user being able to get a fixed
arity reference to a variadic function (so would not be able to do things
like passing concat#2 to a function).
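
The equivalence I have in mind can be sketched in Python as follows. This is
an illustration only: concat here is a stand-in written for the example, and
fixed_arity and fold_left are hypothetical helpers, not anything from the
proposal:

```python
from typing import Callable


def concat(*args: str) -> str:
    """Stand-in for a variadic concat."""
    return "".join(args)


def fixed_arity(fn: Callable, arity: int) -> Callable:
    """Model name#arity: a wrapper that accepts exactly `arity` arguments."""
    def ref(*args):
        if len(args) != arity:
            raise TypeError(f"expected {arity} arguments, got {len(args)}")
        return fn(*args)
    return ref


def fold_left(items, zero, f):
    """A higher-order function that expects a two-argument callback."""
    acc = zero
    for item in items:
        acc = f(acc, item)
    return acc


concat2 = fixed_arity(concat, 2)  # plays the role of concat#2
result = fold_left(["a", "b", "c"], "", concat2)
```

The wrapper behaves like an inline function with two parameters that forwards
to the variadic function, which is all a caller such as fold_left needs.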


> ·         In “map-variadic” functions there is no possibility for static
> type checking. This provides for unlimited space of runtime
> “confusion-errors” where the free-text English language description is
> understood differently by the author of the function and by its users.
> Natural language ambiguity is best avoided when doing serious function
> design work.
>
I think it makes sense to support record types for map-variadic functions.
That way, if you want the map-based parameters to be clearly defined, you
can through that mechanism (while opting in to allowing other parameters if
you make the record extensible/open ended). If you don't want that, you can
specify a map type.

A map type would be useful if you wanted to do things like defining JSON
object constructor functions (with the parameters being the object keys),
or XML/HTML constructor functions (with the additional parameters being the
attributes). If you wanted more verification (e.g. in a JSON-LD library),
you could use record types.
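
The difference between the two can be sketched in Python, with keyword
arguments standing in for the map-variadic parameters. The names element,
node and NODE_FIELDS are invented for the example, and the record check here
happens at call time, whereas a record type would allow it statically:

```python
def element(name: str, **attributes: str) -> str:
    """Open 'map type': every extra keyword becomes an attribute.

    Attribute value escaping is omitted to keep the sketch short.
    """
    attrs = "".join(f' {key}="{value}"' for key, value in attributes.items())
    return f"<{name}{attrs}/>"


# Hypothetical closed record: only these fields, with these types, are allowed.
NODE_FIELDS = {"id": str, "type": str}


def node(**fields) -> dict:
    """Closed 'record type': unknown keys or wrongly typed values are rejected."""
    for key, value in fields.items():
        if key not in NODE_FIELDS:
            raise TypeError(f"unexpected field {key!r}")
        if not isinstance(value, NODE_FIELDS[key]):
            raise TypeError(f"field {key!r} must be {NODE_FIELDS[key].__name__}")
    return dict(fields)


link = element("a", href="x", title="t")  # any attribute name is accepted
thing = node(id="n1", type="Person")      # only declared fields are accepted
```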

> *The unacceptable*
>
>
>
> Some of the bad things are so bad that they are unacceptable:
>
> ·         It is unacceptable to define the arity of a function as nothing
> (the empty sequence () or *) when the reader clearly sees that the function
> has M positional parameters and N optional parameters. People who believe
> what they see will be confused and upset.
>

A better return type for function-arity on variadic functions would be a
record with a min value for the minimum bound and a max value for the
maximum bound. If the function is array- or map-variadic, the maximum bound
would be an empty sequence to denote it being unbounded/infinite.

If using my alternative proposal this problem would not exist as it would
not be possible to create a reference to the variadic version of a function.

It may be useful to have a new `fn:function-arities($function-name as
xs:QName) as union-of(record(min: xs:integer, max: xs:integer?),
xs:integer)*` function that returns all the arities of functions in the
static context. For variadic functions it will return a map with a min
value for the minimum value, and a max value if an upper bound is specified
(for bounded-variadic functions).
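
As an analogy only, the min/max record can be computed for Python callables
with inspect.signature. Arity and function_arity are names made up for this
sketch, None models the empty sequence, and unlike the signature above the
sketch returns a record even for fixed-arity functions (with min equal to
max):

```python
import inspect
from typing import Callable, NamedTuple, Optional


class Arity(NamedTuple):
    min: int
    max: Optional[int]  # None models the empty sequence: no upper bound


def function_arity(fn: Callable) -> Arity:
    """Return the inclusive bounds on the number of arguments fn accepts."""
    P = inspect.Parameter
    params = list(inspect.signature(fn).parameters.values())
    positional = [p for p in params
                  if p.kind in (P.POSITIONAL_ONLY, P.POSITIONAL_OR_KEYWORD)]
    required = sum(1 for p in positional if p.default is P.empty)
    unbounded = any(p.kind in (P.VAR_POSITIONAL, P.VAR_KEYWORD) for p in params)
    return Arity(required, None if unbounded else len(positional))


def fixed(a, b): ...
def bounded(a, b=0, c=0): ...         # like a bounded-variadic function
def array_variadic(a, *rest): ...     # like an array-variadic function
def map_variadic(a, **options): ...   # like a map-variadic function
```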


> ·         It is unacceptable to forbid a variadic function to have other
> overloads, contrary to the practice in other programming languages. If this
> can be done in C#, why shouldn’t it be done here?
>

If using Michael Kay's proposal of using #* to bind to the underlying
function (to access the array/map last parameter), overloading is not
possible. Using my alternative proposal, it would be possible. I've
elaborated further above.


> ·         It is unacceptable to introduce vague, confusing terms such as “virtual
> fixed-arity functions“ just in attempt to fix the holes left by the
> document in its current form.
>

I think this is because the language/wording hasn't been formalised, and
that was a short-hand for a more formal definition that would come later
(like the one I wrote in the other thread). The main thing is assessing the
overall approach.


> ·         It is unacceptable to prevent any static type-checking in the
> case of “map-variadic” functions. An obvious improvement would be
> “record-variadic functions”.
>

I agree that records should be allowed in map-variadic functions. That
wouldn't be a record-variadic function though, as records are supposed to
be usable wherever maps are.


> ·         It is unacceptable not to provide any disambiguation mechanisms
> when there are overloads that may be ambiguous if allowed. Based on
> existing programming languages, there are at least two obvious ways of
> disambiguation: by arity and by argument name.
>
I think it makes sense to keep the (arity, function name) mechanism as the
way to disambiguate functions. In the context of variadic functions, the
arity here is now a range defined by the bounds of the function.

Kind regards,
Reece


> Thanks,
>
> Dimitre
>
>
> Attachment: this text as a pdf file, in case it is not well readable on
> the w3 web server
>
>
>
Received on Thursday, 10 December 2020 09:28:19 UTC