Re: New XPath draft from Reece Dunn on 2020-12-14 (public-xslt-40@w3.org from December 2020)

From: Reece Dunn <msclrhd@googlemail.com>
Date: Mon, 14 Dec 2020 07:50:52 +0000
To: Dimitre Novatchev <dnovatchev@gmail.com>
Cc: Michael Kay <mike@saxonica.com>, public-xslt-40@w3.org
Message-ID: <CAGdtn27kKUEpuV0Q2GtsSbPQan4yiP9WaYU_VRbDQYfQjDzC5Q@mail.gmail.com>
On Sun, 13 Dec 2020 at 23:12, Dimitre Novatchev <dnovatchev@gmail.com>
wrote:

>
>
> On Sun, Dec 13, 2020 at 2:14 PM Reece Dunn <msclrhd@googlemail.com> wrote:
>
>> On Sun, 13 Dec 2020 at 21:07, Dimitre Novatchev <dnovatchev@gmail.com>
>> wrote:
>>
>>>
>>>
>>>>
>>>>
>>>>> *Quote*:
>>>>>
>>>>> "%variadic("sequence") indicates that the function is
>>>>> sequence-variadic. A sequence-variadic function declares one or more parameters,
>>>>> of which the last typically has an occurrence indicator of * or + to
>>>>> indicate that a sequence may be supplied."
>>>>>
>>>>>  *Questions*:
>>>>> 1. Why sequence-variadic and not array-variadic?
>>>>>
>>>>> If an array is used, then it can hold for example:
>>>>>     [1, (), 2, 3]
>>>>>
>>>>> and all 4 elements of the array will be accessible from the function
>>>>> call, not just 3 as in the case when a sequence is passed:
>>>>>     (1, (), 2, 3)
>>>>>
>>>>> Thus having array-variadic functions (function calls) is more precise
>>>>> and expressive, as shown in this example.
>>>>>
>>>>
>>>> In the "Variadic functions and dynamic function calls" thread, Michael
>>>> Kay explained that this is due to `[]` being ambiguous -- is it a single
>>>> argument to the variadic version and thus passed as `[ [] ]`, or is it the
>>>> value of the parameter itself? In that thread, I've proposed a
>>>> fn:function-arguments function as a way to get the array version of the
>>>> parameters if required:
>>>>
>>>>     fn:function-arguments($from as xs:integer := 1, $to as xs:integer?
>>>> := ()) as array(*)
>>>>     (: This function is ·deterministic·, ·context-dependent·, and
>>>> ·focus-independent·. :)
>>>>
>>>
>>> Actually, there isn't any ambiguity:
>>>
>>> [ [] ]     is an array with a single item in it, which is the empty
>>> array (arrays in this respect are different from sequences and do not
>>> remove/flatten their empty items),
>>> while [] is the empty array (containing 0 items).
>>>
>>
>> The ambiguity happens at the point at which the function is called. If
>> the last argument is an array, then passing a single array value could be
>> interpreted as passing that array to the function (`[]`) or applying the
>> array-variadic logic and constructing an array of that parameter and then
>> passing that (`[ [] ]`) to the array parameter. Sequences (as noted by
>> Michael Kay) don't have this issue, as `T` and `(T)` are equivalent.
>>
>
> What issue? If the last positional parameter/argument is of type array,
> then the last two of the list of arguments in the "effective call" will
> both be arrays and there is no ambiguity in their interpretation: the last
> but one argument (array) is the last positional argument, and the last
> argument (array) contains the variadic arguments.
>

If you have the following code:

    declare (: %variadic("array") :) function f($test as array(*)) { $test };
    f(), f(1), f([1]), f(1, 2), f(1, [2])

what is the output?

The first call, `f()`, uses the variadic rules to pass an empty array to
the function, so it evaluates to `[]`. Likewise, the fourth and fifth calls
(`f(1, 2)` and `f(1, [2])`) use the variadic rules, as there is more than
one argument, and evaluate to `[1, 2]` and `[1, [2]]` respectively.

The tricky case is the single argument, as demonstrated in the second and
third calls. There are two possibilities: either a) they use the variadic
rules, in which case they result in `[1]` for the second call and `[ [1] ]`
for the third, or b) they use the function matching rules, where `1` is not
an array but `[1]` is, so both result in `[1]`. The ambiguity is now clear:
when passing a single argument which is an array, should the variadic rules
apply or not? If they don't, how does a user pass an array to the function?
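
To make the two readings concrete, here is a rough model in Python rather
than XQuery, since the array-variadic rules are only a proposal. Python
lists stand in for arrays, and `bind_variadic` and `bind_by_matching` are
hypothetical names of mine for readings (a) and (b); neither comes from the
draft.

```python
def bind_variadic(*args):
    """Reading (a): always collect the supplied arguments into a new array."""
    return list(args)


def bind_by_matching(*args):
    """Reading (b): a single argument that is already an array is bound to
    the parameter as-is; anything else falls back to the variadic rules."""
    if len(args) == 1 and isinstance(args[0], list):
        return args[0]
    return list(args)


# The five calls from the example: f(), f(1), f([1]), f(1, 2), f(1, [2])
calls = [(), (1,), ([1],), (1, 2), (1, [2])]

for call in calls:
    print(call, bind_variadic(*call), bind_by_matching(*call))
```

Only the third call, `f([1])`, differs between the two: reading (a) gives
`[[1]]` and reading (b) gives `[1]`.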


>
>
>>
>> Note: There is an equivalent potential issue with map-variadic functions.
>> Given a function `f($options as map(*))`, how will the function call
>> `f($options := map {})` be interpreted? -- Is that passing an empty map to
>> the map parameter (via the rule for naming parameters), or constructing a
>> new `map { "options": map {} }` object (using the map-variadic rules)?
>>
>>
>>>
>>>>
>>>> 2. If the only reason is that fn:concat()  cannot be expressed/called
>>>>> as an array-variadic function, can we have both: array-variadic and
>>>>> sequence-variadic (the latter just for fn:concat() ) defined?
>>>>>
>>>>
>>>> It can be expressed as either, however the value flattening for
>>>> fn:concat is more intuitive when defined as a sequence-variadic function.
>>>> See the explanation above.
>>>>
>>>
>>> As per the conclusion above, that actually there is no ambiguity, a
>>> call to fn:concat() cannot be expressed as an array-variadic call. I
>>> suspect that this is the only standard function with this problem.
>>> Therefore the raised question remains valid and needs an answer.
>>>
>>
>> If an fn:concat function is array-variadic (provided array-variadic
>> functions are supported), it can be implemented in terms of the `for member
>> ...` expression, or another way of enumerating over the parameters. It is
>> more complicated to get the same sequence-style flattening logic using
>> arrays, but not impossible. That is, you could use something like:
>>
>>     declare (: %variadic("array") :) function fn:concat($args as
>> array(xs:anyAtomicType?)) as xs:string {
>>         string-join(for member $arg in $args return string-join($arg))
>>     };
>>
>> and that would have the same semantics as a version that is
>> sequence-variadic, which could be implemented as:
>>
>>     declare (: %variadic("sequence") :) function fn:concat($args as
>> xs:anyAtomicType*) as xs:string {
>>         string-join($args)
>>     };
>>
>>
> Great, then this means there is no reason why map-variadic function calls
> should exist. All of these can be record-variadic, thus more amenable to
> static type-checking.
>

I'm sorry, I don't follow your line of reasoning here. How does equivalence
between array- and sequence-variadic functions lead to map-variadic
functions not being needed? -- The map-variadic functions are for named
arguments beyond those defined by parameter names, like the options in
fn:serialize, whereas array/sequence-variadic functions are for positional
arguments beyond the number of function parameters.

Arrays and maps are not equivalent in XPath/XQuery the way they are in
languages such as PHP, where array indices are keys in a map. Note: I
wouldn't want XPath/XQuery to adopt that equivalence, as it would prevent
arrays from being mapped onto an actual array/vector type in an
implementation and would rule out many possible optimizations.

>
>
>>
>>>
>>>>
>>>>
>>>>> 2. Is there any example of an existing standard function that cannot
>>>>> be expressed/called as a record-variadic function, but can be called as a
>>>>> map-variadic function?
>>>>>
>>>>
>>>> I think it would be useful to define the map-variadic standard
>>>> functions using RecordTests, as that would permit static-time error
>>>> checking.
>>>>
>>>
>>> Exactly, and this is why we should give it the precise name:
>>> "record-variadic"!
>>>
>>
>> But map/record-variadic functions are the same in how the argument names
>> are built into a map (just like we don't have a record {...} constructor
>> for record types).
>>
>
> But the major advantage of record-variadic functions is the stricter
> type-checking of the variadic arguments provided in the function call.
>

I agree that using a record offers stricter type-checking, and that is
useful in the common case of using map-variadic functions for options or
similar. However, that doesn't mean that we should require all
named-argument-based variadics to use the record type. (See my replies
below.)


>
>>
>>
>>
>>>
>>>> I don't think we should be restricting this to how these are used in
>>>> standard functions (otherwise you end up with a situation like with
>>>> annotations where users can add their own annotations, but the language
>>>> doesn't provide mechanisms to take advantage of them without vendor
>>>> extensions/functions). For example, someone could create a json-object
>>>> function that is map-variadic as it can take any parameters. -- Given
>>>> Michael Kay's comments about someone not being able to find an array
>>>> function, it may be useful to add an fn:array and fn:map function, in which
>>>> case that would be an example of a map-variadic function that cannot use a
>>>> RecordTest.
>>>>
>>>
>>> Sorry, you lost me here :(
>>>
>>
>> If we only define/allow record-variadic functions because no map-variadic
>> functions are defined for standard functions, that would prevent users from
>> defining/using map-variadic functions if needed -- that is what I meant by
>> "restricting this to how these are used in standard functions".
>>
>
> I asked for an example of even one function that can be declared as
> map-variadic but cannot be declared as record variadic. If we have such an
> example, then this would justify using map-variadic functions. If not
> (which is the case at present), then there is no justification to declare a
> function as map-variadic and not as a record-variadic.
>

I gave you examples below:

    (: Define a constructor function for maps, similar to the xs:*
       functions, and fn:QName. :)
    declare function fn:map($map as map(*)) { $map };

    (: Serialize the map as XML, e.g. making the keys elements. :)
    declare function map-to-xml($map as map(*)) as element() { (: ... :) };

    (: A new example for building an XML element with any attributes. :)
    declare function new-element($name as xs:string, $attributes as map(*)) {
        (: ... :)
    };

You cannot define these functions as record-variadic in a meaningful way.
For example, you can call them like:

    fn:map($a := 1), fn:map($b := 2)

How do you define a record type for this? Any key/value pairs are allowed
in this fn:map example, so you would need something like
`record(optional?, *)` -- i.e. the record has an optional key called
"optional" (but could be called anything, as the name/type of this key does
not matter here), and is extensible so any other key/value combination is
permitted. At that point, it is easier and more concise to just write
`map(*)` for the parameter's type, as it conveys the meaning more accurately.
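
As a rough illustration of why the extensible record adds nothing here, the
following Python sketch models only the key checks (value types are
ignored). `accepts_map_any` and `accepts_record` are hypothetical names for
this sketch, and the reading of `record(optional?, *)` as "one optional
field, any other keys allowed" is my assumption about how an extensible
record would behave.

```python
def accepts_map_any(named_args):
    """Model of map(*): any set of keys is acceptable."""
    return True


def accepts_record(named_args, required=(), optional=(), extensible=False):
    """Model of a record type that checks key names only (value types ignored)."""
    keys = set(named_args)
    if not set(required) <= keys:
        return False  # a required field is missing
    if extensible:
        return True   # extra keys are allowed
    return keys <= set(required) | set(optional)


# fn:map($a := 1) and fn:map($b := 2), with the named arguments modelled as dicts
for named_args in ({"a": 1}, {"b": 2}, {}):
    # record(optional?, *): one optional field, extensible
    open_record = accepts_record(named_args, optional=("optional",), extensible=True)
    # record(optional?): the same record without the `*`
    closed_record = accepts_record(named_args, optional=("optional",))
    print(named_args, accepts_map_any(named_args), open_record, closed_record)
```

In this model the extensible record accepts exactly what `map(*)` accepts,
while the closed record rejects any key other than `optional`.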

This was my point about not constraining the language just because the
standard functions don't need the feature. That is, even if all the
standard functions can be defined using record types, that does not mean
we should restrict this to record types.

---

Requiring variadic functions that allow named arguments to use record
types breaks the goal of making this usable with existing functions
(e.g. user-defined functions) that take a map-based options parameter.
This also applies to third-party functions from libraries/packages, where a
user has less control over updating them.

Kind regards,
Reece
Received on Monday, 14 December 2020 07:51:18 UTC