Re: New XPath draft from Dimitre Novatchev on 2020-12-14 (public-xslt-40@w3.org from December 2020)

From: Dimitre Novatchev <dnovatchev@gmail.com>
Date: Mon, 14 Dec 2020 06:48:19 -0800
To: Reece Dunn <msclrhd@googlemail.com>
Cc: Michael Kay <mike@saxonica.com>, public-xslt-40@w3.org
Message-ID: <CAK4KnZdSUMZqiqZnUvOXJyFMym3SZB8VYF1rCXJyHDNX7ENUkQ@mail.gmail.com>
On Sun, Dec 13, 2020 at 11:51 PM Reece Dunn <msclrhd@googlemail.com> wrote:




>>> The ambiguity happens at the point at which the function is called. If the
>>> last argument is an array, then passing a single array value could be
>>> interpreted as passing that array to the function (`[]`) or applying the
>>> array-variadic logic and constructing an array of that parameter and then
>>> passing that (`[ [] ]`) to the array parameter. With sequences (as noted by
>>> Michael Kay), they don't have this issue, as `T` and `(T)` are equivalent.
>>>
>>
>> What issue? If the last positional parameter/argument is of type array,
>> then the last two of the list of arguments in the "effective call" will
>> both be arrays and there is no ambiguity in their interpretation: the last
>> but one argument (array) is the last positional argument, and the last
>> argument (array) contains the variadic arguments.
>>
>
> If you have the following code:
>
>     declare function f($test as array(*)) { $test };
>     f(), f(1), f([1]), f(1, 2), f(1, [2])
>
> what is the output?
>
> For the first call `f()`, it will use the variadic rules to pass an empty
> array to the function, so will evaluate to `[]`. Likewise, for the fourth
> and fifth calls (`f(1, 2)`, and `f(1, [2])`), both of these will also use
> the variadic rules as there is more than one argument.
>
> The tricky case is with the single argument as demonstrated in the second
> and third call. In these cases, there are two possibilities: either a) they
> are using the variadic rules -- in which case they will result in `[1]` for
> the second call and `[ [1] ]` for the third -- or b) they are using the
> function matching rules, where `1` is not an array but `[1]` is, so both
> would result in `[1]`. The ambiguity is now clear: when passing a single
> argument which is an array, should the variadic rules apply or not? If they
> don't, how does a user pass an array to the function?
>
>
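
A minimal sketch of the two readings described above, written as a Python
model because the variadic call syntax is still a proposal. Python lists
stand in for XPath arrays, and the two binder functions are hypothetical
names introduced only for this illustration; neither is part of any draft.

```python
from typing import Any, List


def bind_always_variadic(*args: Any) -> List[Any]:
    """Reading (a): every call collects its arguments into a new array."""
    return list(args)


def bind_match_first(*args: Any) -> List[Any]:
    """Reading (b): a single argument that is already an array is passed as-is;
    every other call falls back to the variadic rules."""
    if len(args) == 1 and isinstance(args[0], list):
        return args[0]
    return list(args)


# The five calls from the example: f(), f(1), f([1]), f(1, 2), f(1, [2])
calls = [(), (1,), ([1],), (1, 2), (1, [2])]
for call in calls:
    print(call, "(a):", bind_always_variadic(*call), "(b):", bind_match_first(*call))
```

In this model only the third call, `f([1])`, differs between the two
readings: `[[1]]` under (a) and `[1]` under (b).
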

If there is such ambiguity, then this is one more reason to be able to have
this call:

f([1], *varargs* := [])



>
>>
>>>
>>> Note: There is an equivalent potential issue with map-variadic
>>> functions. Given a function `f($options as map(*))`, how will the function
>>> call `f($options := map {})` be interpreted? -- Is that passing an empty
>>> map to the map parameter (via the rule for naming parameters), or
>>> constructing a new `map { "options": map {} }` object (using the
>>> map-variadic rules)?
>>>
>>>
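
The two readings of `f($options := map {})` can be sketched with a small
Python model, where dicts stand in for XPath maps. Both function names are
hypothetical and exist only for this illustration.

```python
from typing import Any, Dict


def bind_named_parameter(param_name: str, named_args: Dict[str, Any]) -> Any:
    """Reading 1: a keyword that matches the parameter name binds that
    parameter directly."""
    return named_args[param_name]


def bind_map_variadic(named_args: Dict[str, Any]) -> Dict[str, Any]:
    """Reading 2: all keywords are collected into a new map, and that map
    becomes the value of the parameter."""
    return dict(named_args)


# Models the call f($options := map {})
call = {"options": {}}
print(bind_named_parameter("options", call))
print(bind_map_variadic(call))
```

Under the first reading the function receives an empty map; under the second
it receives a map with one entry whose key is "options".
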
>>>>
>>>>>
>>>>>> 2. If the only reason is that fn:concat() cannot be expressed/called
>>>>>> as an array-variadic function, can we have both: array-variadic and
>>>>>> sequence-variadic (the latter just for fn:concat() ) defined?
>>>>>>
>>>>>
>>>>> It can be expressed as either, however the value flattening for
>>>>> fn:concat is more intuitive when defined as a sequence-variadic function.
>>>>> See the explanation above.
>>>>>
>>>>
>>>> As per the conclusion above, that there is actually no ambiguity, a
>>>> call to fn:concat() cannot be expressed as an array-variadic call. I
>>>> suspect that this is the only standard function with this problem.
>>>> Therefore the raised question remains valid and needs an answer.
>>>>
>>>
>>> If an fn:concat function is array-variadic (provided array-variadic
>>> functions are supported), it can be implemented in terms of the `for member
>>> ...` expression, or another way of enumerating over the parameters. It is
>>> more complicated to get the same sequence-style flattening logic using
>>> arrays, but not impossible. That is, you could use something like:
>>>
>>>     declare (: %variadic("array") :) function fn:concat(
>>>         $args as array(xs:anyAtomicType?)
>>>     ) as xs:string {
>>>         string-join(for member $arg in $args return string-join($arg))
>>>     };
>>>
>>> and that would have the same semantics as a version that is
>>> sequence-variadic, which could be implemented as:
>>>
>>>     declare (: %variadic("sequence") :) function fn:concat(
>>>         $args as xs:anyAtomicType*
>>>     ) as xs:string {
>>>         string-join($args)
>>>     };
>>>
>>>
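
A rough Python model of the equivalence claimed above, assuming that `None`
stands in for the empty sequence and that each array member holds at most one
atomic value (as in `array(xs:anyAtomicType?)`). The helper names are
hypothetical, and only strings and integers are used so that Python's `str()`
does not diverge from XPath string conversion.

```python
from typing import Optional, Union

Atomic = Union[str, int]


def string_join(items) -> str:
    """Model of fn:string-join with an empty separator."""
    return "".join(str(item) for item in items)


def concat_array_variadic(*args: Optional[Atomic]) -> str:
    """Arguments become array members; each member is joined on its own,
    so an empty member contributes the empty string."""
    members = list(args)
    return string_join(
        string_join([] if member is None else [member]) for member in members
    )


def concat_sequence_variadic(*args: Optional[Atomic]) -> str:
    """Arguments are flattened into one sequence; empty sequences vanish
    before the join."""
    flattened = [arg for arg in args if arg is not None]
    return string_join(flattened)


print(concat_array_variadic("a", None, 1))
print(concat_sequence_variadic("a", None, 1))
```

Both functions return the same string for the same arguments in this model;
the array version simply needs the extra per-member join to reproduce the
flattening that sequences provide for free.
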
>> Great, then this means there is no reason why map-variadic function calls
>> should exist. All of these can be *record-variadic*, thus more amenable
>> to static type-checking.
>>
>
> I'm sorry, I don't follow your line of reasoning here. How does
> equivalence between *array*- and sequence-variadic functions lead to
> map-variadic functions not being needed? -- The map-variadic functions are
> for named arguments beyond those defined by parameter names, like the
> options in fn:serialize, whereas array/sequence-variadic functions are for
> positional arguments beyond the number of function parameters.
>
>
The statement is to use *record-variadic* function calls (call them
record-variadic and not map-variadic) -- nothing to do with using
array-variadic functions.

Is it clear now?




> Arrays and maps are not equivalent in XPath/XQuery like they are in
> languages like PHP where the array indices are keys in a map. Note: I
> wouldn't want XPath/XQuery to adopt that array and map equivalence as it
> prevents arrays being mapped onto an actual array/vector type in an
> implementation and would remove the option for a lot of possible
> optimizations.
>
>>
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>> 2. Is there any example of an existing standard function that cannot
>>>>>> be expressed/called as a record-variadic function, but can be called as a
>>>>>> map-variadic function?
>>>>>>
>>>>>
>>>>> I think it would be useful to define the map-variadic standard
>>>>> functions using RecordTests, as that would permit static-time error
>>>>> checking.
>>>>>
>>>>
>>>> Exactly, and this is why we should give it the precise name:
>>>> "record-variadic"!
>>>>
>>>
>>> But map/record-variadic functions are the same in how the argument names
>>> are built into a map (just like we don't have a record {...} constructor
>>> for record types).
>>>
>>
>> But the major advantage of record-variadic functions is the stricter
>> type-checking of the variadic arguments provided in the function call.
>>
>
> I agree that using a record offers stricter type-checking, and that is
> useful in the common case of using map-variadic functions for options or
> similar. However, that doesn't mean that we should require all named
> argument based variadics to use the record type. (See my replies below.)
>
>
>>
>>>
>>>
>>>
>>>>
>>>>> I don't think we should be restricting this to how these are used in
>>>>> standard functions (otherwise you end up with a situation like with
>>>>> annotations where users can add their own annotations, but the language
>>>>> doesn't provide mechanisms to take advantage of them without vendor
>>>>> extensions/functions). For example, someone could create a json-object
>>>>> function that is map-variadic as it can take any parameters. -- Given
>>>>> Michael Kay's comments about someone not being able to find an array
>>>>> function, it may be useful to add an fn:array and fn:map function, in which
>>>>> case that would be an example of a map-variadic function that cannot use a
>>>>> RecordTest.
>>>>>
>>>>
>>>> Sorry, you lost me here :(
>>>>
>>>
>>> If we only define/allow record-variadic functions because no
>>> map-variadic functions are defined for standard functions, that would
>>> prevent users from defining/using map-variadic functions if needed -- that
>>> is what I meant by "restricting this to how these are used in standard
>>> functions".
>>>
>>
>> I asked for an example of even one function that can be declared as
>> map-variadic but cannot be declared as record-variadic. If we have such an
>> example, then this would justify using map-variadic functions. If not
>> (which is the case at present), then there is no justification to declare a
>> function as map-variadic and not as record-variadic.
>>
>
> I gave you examples below:
>
>     (: Define a constructor function for maps, similar to the xs:*
>        functions, and fn:QName. :)
>     declare function fn:map($map as map(*)) { $map };
>
>     (: Serialize the map as XML, e.g. making the keys elements. :)
>     declare function map-to-xml($map as map(*)) as element() { (: ... :) };
>
>     (: A new example for building an XML element with any attributes. :)
>     declare function new-element($name as xs:string,
>                                  $attributes as map(*)) { (: ... :) };
>
> In these examples you cannot define them as record-variadic in a
> meaningful way. Specifically, you can call them like:
>
>     fn:map($a := 1), fn:map($b := 2)
>
> How do you define a record type for this? Specifically, any key/value
> pairs are allowed in this fn:map example so you would need something like:
> `record(optional?, *)`
>

This is exactly it: we can use a record in this example. But do we have an
example where we really cannot use a record for the variadic arguments?

From what I have read about the record type, using its extended form (with
'*') makes it possible to hold all variadic arguments in the record
instance.


> -- i.e. the record has an optional key called "optional" (but could be
> called anything, as the name/type of this key does not matter here), and is
> extensible so any other key/value combination is permitted. At that point,
> it is easier and more concise to just write `map(*)` for the parameter's
> type as it more accurately conveys the meaning.
>
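
To make the trade-off concrete, here is a small Python model of record
matching, assuming a record type is a set of named, optionally typed fields
plus an "extensible" flag for the trailing `*`. The classes `FieldSpec` and
`RecordType`, and the field names `method` and `indent`, are illustrative
assumptions only and are not taken from any draft.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FieldSpec:
    py_type: type = object   # object accepts any value
    optional: bool = False


@dataclass
class RecordType:
    fields: Dict[str, FieldSpec]
    extensible: bool = False  # True models the trailing '*'

    def matches(self, value: Dict[str, Any]) -> bool:
        # Declared fields must be present (unless optional) and well-typed.
        for name, spec in self.fields.items():
            if name not in value:
                if not spec.optional:
                    return False
            elif not isinstance(value[name], spec.py_type):
                return False
        # A closed record rejects keys that were never declared.
        if not self.extensible:
            return all(key in self.fields for key in value)
        return True


# A closed record: unknown or mistyped options are rejected.
strict = RecordType({
    "method": FieldSpec(str, optional=True),
    "indent": FieldSpec(bool, optional=True),
})

# Models record(optional?, *): one optional untyped field, otherwise open.
open_record = RecordType({"optional": FieldSpec(optional=True)}, extensible=True)

print(strict.matches({"method": "xml"}))
print(strict.matches({"metod": "xml"}))
print(open_record.matches({"a": 1}), open_record.matches({"b": 2}))
```

In this model the closed record catches a misspelled key, while the
extensible record accepts every string-keyed map, which is the same set of
values that `map(*)` restricted to string keys would accept.
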

"Convenience" can be dangerous -- in this case it means losing the focus on
static typing.


>
> This was my point about not constraining the language because the
> requirement is not there in the standard functions. That is, just because
> all the standard functions may be able to be defined as record types does
> not mean that we should be restricting this to just record types.
>
> ---
>
> Requiring the variadic functions that allow named arguments to be record
> types breaks the goal of this being able to be used with existing functions
> (e.g. user-defined functions) that take a map-based options parameter. --
> This also applies to third-party functions from libraries/packages, where a
> user has less control over updating them.
>
>
Actually there is no such "break" -- just use the "extended" record type
(with '*').


> Kind regards,
> Reece
>

Thanks, Reece!
Dimitre
Received on Monday, 14 December 2020 14:48:44 UTC