Re: Function naming: Problems and proposed solution from Michael Kay on 2020-11-28 (public-xslt-40@w3.org from November 2020)

From: Michael Kay <mike@saxonica.com>
Date: Sat, 28 Nov 2020 11:40:30 +0000
To: Dimitre Novatchev <dnovatchev@gmail.com>
Cc: public-xslt-40@w3.org
Message-Id: <CB2D59EA-0C7C-4418-B24B-CE8D5C009258@saxonica.com>
Other languages solve this problem by disambiguating functions based on the type of the arguments. You don't need different names for map:put() and array:put() if you know that the target of the function is an array or a map. So I've wondered if there's something we can do so that 

$m => put('a', 23)

is resolved to map:put from knowledge of the type of $m. This wouldn't necessarily have to imply a static binding; we could have dynamic semantics for resolving the unqualified name "put" in this context, with static resolution being an optimisation.

I've suggested that rather than unprefixed function names being resolved to a single default namespace, it should be possible to declare a set of namespaces whose functions can be referenced without a prefix, and the challenge then is how to resolve conflicts. One way of doing this is to say that an ambiguous unprefixed name (like put, which can be map:put or array:put) resolves to a "meta-function" that does dynamic despatch based on the dynamic type of the first argument. Perhaps this should only be allowed if the candidate functions have disjoint declared types for their first argument (which rules out resolving fn:remove#2 vs map:remove#2 this way...). In many cases of course it will still be possible to do the binding statically based on inferred types.
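
As a rough illustration only, such a meta-function can be emulated by hand in XPath 3.1 today. The variable name $put and the error message are made up for this sketch, and the prefixes map and array are assumed to be bound to the standard namespaces; it is not a proposed syntax.

```xquery
(: Hypothetical hand-written "meta-function": dispatches on the dynamic
   type of its first argument. map(*) and array(*) are disjoint types,
   so at most one branch can match. :)
let $put := function($target as item(), $key as xs:anyAtomicType, $value as item()*) as item() {
    if ($target instance of map(*)) then
      map:put($target, $key, $value)
    else if ($target instance of array(*)) then
      array:put($target, $key treat as xs:integer, $value)
    else
      error((), 'put: first argument must be a map or an array')
  }
return (
  map { 'x': 1 } => $put('a', 23),   (: behaves like map:put :)
  ['p', 'q'] => $put(2, 'z')         (: behaves like array:put :)
)
```

The point of the proposal is that the processor would generate this despatch itself for an unprefixed put, and could replace it with a direct call whenever the type of the first argument is known statically.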


Michael Kay
Saxonica

> On 28 Nov 2020, at 11:16, Michael Kay <mike@saxonica.com> wrote:
> 
> I think there are certainly cases where this approach could be useful. And it ties in with what load-xquery-modules() does. The main drawback is that without some language support, there's no static checking of the function names or of their signatures.
> 
> Tuple types - or Record types as I'm now proposing to call them - possibly provide a way forward on this. For example the math functions could be declared as an instance of a record type
> 
> record ( pi as function() as xs:double,
>               sin as function(xs:double) as xs:double,
>               ... etc )
> 
> and if $math is statically known to conform to this type, then $math?sin is statically known to be a function with a particular signature. (But note, it's $math?sin, not $math?sin#1).
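> 
> A minimal sketch of the idea in XPath 3.1 as it stands, using an ordinary map because record types are only a proposal at this point. The prefix math is assumed to be bound to the standard math namespace, and the map holds just two entries for illustration.
> 
> ```xquery
> (: A map whose entries are function items; the arity is fixed by the
>    function item stored under each key, which is why the lookup is
>    $math?sin and not $math?sin#1. :)
> let $math := map {
>     'pi':  math:pi#0,
>     'sin': math:sin#1
>   }
> return ($math?pi(), $math?sin(0))
> ```
> 
> Without a declared record type the processor only knows that $math is a map, so a misspelt key such as $math?sine is not detected statically.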
> 
> At present I'm not sure that this gives much benefit over using multiple namespaces, and it's a bit hard to see how to introduce it alongside the existing mechanisms without ending up with a rather clumsy mixture of different approaches coexisting.
> 
> I've been trying to find a way that still uses namespaces, but makes them more flexible. For example, by separating the namespace context for functions from that for elements, and/or making the binding of prefixes to namespaces more adaptable - for example by allowing functions from multiple namespaces to be used without a prefix provided the reference is unambiguous. I've also wondered about attaching special meaning to "." in a function name, so for example you can refer to the math functions as math.sin(x) without needing to bind a namespace prefix, and with some kind of mechanism for dropping the "math" prefix if it's not needed for disambiguation.
> 
> Michael Kay
> 
>> On 28 Nov 2020, at 00:48, Dimitre Novatchev <dnovatchev@gmail.com> wrote:
>> 
>> 1.  Summary
>> The problem of picking the best names and signatures for standard XPath functions is described.
>> 
>> A detailed solution is presented that rests on fundamental XDM concepts, is easy to implement, and has been implemented and used in practice.
>> 
>> 2.  The problem.
>> Here is what Michael Kay shared in the Xml.com chat <https://xmlcom.slack.com/archives/C011NLXE4DU/p1606514583065000?thread_ts=1606411737.480000&cid=C011NLXE4DU>:
>> 
>> “Naming of functions is a significant problem -- mainly because the namespace mechanism is grossly inadequate for the task of resolving ambiguous references; it's too "all-or-nothing". Yes, it's important that functions do what it says in the name (people misuse "contains" all the time because it's a poor choice of name). I'd love to find a good solution to this problem.”
>> 
>> The number of standard XPath 3.1 functions is 197. They reside in an essentially flat space, and remembering that a given function even exists, or inferring from its name what it does, has become very challenging and unfriendly.
>> 
>> Even when using namespaces for naming, we still have a huge number of functions in the same namespace. Namespaces allow only flat sets of functions.
>> 
>> 
>> Let’s look at other programming languages. A typical example is C#. In C# a namespace can contain not only individual items, such as classes, interfaces, structures, delegates, etc., but, most importantly, nested namespaces. Thus the namespace mechanism in C# provides a hierarchical, not flat, structuring of object names.
>> 
>> It becomes possible to have two methods with the same name; here are a few examples:
>> 
>> Method    Class    Namespace                   Remarks
>> --------  -------  --------------------------  -------
>> Contains  string   System
>> Contains  List<T>  System.Collections.Generic
>> IndexOf   string   System
>> IndexOf   List<T>  System.Collections.Generic
>> IndexOf   Array    System
>> Etc …     …        …
>>  
>> This document contains a visual representation of the .NET 4.0 namespace hierarchy: http://download.microsoft.com/download/E/6/A/E6A8A715-7695-493C-8CFA-8E0C23A4BE1D/098-115952-NETFX4-Poster.pdf
>> 
>> <image.png>
>> 
>> Unfortunately, XML namespaces were not designed with hierarchy support in mind.
>> 
>> Then what to do?
>> 
>> 3.  Proposed solution.
>> Starting with XPath 3.1 we have a powerful mechanism for creating and accessing hierarchies of functions: maps.
>> 
>> To represent something corresponding to a .NET namespace, one can use a map whose keys are either names of functions or names of other maps. A nested map corresponds either to a class (a group of related functions) or to a namespace contained in the current namespace.
>> 
>> Thus we can have several functions named Contains that reside in different maps.
>> 
>> With this mechanism we can construct a hierarchy of any desired level of detail for grouping just small sets of functions, so that there would be no name-conflicts within that group of functions. And we can have names like Contains, Add, Remove, Get, Replace etc. in as many such groups as necessary, without worrying about name conflicts, overloads, resolving ambiguous references, etc.
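>> 
>> A minimal sketch of such a hierarchy in plain XPath 3.1. The names $lib, String and List are invented for this illustration and are not part of any existing library.
>> 
>> ```xquery
>> (: Two groups, each with its own Contains; the enclosing map plays
>>    the role of a namespace and the nested maps the role of classes. :)
>> let $lib := map {
>>     'String': map {
>>       'Contains': function($s as xs:string, $sub as xs:string) as xs:boolean {
>>         contains($s, $sub)
>>       }
>>     },
>>     'List': map {
>>       'Contains': function($items as xs:anyAtomicType*, $x as xs:anyAtomicType) as xs:boolean {
>>         $x = $items
>>       }
>>     }
>>   }
>> return (
>>   $lib?String?Contains('namespace', 'space'),   (: true :)
>>   $lib?List?Contains((1, 2, 3), 4)              (: false :)
>> )
>> ```
>> 
>> The two Contains functions never clash because each is reached through a different path of map lookups.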
>> 
>> We can add to XPath and other languages (XSLT, XQuery) a mechanism for making available all functions and sub-maps in a given map.
>> 
>> For example in my work I have implemented an “extension function” loadFunctionLibraries, and here is a snapshot of its usage in real code:
>> 
>> 
>> (: ================ Include operators.xpath ======================== :)
>> let $ops := Q{http://www.fxpath.org}loadFunctionLibraries#1(concat($pBaseFXpathUri, '/f/operators.xpath')),
>> 
>> (: ================ Include folds.xpath ======================== :)
>>     $folds := Q{http://www.fxpath.org}loadFunctionLibraries#1(concat($pBaseFXpathUri, '/f/folds.xpath')),
>> 
>> (: ================ Special folds ======================== :)
>>     $and := function ($booleans as xs:boolean*) as xs:boolean
>>     {
>>       $folds?foldl_($ops?conjunction, true(), $booleans)
>>     },
>> 
>> 
>> 
>> Here we see how the current XPath code loads the contents of two other XPath files: operators.xpath and folds.xpath.
>> 
>> If loadFunctionLibraries loads an XPath file that itself calls loadFunctionLibraries, the nested call is executed as well and loads that file's dependencies, and so on.
>> 
>> As we see, this is not a purely theoretical proposal. For some time I have been creating pure XPath functions, all grouped in maps, and I have never even thought about name conflicts, because no such problem exists between functions contained in different maps.
>> 
>>  
>> Conclusion:
>> The standard function library has surpassed a threshold where it has become challenging to name functions and give them the right signatures without conflict.
>> 
>> The solution presented here would allow the same naming flexibility as modern programming languages with hierarchical namespaces.
>> 
>> 
>> HTH,
>> Dimitre
>> 
>> P.S.
>> 
>> If for some reason this message is not readable, please see the original .docx file attached.
>> <HierarchicalFunctionNamingProposal.docx>
>
Received on Saturday, 28 November 2020 11:40:50 UTC