Re: Function naming: Problems and proposed solution from Dimitre D.Novatchev on 2020-11-28 (public-xslt-40@w3.org from November 2020)

From: Dimitre D.Novatchev <dnovatchev2@gmail.com>
Date: Sat, 28 Nov 2020 11:14:03 -0800
To: Michael Kay <mike@saxonica.com>
Cc: Dimitre Novatchev <dnovatchev@gmail.com>, public-xslt-40@w3.org
Message-ID: <CAN0P7_OKrfEEqdJ9=-OqzPwht23=QP0CeLKrB8odSPme9rJSJw@mail.gmail.com>
On Sat, Nov 28, 2020 at 3:16 AM Michael Kay <mike@saxonica.com> wrote:



> Tuple types - or Record types as I'm now proposing to call them -
> possibly provide a way forward on this. For example the math functions
> could be declared as an instance of a record type
>
> record ( pi as function() as xs:double,
>               sin as function(xs:double) as xs:double,
>               ... etc )



Thank you Dr. Kay for your positive response.



> and if $math is statically known to conform to this type, then $math?sin
> is statically known to be a function with a particular signature. (But
> note, it's $math?sin, not $math?sin#1).



*It is amusing how easy this is to do*:



Suppose we have a map $Strings, one of whose members is the map $replace:

let $replace := map {
            '_3' : replace#3,
            '_4' : replace#4
            }
return
   $replace?_3 ( 'abbbcdbbbba', 'b', 'B')



produces exactly the wanted result:



aBBBcdBBBBa





We can certainly add a little language support, for example by allowing
the key specifier on the right-hand side of ? to contain the # character.
Then we could even write this:





let $replace := map {
            '#3' : replace#3,
            '#4' : replace#4
            }
return
   $replace?#3 ( 'abbbcdbbbba', 'b', 'B')



We can go a little further and specify that an expression such as:


  $replace('abbbcdbbbba', 'b', 'B')



is evaluated as follows:



"*When a map contains only overloads of a function (with keys such as
'#3', '#4') and is called as a function without specifying arity, the arity
of the actual call determines which member of the map is the function to be
invoked*."



In the above case the arity used in the function invocation is 3, thus
replace#3  will be invoked and we get the same correct result:



  aBBBcdBBBBa
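
The proposed rule can already be approximated in XPath 3.1 without any new
syntax. Here is a minimal sketch, assuming the overloads are keyed by
integer arity rather than by '#3' and '#4', and using a hypothetical helper
$call that is not part of any specification:

```xquery
let $replace := map {
            3 : replace#3,
            4 : replace#4
            },
    (: Hypothetical helper: select the overload whose key equals
       the number of supplied arguments, then invoke it :)
    $call := function($overloads as map(*), $args as array(*)) as item()* {
            apply($overloads(array:size($args)), $args)
            }
return
   $call($replace, ['abbbcdbbbba', 'b', 'B'])
```

This should again give aBBBcdBBBBa. The sketch relies only on fn:apply and
array:size from the 3.1 function library, and assumes the host language
binds the array prefix to its standard namespace.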



> it's a bit hard to see how to introduce it alongside existing mechanism
> without ending up with a rather clumsy mixture of different approaches
> coexisting



Grouping functions within maps is very flexible. A function can be
referenced through a map and, at the same time, referenced directly
(globally), as in all versions of XPath up to 3.1.

Thus there is no backwards compatibility problem.
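
To illustrate, here is a small sketch in which the same function is reached
both through a map and directly. The layout of the $Strings map is only an
assumption made for this example:

```xquery
let $Strings := map {
            'replace' : map { '_3' : replace#3, '_4' : replace#4 }
            }
return
   (: both references denote the same function, so the comparison is true :)
   $Strings?replace?_3('abbbcdbbbba', 'b', 'B')
      eq replace('abbbcdbbbba', 'b', 'B')
```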



Versions of XPath after 3.1 will provide the   $fn:Functions   map, which
represents a hierarchy (actually a forest) grouping all the functions
defined in the F&O specification. Starting with the next version of XPath,
people will use this "SysMap" because it is easy and intuitive to learn and
navigate (a semantic hierarchy) and it minimizes the possibility of name
and signature conflicts.

So there is not going to be a "clumsy mixture", because there is not going
to be a mixture at all: code written for the older versions uses global
referencing, while code written for the new versions relies heavily on the
convenience of the SysMap function grouping.
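
A rough sketch of what a fragment of such a map could look like, written as
an ordinary XPath 3.1 variable. The name $Functions, the group names Strings
and Math, and the '_3' and '_4' keys are assumptions made for illustration;
none of them is defined by any specification:

```xquery
let $Functions := map {
            'Strings' : map {
                'contains' : contains#2,
                'replace'  : map { '_3' : replace#3, '_4' : replace#4 }
                },
            'Math' : map {
                'pi'  : math:pi#0,
                'sin' : math:sin#1
                }
            }
return
   ( (: navigate the hierarchy, then call the function found there :)
     $Functions?Strings?replace?_3('abbbcdbbbba', 'b', 'B'),
     $Functions?Strings?contains('abc', 'b'),
     $Functions?Math?sin(0e0) )
```

Older code can keep calling replace() and math:sin() directly. The sketch
assumes that the math prefix is bound to the standard math namespace in the
host language.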





I would like to thank everyone for their support and would appreciate your
further comments and suggestions.



Thanks,

Dimitre



P.S. Again, if this is difficult to read when browsing the W3 mailing list,
please use the attached .pdf file,
or download it from:
https://github.com/dnovatchev/FXSL-XSLT2/raw/master/HierarchicalFunctionNamingProposal-2.pdf



> I think there are certainly cases where this approach could be useful. And
> it ties in with what load-xquery-module() does. The main drawback is that
> without some language support, there's no static checking of the function
> names or of their signatures.
>
> Tuple types - or Record types as I'm now proposing to call them - possibly
> provide a way forward on this. For example the math functions could be
> declared as an instance of a record type
>
> record ( pi as function() as xs:double,
>               sin as function(xs:double) as xs:double,
>               ... etc )
>
> and if $math is statically known to conform to this type, then $math?sin
> is statically known to be a function with a particular signature. (But
> note, it's $math?sin, not $math?sin#1).
>
> At present I'm not sure that this gives much benefit over using multiple
> namespaces, and it's a bit hard to see how to introduce it alongside
> existing mechanism without ending up with a rather clumsy mixture of
> different approaches coexisting.
>
> I've been trying to find a way that still uses namespaces, but makes them
> more flexible. For example, by separating the namespace context for
> functions from that for elements, and/or making the binding of prefixes to
> namespaces more adaptable - for example by allowing functions from multiple
> namespaces to be used without a prefix provided the reference is
> unambiguous. I've also wondered about attaching special meaning to "." in a
> function name, so for example you can refer to the math functions as
> math.sin(x) without needing to bind a namespace prefix, and with some kind
> of mechanism for dropping the "math" prefix if it's not needed for
> disambiguation.
>
> Michael Kay
>
> On 28 Nov 2020, at 00:48, Dimitre Novatchev <dnovatchev@gmail.com> wrote:
>
> *1.  **Summary*
>
> The problem of picking the best names and signatures for standard XPath
> functions is described.
>
> A detailed solution is presented that rests on fundamental XDM concepts,
> is easy to implement, and has been implemented and used in practice.
>
> *2.  **The problem.*
>
> Here is what Michael Kay shared in the *Xml.com chat
> <https://xmlcom.slack.com/archives/C011NLXE4DU/p1606514583065000?thread_ts=1606411737.480000&cid=C011NLXE4DU>*:
>
>
> “Naming of functions is a significant problem -- mainly because the
> namespace mechanism is grossly inadequate for the task of resolving
> ambiguous references; it's too "all-or-nothing". Yes, it's important that
> functions do what it says in the name (people misuse "contains" all the
> time because it's a poor choice of name). I'd love to find a good solution
> to this problem.”
>
> The number of standard XPath 3.1 functions is 197. They reside in
> essentially a flat space, and remembering that a given function even
> exists, or inferring from its name what it does, has become very
> challenging and unfriendly.
>
> Even when using namespaces for naming, we still have a huge number of
> functions in the same namespace. Namespaces allow only flat sets of
> functions.
>
> Let’s look at other programming languages. A typical example is C#. In C#
> a namespace can contain not only individual items, such as classes,
> interfaces, structures, delegates, etc., but most importantly, a namespace
> can contain nested namespaces within itself. Thus the namespace mechanism
> in C# provides a hierarchical, not flat, structuring of names.
>
> It becomes possible to have two methods with the same name; here are a few
> examples:
>
> Method     Class     Namespace                    Remarks
> Contains   string    System
> Contains   List<T>   System.Collections.Generic
> IndexOf    string    System
> IndexOf    List<T>   System.Collections.Generic
> IndexOf    Array     System
> Etc …      …         …
>
> This document contains a visual representation of the .NET 4.0 namespace
> hierarchy:
> http://download.microsoft.com/download/E/6/A/E6A8A715-7695-493C-8CFA-8E0C23A4BE1D/098-115952-NETFX4-Poster.pdf
>
>
> <image.png>
>
>
> Unfortunately, XML namespaces were not designed with hierarchy support in
> mind.
>
> What to do, then?
>
> *3.  **Proposed solution.*
>
> Starting with XPath 3.1 we have a powerful mechanism for creating and
> accessing hierarchies of functions.
>
> To represent something corresponding to a .NET namespace, one can use a map
> whose keys are either names of functions or names of other maps. A nested
> map corresponds either to a class (a group of related functions) or to a
> namespace that is contained in the current namespace.
>
> Thus we can have several functions named Contains that reside in
> different maps.
>
> With this mechanism we can construct a hierarchy of any desired level of
> detail for grouping just small sets of functions, so that there would be no
> name-conflicts within that group of functions. And we can have names like
> Contains, Add, Remove, Get, Replace etc. in as many such groups as
> necessary, without worrying about name conflicts, overloads, resolving
> ambiguous references, etc.
>
> We can add to XPath and other languages (XSLT, XQuery) a mechanism for
> making available all functions and sub-maps in a given map.
>
> For example in my work I have implemented an “extension function”
> loadFunctionLibraries, and here is a snapshot of its usage in real code:
>
> (: ================ Include operators.xpath ========================:)
> let $ops := Q{http://www.fxpath.org}loadFunctionLibraries#1(
>                 concat($pBaseFXpathUri, '/f/operators.xpath')),
>
> (: ================ Include folds.xpath ========================:)
>     $folds := Q{http://www.fxpath.org}loadFunctionLibraries#1(
>                 concat($pBaseFXpathUri, '/f/folds.xpath')),
>
> (:    Special Folds
>    ====================================================================:)
>     $and := function ($booleans as xs:boolean*) as xs:boolean
>     {
>       $folds?foldl_($ops?conjunction, true(), $booleans)
>     },
>
>
> Here we see how the current XPath code loads the contents of two other
> XPath files: operators.xpath and folds.xpath.
>
> If loadFunctionLibraries loads an XPath file that itself calls
> loadFunctionLibraries, that call is also executed and loads its own
> dependencies, and so on…
>
> As we see, this is not a purely theoretical proposal. For some time I have
> been creating pure XPath functions, all grouped in maps, and I have never
> even thought about name conflicts, because no such problem exists between
> functions contained in different maps.
>
>
> *Conclusion*:
> The standard function library has surpassed a threshold where it has
> become challenging to name functions and give them the right signatures
> without conflict.
>
> The solution presented here would allow the same naming flexibility as
> that of modern programming languages with hierarchical namespaces.
>
> HTH,
> Dimitre
>
> P.S.
>
> If for some reason this message is not readable, please see the
> original .docx file attached.
> <HierarchicalFunctionNamingProposal.docx>
>
>
>
Attachments

application/pdf attachment: HierarchicalFunctionNamingProposal-2.pdf
Received on Saturday, 28 November 2020 19:15:39 UTC