SAG-FO-01: Too many functions (general comment)

Software AG believes that the size of the function library defined in the
Functions and Operators document is too large. 

The size of the library directly affects the cost of implementation, which
experience shows is the major factor determining the success or failure of a
proposed standard. It also affects the cost of learning to use the language,
which is another important predictor of success. 

It is clearly proving difficult for the working group to debug the
specification completely: reducing the number of functions will greatly
assist this task.

Between CR and final Recommendation the working groups will be required to
demonstrate interoperability of implementations. A large function library
greatly increases the difficulty of this task, and means it will take longer
to get to Recommendation status. Again, any further delays in getting the
specifications finished reduces the chance that they will succeed in the
market.

Giving users several ways to express the same thing makes it much more
difficult for optimizers to detect common coding patterns and optimize them.
A large function library therefore reduces the chance that common constructs
will be well optimized.

Software AG proposes a number of simplifications to the function library.
This message considers the general-purpose functions. A second message will
address the functions concerned with handling of dates, times, and
durations.

Proposal: remove the following functions

A1. fn:context-item() 

This is a trivial synonym for ".". No-one is likely to use it.

A2. fn:current-date() and fn:current-time()

These can be trivially derived from fn:current-dateTime()

A3. fn:deep-equal()

This is semantically complex and unlikely to meet everyone's needs for
comparing equality. Users can write their own version to match their own
needs. The use of deep-equal() in distinct-values() can be replaced by
making distinct-values a (higher-order) operator instead of a function:

    EXPR "with" "distinct" EXPR ("," EXPR )*

A4. fn:distinct-nodes()

This is likely to be needed very rarely and can be trivially written as $x/.

A5. fn:empty()

This is trivial syntactic sugar for count($x)=0, or not($x). The name is
also misleading, since other XML specifications use the word "empty" to mean
"having no children": users will make the mistake of thinking that empty(E)
is true if E is an element with no children.

A6. fn:ends-with()

The effect of this function can be readily achieved using the fn:matches()
function with a regular expression.

A7. fn:exists()

This is trivial syntactic sugar for count($x)!=0, or boolean($x).

A8. fn:index-of()

This is a rarely-needed function, and index-of($s, $v) can be readily
rewritten as

    for $i in 1 to count($s) return
      if ($s[$i] eq $v) then $i else ()

A9. fn:insert-before()

The most common requirement is to insert at the start or the end, which can
be achieved using the "," operator. The rare requirement to insert in the
middle can be achieved using

    ($target[position() lt $position], $inserts, $target[position() ge
$position])

A10. fn:item-at()

The effect of this function can be easily achieved using a numeric
predicate.

A11. fn:node-kind()

It is easy to test the kind of a node using "instance of" (or "type switch"
in XQuery)

A12. fn:remove()

The effect of this function can be readily achieved by writing
$target[position() ne $n]

A13. fn:sequence-node-identical()

This function is very rarely required; if it is ever needed, it can be
written as

    every $i in 1 to max(count($arg1), count($arg2)) 
        satisfies ($arg1[$i] is $arg2[$i])

A14. fn:string-pad()

This function can be readily written as

   string-join(for $i in (1 to $padCount) return $padString, "")

A15. fn:subsequence()

Both variants of this function can be readily written using positional
predicates:

   $sourceSeq[position() ge $startLoc]
   $sourceSeq[position() ge $startLoc and position() lt $startloc + $length]

Michael Kay
Software AG



 

Received on Tuesday, 10 June 2003 11:49:44 UTC