Proposed note for insert and new functions for XForms 2.0

Dear XForms working group,

In the past, it has been claimed that the 'iterate' attribute on an XForms 
action is desirable because it is simpler and easier to understand than 
using 'while' on for-each style iteration.
This is true, but it is not the full extent of its value to XForms 
actions. 
XPath predicates are able to look and behave like array indexes, but they 
are not as performant.  If there are N elements e, then E[1] is an O(N) 
expression.  As a result, a simple forward iteration through a nodeset 
using 'while' is an O(N^2) operation, whereas it is only O(N) using 
iterate.

This is helpful, but it quickly becomes pointless when trying even a 
relatively trivial mutation algorithm, like a simple selection sort (or 
the far less efficient and much reviled bubble sort).

The insert action has an important special behavior if you provide a 
context attribute and no nodeset (or bind) attribute.  It will insert the 
origin node as the first child of the context node.
This behavior was deliberately designed to maximize the likelihood that an 
implementation could perform the insertion in constant time.

RECOMMENDATION:  Add a note indicating that implementations SHOULD achieve 
O(1) performance on this operation.

Given this recommendation, it is possible to sort in O(N^2) time rather 
than a dismal O(N^3) time.

Currently, the delete action can directly delete an indicated node, but 
there is no way to indicate the first child of a node in constant time.
Also, there is no way to efficiently indicate in constant time the first 
child node in order to use it as the origin of an insert or to simply 
obtain a value from it.
A stack push operation correspond to the insert action above, and a stack 
pop operation requires an insert/delete to move the first element to a new 
location outside the stack, where it can be efficiently queried. 

RECOMMENDATION: Add a function with signature along the lines of "nodeset 
first-child(nodeset?, boolean?)" that returns the first child node of a 
given node, or the context node if no first parameter.  The function 
SHOULD achieve O(1) on the operation.  By default ignore whitespace text 
nodes, but if second parameter given and true, then include them in 
result.

With this function implemented efficiently (and the efficient insert 
above), N stack operations will take O(N) processing time, and without it, 
the result is O(N^2).

Efficiently implementing queue-based algorithms requires the ability to 
push on one side of a list and pop from the other. 
Given the ability described above to efficiently insert at the beginning 
of the list, it seems we only need a way to efficiently identify the last 
child of a node in constant time.

RECOMMENDATION: Add a function with signature along the lines of "nodeset 
last-child(nodeset?, boolean?)" that returns the last child node of a 
given node, or the context node if no first parameter.  The function 
SHOULD achieve O(1) on the operation. By default ignore whitespace text 
nodes, but if second parameter given and true, then include them in 
result.

Incidentally, the last-child() function also allows efficient insertion at 
the end of a list of children of a node by using it in the insert nodeset.
This is helpful in creating an efficient sort that constructs its result 
in least to greatest order.
The efficient first-child() and last-child() functions also appear to 
enable optimal O(N log N) sorting via XForms action loops, when used in 
combination with the iterate attribute.  Without having at least 
first-child(), the best sort performance appears to be O(N log N log log 
N).

A number of more useful data structures become possible to implement if it 
is also possible to obtain the parent, next sibling, and previous sibling 
of a node in constant time.  These functions would prove convenient in 
implementing some sorting methods in the most natural ways, but efficient 
versions of several comparison-based data structures (e.g. various 
priority queues, trees and graphs) appear to become possible as well.

RECOMMENDATION: Add functions with signatures along the lines of 
"nodeset parent(nodeset?)" returns the parent of a given node, or the 
context node if no first parameter.
"nodeset next-sibling(nodeset?, boolean?)" returns the next sibling of a 
given node, or the context node if no first parameter.
"nodeset prev-sibling(nodeset?, boolean?)" returns the previous sibling of 
a given node, or the context node if no first parameter.

These functions SHOULD achieve O(1) on the operation. For latter two 
functions, ignore whitespace text nodes by default, but if second 
parameter given and true, then include them in result.

For all of the functions, clearly if the first, last, next, previous or 
parent node doesn't exist, then an empty nodeset is returned.

Finally, the eval() function is already on the list for XForms 2.0 with 
the use case from Michael Sperberg-McQueen of being able to use a repeat 
to drill into and edit arbitrarily structured XML.  A second use case for 
the function pertains to algorithms like sorting as well as 
comparison-based data structure management.  Given an eval() function, it 
would be possible to allow a form author to provide the comparison 
expression as a string so that the sort or comparison data structure could 
compare strings, numbers, dates or even secondary keys in case of primary 
key equality.  This would help separate the algorithm or data structure 
from the application details.

RECOMMENDATION: Add function with signature along lines of "object 
eval(string, nodeset?, number?)" that evaluates an xpath given in the 
string using the evaluation context of the eval() function, or in the 
context of a nodeset and a node at a given position, if those params are 
specified.

These functions look very easy to implement at the optimal level, and add 
a lot of power to XForms actions, so I'm looking at them as extension 
functions for now, and I'd be happy to do the spec work for XForms 2.0 if 
these functions are acceptable to the group.

Thanks,
John M. Boyer, Ph.D.
Distinguished Engineer, IBM Forms and Smarter Web Applications
IBM Canada Software Lab, Victoria
E-Mail: boyerj@ca.ibm.com 

Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
Blog RSS feed: 
http://www.ibm.com/developerworks/blogs/rss/JohnBoyer?flavor=rssdw

Received on Wednesday, 28 September 2011 21:51:52 UTC