W3C home > Mailing lists > Public > public-forms@w3.org > September 2011

Re: Proposed note for insert and new functions for XForms 2.0

From: John Boyer <boyerj@ca.ibm.com>
Date: Fri, 30 Sep 2011 10:07:27 -0700
To: Nick Van den Bleeken <Nick.Van.den.Bleeken@inventivegroup.com>
Cc: "<public-forms@w3.org>"<public-forms@w3.org>
Message-ID: <OF087BD3E2.82F51792-ON8825791B.005A8575-8825791B.005E1143@ca.ibm.com>
Hi Nick,

This would at least help with getting the guarantees that would be useful 
to authors.
However, XForms is a consumer of the XPath technology, and XPath defines 
the ways in which it can be extended, namely by adding functions.  It is 
best to minimize the extent to which implementers have to hack the 
consumed XPath processor, particularly when their is a defined extension 
mechanism that can achieve the same effect.

As for xforms authors not having to worry about performance, they already 
do that when choosing the kinds of xforms constructs to use, not to 
mention the kinds of xpath expressions to use or to avoid. 

It's also interesting that you chose next-sibling::*[1] because I was 
originally thinking that I really wanted foo[2], foo[3], foo[last()-1], 
and foo[last()-2] to be constant time, or more generally that foo[k] and 
foo[last()-k] would be O(k).

So, in one processor someone would write foo[3] and get good performance, 
but in another they'd get poor performance unless they changed their form 
to use foo[1]/next-sibling::*[1]/next-sibling::*[1], at which point the 
form loses performance in the first processor.  Next up, I bet users would 
ask why foo[number(some/node)] doesn't achieve constant performance when 
some/node contains 1.

Given that functions are the defined extension point, it sure seemed like 
a good idea to use that mechanism because then you also have a single, 
documented way for an author to achieve the particular effect they want, 
if they want it.  The function gets the author to say exactly what they're 
doing so that we don't have to write an AI system to discover the endless 
sequence of similar xpath expression patterns that people might think of 
to do a specific thing. 

Put another way, if we're going to hack the xpath processor, why not say 
that *all* numeric predicates should get constant performance?  foo[n/2] 
is constant in every other language than xpath, so why not xpath too?  The 
answer, of course, is that you can't consistently get this performance 
across all xpath axes without blowing memory usage out to O(n^2), which of 
course takes that long to initialize. The egregious one is 
descendant::node()[n/2]. Getting constant time on that operation for all 
nodes is just not going to happen.

The functions also give authors something consistent to rely on. If you 
care about performance, use these functions.  If there's not a function, 
then there's not an optimized way of doing it.  From a language 
consistency standpoint, this seems better than suggesting "use a predicate 
and hope it is one of the ones that your chosen implementation optimized".

Cheers,
John




From:   Nick Van den Bleeken <Nick.Van.den.Bleeken@inventivegroup.com>
To:     John Boyer/CanWest/IBM@IBMCA
Cc:     "<public-forms@w3.org>" <public-forms@w3.org>
Date:   09/30/2011 05:58 AM
Subject:        Re: Proposed note for insert and new functions for XForms 
2.0
Sent by:        public-forms-request@w3.org



John,

I might be a bit biased because I use XPath on a day-to-day basis and 
really like it ;)

But I'm not really in favor of adding functions that get the evaluation 
speed you would expect (and love) and their dual native XPath expressions 
that behave poorly. If you are concerned that implementations will behave 
not optimal for foo[1], foo[last()], next-sibling::*[1] and 
previous-sibling::*[1] we could add language to the spec saying that an 
implementation 'should' evaluate expressions like those in a constant 
time. On the other hand I know that for example Saxon does a lot effort 
rewriting/simplifying XPath expressions when staticly analyzing the XPath 
expression that go way beyond the simple optimizations that you are 
requesting.

I won't like it that you have to know all 'special' XForms extension 
functions, and know when to use them, to get optimal performance. One of 
the nice things of XForms is, that  the author doesn't have to worry for 
performance, because the implementation optimizes the form (I have to 
admit that there are implementations that do a better job in optimizing 
forms then others, but due to a healthy competition others will follow 
automatically).

Kind regards,

Nick Van den Bleeken
R&D Manager

Phone: +32 3 821 01 70
Office fax: +32 3 821 01 71
nick.van.den.bleeken@inventivegroup.com
www.inventivedesigners.com




On 29 Sep 2011, at 22:01, John Boyer wrote:

Hi Nick, 

It is better to think about the problem from the viewpoint of the XForms 
author, not the XForms implementer. 

W3C usually doesn't discuss performance that much; it was once suggested 
to me that they view it as a market differentiator for implementers, but 
that's a frustrating approach for the consumer of the standards-based 
technology. 

>From the consumer's viewpoint, I'd like features in the language that I 
can directly express that are *guaranteed* to produce consistent results 
across implementations, where consistent means not just information 
results but also user experience. 

As an XForms author I can't rely on all implementers to make certain 
optimizations that are not called out in the specification.  I'd like to 
have language features that I can rely on, or at the very least that I 
SHOULD be able to rely on (in the RFC 2119 sense). 

If foo[1] is a constant time operation in my favorite processor, and a 
linear time operation in a number of other processors, then I might 
implement a feature like table sorting that works almost instantly in one 
processor but takes 5 seconds to complete in another.  The results of both 
processors are informationally the same (which is what the W3C almost 
exclusively focuses on), but they are not consistent from a user 
experience standpoint. 

On a more technical level, it seemed like you were OK with adding 
last-child(), as you didn't say "same question" on that one. Well, it 
turns out that I justified first-child() before last-child() because it 
seemed like the right order and made sense for talking about stacks, but 
the one I really need most for sorting is last-child(), and if you're up 
for that, then I ought to be able to get first-child() just out of a need 
for language consistency. So let me assume that you're also suggesting 
that my implementation should also optimize foo[last()] rather than adding 
last-child(). 

It turns out that some of the things I find myself trying to do, even just 
to do some kinds of sort methods, involve trying to get not just the first 
or last child, but the first or last two or three elements.  It becomes 
much easier to just give the XForms author a few extra efficient functions 
like next-sibling() and prev-sibling() rather than coming up with an 
increasingly long laundry list of specialized xpath patterns that they 
would have to memorize as the "safe" ones to use... in my implementation. 
You might optimize different expressions that do the same things... 

Anyway, on top of that, I suspect that a lot of really good uses of 
next-sibling() and prev-sibling will come when they are used in 
combination with XPath variables, which means characterizing optimizations 
involving across actions, and therefore across xpath expression 
evaluations.   

For want of five of the simplest functions imaginable, it seems a lot to 
ask XForms authors to hope that all implementers would write static 
analyzers that intuit what they're trying to do and then do it better than 
the native XPath engine is normally capable of.  It seems way better to 
extend the XPath engine with functions that all authors can then count on. 


Best regards, 
John M. Boyer, Ph.D.
Distinguished Engineer, IBM Forms and Smarter Web Applications
IBM Canada Software Lab, Victoria
E-Mail: boyerj@ca.ibm.com 

Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
Blog RSS feed: 
http://www.ibm.com/developerworks/blogs/rss/JohnBoyer?flavor=rssdw





From:        Nick Van den Bleeken <Nick.Van.den.Bleeken@inventivegroup.com
> 
To:        John Boyer/CanWest/IBM@IBMCA 
Cc:        "<public-forms@w3.org>" <public-forms@w3.org> 
Date:        09/29/2011 11:33 AM 
Subject:        Re: Proposed note for insert and new functions for XForms 
2.0 
Sent by:        public-forms-request@w3.org 



John, 

I included some remarks/questions inline. 

Kind regards,

Nick Van den Bleeken
R&D Manager

Phone: +32 3 821 01 70
Office fax: +32 3 821 01 71
nick.van.den.bleeken@inventivegroup.com
www.inventivedesigners.com


<Mail Attachment.png><Mail Attachment.png><Mail Attachment.png> 

On 28 Sep 2011, at 23:51, John Boyer wrote: 
RECOMMENDATION: Add a function with signature along the lines of "nodeset 
first-child(nodeset?, boolean?)" that returns the first child node of a 
given node, or the context node if no first parameter.  The function 
SHOULD achieve O(1) on the operation.  By default ignore whitespace text 
nodes, but if second parameter given and true, then include them in 
result. 

With this function implemented efficiently (and the efficient insert 
above), N stack operations will take O(N) processing time, and without it, 
the result is O(N^2). 
Why couldn't you just rewrite foo[1] to ibm:first-child(foo) during the 
static optimization phase when parsing the XPath expression (with 
ibm:first-child your proprietary extension function)? I think/hope that 
your XPath engine already does a lot other static optimizations when 
'compiling/analizing' the XPath expression. 

Efficiently implementing queue-based algorithms requires the ability to 
push on one side of a list and pop from the other.   
Given the ability described above to efficiently insert at the beginning 
of the list, it seems we only need a way to efficiently identify the last 
child of a node in constant time. 

RECOMMENDATION: Add a function with signature along the lines of "nodeset 
last-child(nodeset?, boolean?)" that returns the last child node of a 
given node, or the context node if no first parameter.  The function 
SHOULD achieve O(1) on the operation. By default ignore whitespace text 
nodes, but if second parameter given and true, then include them in 
result. 

Incidentally, the last-child() function also allows efficient insertion at 
the end of a list of children of a node by using it in the insert nodeset. 

This is helpful in creating an efficient sort that constructs its result 
in least to greatest order. 
The efficient first-child() and last-child() functions also appear to 
enable optimal O(N log N) sorting via XForms action loops, when used in 
combination with the iterate attribute.  Without having at least 
first-child(), the best sort performance appears to be O(N log N log log 
N). 

A number of more useful data structures become possible to implement if it 
is also possible to obtain the parent, next sibling, and previous sibling 
of a node in constant time.  These functions would prove convenient in 
implementing some sorting methods in the most natural ways, but efficient 
versions of several comparison-based data structures (e.g. various 
priority queues, trees and graphs) appear to become possible as well. 
Same remark/question why don't do this during the static optimization 
phase when parsing the XPath expression. 


RECOMMENDATION: Add functions with signatures along the lines of 
"nodeset parent(nodeset?)" returns the parent of a given node, or the 
context node if no first parameter. 
"nodeset next-sibling(nodeset?, boolean?)" returns the next sibling of a 
given node, or the context node if no first parameter. 
"nodeset prev-sibling(nodeset?, boolean?)" returns the previous sibling of 
a given node, or the context node if no first parameter. 

These functions SHOULD achieve O(1) on the operation. For latter two 
functions, ignore whitespace text nodes by default, but if second 
parameter given and true, then include them in result. 

For all of the functions, clearly if the first, last, next, previous or 
parent node doesn't exist, then an empty nodeset is returned. 

Finally, the eval() function is already on the list for XForms 2.0 with 
the use case from Michael Sperberg-McQueen of being able to use a repeat 
to drill into and edit arbitrarily structured XML.  A second use case for 
the function pertains to algorithms like sorting as well as 
comparison-based data structure management.  Given an eval() function, it 
would be possible to allow a form author to provide the comparison 
expression as a string so that the sort or comparison data structure could 
compare strings, numbers, dates or even secondary keys in case of primary 
key equality.  This would help separate the algorithm or data structure 
from the application details. 
Same remark/question why don't do this during the static optimization 
phase when parsing the XPath expression. 


RECOMMENDATION: Add function with signature along lines of "object 
eval(string, nodeset?, number?)" that evaluates an xpath given in the 
string using the evaluation context of the eval() function, or in the 
context of a nodeset and a node at a given position, if those params are 
specified. 

definitely +1 for this one there is already a wiki page for this one 
http://www.w3.org/MarkUp/Forms/wiki/Eval_function 



Inventive Designers' Email Disclaimer:
http://www.inventivedesigners.com/email-disclaimer 

-- 
This message has been scanned for viruses and 
dangerous content by MailScanner, and is 
believed to be clean. 



Inventive Designers' Email Disclaimer:
http://www.inventivedesigners.com/email-disclaimer


picture
(image/png attachment: 01-part)

picture
(image/png attachment: 02-part)

picture
(image/png attachment: 03-part)

Received on Friday, 30 September 2011 17:07:59 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 1 October 2013 22:06:55 UTC