Re: Proposed note for insert and new functions for XForms 2.0 from Nick Van den Bleeken on 2011-09-30 (public-forms@w3.org from September 2011)

From: Nick Van den Bleeken <Nick.Van.den.Bleeken@inventivegroup.com>
Date: Fri, 30 Sep 2011 12:56:52 +0000
To: John Boyer <boyerj@ca.ibm.com>
CC: "<public-forms@w3.org>" <public-forms@w3.org>
Message-ID: <DAFB3819-16A2-4D4F-B994-5AD9417A176A@inventivegroup.com>
John,

I might be a bit biased because I use XPath on a day-to-day basis and really like it ;)

But I'm not really in favor of adding functions that get the evaluation speed you would expect (and love) and their dual native XPath expressions that behave poorly. If you are concerned that implementations will behave not optimal for foo[1], foo[last()], next-sibling::*[1] and previous-sibling::*[1] we could add language to the spec saying that an implementation 'should' evaluate expressions like those in a constant time. On the other hand I know that for example Saxon does a lot effort rewriting/simplifying XPath expressions when staticly analyzing the XPath expression that go way beyond the simple optimizations that you are requesting.

I won't like it that you have to know all 'special' XForms extension functions, and know when to use them, to get optimal performance. One of the nice things of XForms is, that  the author doesn't have to worry for performance, because the implementation optimizes the form (I have to admit that there are implementations that do a better job in optimizing forms then others, but due to a healthy competition others will follow automatically).

Kind regards,

Nick Van den Bleeken
R&D Manager

Phone: +32 3 821 01 70
Office fax: +32 3 821 01 71
nick.van.den.bleeken@inventivegroup.com<mailto:nick.van.den.bleeken@inventivegroup.com>
www.inventivedesigners.com


[cid:image001.png@01CBF2F8.1DA19110][cid:image002.png@01CBF2F8.1DA19110][cid:image003.png@01CBF2F8.1DA19110]

On 29 Sep 2011, at 22:01, John Boyer wrote:

Hi Nick,

It is better to think about the problem from the viewpoint of the XForms author, not the XForms implementer.

W3C usually doesn't discuss performance that much; it was once suggested to me that they view it as a market differentiator for implementers, but that's a frustrating approach for the consumer of the standards-based technology.

>From the consumer's viewpoint, I'd like features in the language that I can directly express that are *guaranteed* to produce consistent results across implementations, where consistent means not just information results but also user experience.

As an XForms author I can't rely on all implementers to make certain optimizations that are not called out in the specification.  I'd like to have language features that I can rely on, or at the very least that I SHOULD be able to rely on (in the RFC 2119 sense).

If foo[1] is a constant time operation in my favorite processor, and a linear time operation in a number of other processors, then I might implement a feature like table sorting that works almost instantly in one processor but takes 5 seconds to complete in another.  The results of both processors are informationally the same (which is what the W3C almost exclusively focuses on), but they are not consistent from a user experience standpoint.

On a more technical level, it seemed like you were OK with adding last-child(), as you didn't say "same question" on that one. Well, it turns out that I justified first-child() before last-child() because it seemed like the right order and made sense for talking about stacks, but the one I really need most for sorting is last-child(), and if you're up for that, then I ought to be able to get first-child() just out of a need for language consistency. So let me assume that you're also suggesting that my implementation should also optimize foo[last()] rather than adding last-child().

It turns out that some of the things I find myself trying to do, even just to do some kinds of sort methods, involve trying to get not just the first or last child, but the first or last two or three elements.  It becomes much easier to just give the XForms author a few extra efficient functions like next-sibling() and prev-sibling() rather than coming up with an increasingly long laundry list of specialized xpath patterns that they would have to memorize as the "safe" ones to use... in my implementation.  You might optimize different expressions that do the same things...

Anyway, on top of that, I suspect that a lot of really good uses of next-sibling() and prev-sibling will come when they are used in combination with XPath variables, which means characterizing optimizations involving across actions, and therefore across xpath expression evaluations.

For want of five of the simplest functions imaginable, it seems a lot to ask XForms authors to hope that all implementers would write static analyzers that intuit what they're trying to do and then do it better than the native XPath engine is normally capable of.  It seems way better to extend the XPath engine with functions that all authors can then count on.

Best regards,
John M. Boyer, Ph.D.
Distinguished Engineer, IBM Forms and Smarter Web Applications
IBM Canada Software Lab, Victoria
E-Mail: boyerj@ca.ibm.com<mailto:boyerj@ca.ibm.com>

Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
Blog RSS feed: http://www.ibm.com/developerworks/blogs/rss/JohnBoyer?flavor=rssdw





From:        Nick Van den Bleeken <Nick.Van.den.Bleeken@inventivegroup.com<mailto:Nick.Van.den.Bleeken@inventivegroup.com>>
To:        John Boyer/CanWest/IBM@IBMCA
Cc:        "<public-forms@w3.org<mailto:public-forms@w3.org>>" <public-forms@w3.org<mailto:public-forms@w3.org>>
Date:        09/29/2011 11:33 AM
Subject:        Re: Proposed note for insert and new functions for XForms 2.0
Sent by:        public-forms-request@w3.org<mailto:public-forms-request@w3.org>
________________________________



John,

I included some remarks/questions inline.

Kind regards,

Nick Van den Bleeken
R&D Manager

Phone: +32 3 821 01 70
Office fax: +32 3 821 01 71
nick.van.den.bleeken@inventivegroup.com<mailto:nick.van.den.bleeken@inventivegroup.com>
www.inventivedesigners.com


<Mail Attachment.png><Mail Attachment.png><Mail Attachment.png>

On 28 Sep 2011, at 23:51, John Boyer wrote:
RECOMMENDATION: Add a function with signature along the lines of "nodeset first-child(nodeset?, boolean?)" that returns the first child node of a given node, or the context node if no first parameter.  The function SHOULD achieve O(1) on the operation.  By default ignore whitespace text nodes, but if second parameter given and true, then include them in result.

With this function implemented efficiently (and the efficient insert above), N stack operations will take O(N) processing time, and without it, the result is O(N^2).
Why couldn't you just rewrite foo[1] to ibm:first-child(foo) during the static optimization phase when parsing the XPath expression (with ibm:first-child your proprietary extension function)? I think/hope that your XPath engine already does a lot other static optimizations when 'compiling/analizing' the XPath expression.

Efficiently implementing queue-based algorithms requires the ability to push on one side of a list and pop from the other.
Given the ability described above to efficiently insert at the beginning of the list, it seems we only need a way to efficiently identify the last child of a node in constant time.

RECOMMENDATION: Add a function with signature along the lines of "nodeset last-child(nodeset?, boolean?)" that returns the last child node of a given node, or the context node if no first parameter.  The function SHOULD achieve O(1) on the operation. By default ignore whitespace text nodes, but if second parameter given and true, then include them in result.

Incidentally, the last-child() function also allows efficient insertion at the end of a list of children of a node by using it in the insert nodeset.
This is helpful in creating an efficient sort that constructs its result in least to greatest order.
The efficient first-child() and last-child() functions also appear to enable optimal O(N log N) sorting via XForms action loops, when used in combination with the iterate attribute.  Without having at least first-child(), the best sort performance appears to be O(N log N log log N).

A number of more useful data structures become possible to implement if it is also possible to obtain the parent, next sibling, and previous sibling of a node in constant time.  These functions would prove convenient in implementing some sorting methods in the most natural ways, but efficient versions of several comparison-based data structures (e.g. various priority queues, trees and graphs) appear to become possible as well.
Same remark/question why don't do this during the static optimization phase when parsing the XPath expression.


RECOMMENDATION: Add functions with signatures along the lines of
"nodeset parent(nodeset?)" returns the parent of a given node, or the context node if no first parameter.
"nodeset next-sibling(nodeset?, boolean?)" returns the next sibling of a given node, or the context node if no first parameter.
"nodeset prev-sibling(nodeset?, boolean?)" returns the previous sibling of a given node, or the context node if no first parameter.

These functions SHOULD achieve O(1) on the operation. For latter two functions, ignore whitespace text nodes by default, but if second parameter given and true, then include them in result.

For all of the functions, clearly if the first, last, next, previous or parent node doesn't exist, then an empty nodeset is returned.

Finally, the eval() function is already on the list for XForms 2.0 with the use case from Michael Sperberg-McQueen of being able to use a repeat to drill into and edit arbitrarily structured XML.  A second use case for the function pertains to algorithms like sorting as well as comparison-based data structure management.  Given an eval() function, it would be possible to allow a form author to provide the comparison expression as a string so that the sort or comparison data structure could compare strings, numbers, dates or even secondary keys in case of primary key equality.  This would help separate the algorithm or data structure from the application details.
Same remark/question why don't do this during the static optimization phase when parsing the XPath expression.


RECOMMENDATION: Add function with signature along lines of "object eval(string, nodeset?, number?)" that evaluates an xpath given in the string using the evaluation context of the eval() function, or in the context of a nodeset and a node at a given position, if those params are specified.

definitely +1 for this one there is already a wiki page for this one http://www.w3.org/MarkUp/Forms/wiki/Eval_function


________________________________

Inventive Designers' Email Disclaimer:
http://www.inventivedesigners.com/email-disclaimer

--
This message has been scanned for viruses and
dangerous content by MailScanner<http://www.mailscanner.info/>, and is
believed to be clean.


________________________________

Inventive Designers' Email Disclaimer:
http://www.inventivedesigners.com/email-disclaimer
Attachments

image/png attachment: image001.png
image/png attachment: image002.png
image/png attachment: image003.png
Received on Friday, 30 September 2011 12:57:19 UTC