W3C Forms teleconference January 12, 2011

* Present

Nick van den Bleeken, Inventive Designers
Steven Pemberton, CWI/W3C (chair)
Leigh Klotz, Xerox (minutes)
John Boyer, IBM
Uli Lissé, DreamLabs
Erik Bruchez, Orbeon

* Agenda

http://lists.w3.org/Archives/Public/public-forms/2011Jan/0006.html

Previous Minutes
select1 and empty items proposed erratum
XPath 2.0
IRC Minutes
Meeting Ends

* Previous Minutes

* select1 and empty items proposed erratum

http://lists.w3.org/Archives/Public/public-forms/2011Jan/0004.html http://lists.w3.org/Archives/Public/public-forms/2011Jan/0005.html

Steven Pemberton: I agree with John's wording, except for the double full-stop at the end.

Resolution 2011-01-12.1: We accept the select1 and empty items erratum in http://lists.w3.org/Archives/Public/public-forms/2011Jan/0004.html

ACTION-1762 - John Boyer to effect XForms 1.1 erratum http://lists.w3.org/Archives/Public/public-forms/2011Jan/0004.html

* XPath 2.0

http://www.w3.org/MarkUp/Forms/wiki/XPath_2.0

Nick van: The XPath 2.0 work depends on other XForms 1.2 features

@select -- you can have a sequence of atomic values
MIPS -- multiple MIP bindings to the same node

Nick van: We resolved to have xpath-version. I also took the same wording for the xforms version. All specified @xpath-version must be the same.
Nick van: In XForms Function Library, I dropped if() and used more specific return types (e.g. dateTime). We resolved to put the XForms functions in the XForms namespace, unless there's a good reason, for example to implement a backward-compatible mode unprefixed.
John Boyer: Do we have to get rid of if?
Leigh Klotz: What's the difference between if() and choose()?
Nick van: Our if() function always returns a string. Then there's an if construct.
John Boyer: So there's no definition in XForms 1.2? Or only in XPath 1.0?
Nick van: Nothing changes in XPath 1.0.
John Boyer: The table says XForms 1.2 definitions; it should be updated.
Steven Pemberton: So the columns are XPath 1.0 and XPath 2.0.
John Boyer: XPath 2.0 contains a reserved keyword "if" so you can't name a function that. It's a bigger problem than we think. A function in XPath evaluates all its parameters. So when you try to collect references that an XPath expression makes, you get the if() references, but with XPath 2.0, the if statement only takes one path. Later on if something changes the evaluation of the if, and nodes in the else path start to change, without doing a rebuild, you don't detect it.
Nick van: If the first argument doesn't change, then you only touch the second or the third argument. So we know there is a change.
John Boyer: There won't be a new rebuild.
Nick van: I was wondering if we needed to do something special for the for() construct, but I don't think so, because you'll evaluate the complete expression. We could add dependencies to both.
John Boyer: It's challenging to add dependencies to both if you don't run both sides.
Nick van: You could add all three dependencies.
John Boyer: That would be great; I have one of my guys close to sending email to the group at this point. Even in XPath 1.0 we have a version of this problem because the default evaluation for boolean "and" and "or" does short-circuiting. Most implementations of XPath allow you to shut off short-circuiting, and we do that because we want to capture references from both sides of an "and".
Leigh Klotz: What's the semantic problem? You might just re-evaluate when you don't have to.
John Boyer: If you use an if in a calculate, and it's the if function you have no problem, but with the if construct, you do. You go down only the consequent or the else part. We collect references based on that expression. If you don't evaluate the else part, then we don't capture any node references.
Leigh Klotz: Why not?
John Boyer: Because the spec says references are only collected based on the nodes matched during an evaluation, and the else part isn't evaluated. We also think the spec is broken, and that's what the email is about, but we say a very specific way of calculating the dependencies and capturing the list of nodes that are referenced.
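The difference between the if() function and the if construct can be illustrated with a minimal Python sketch. It is not taken from any XForms processor: the Recorder class, the node names flag, a and b, and both evaluation helpers are assumptions made up for illustration. The function form evaluates all three arguments and so records every reference; the construct form records only the branch it takes.

```python
class Recorder:
    """Hypothetical wrapper that records which instance nodes an expression reads."""

    def __init__(self, data):
        self.data = data
        self.refs = set()

    def get(self, name):
        self.refs.add(name)
        return self.data[name]


def if_function(cond, then_value, else_value):
    # An ordinary function: the caller has already evaluated all three arguments.
    return then_value if cond else else_value


def eval_with_function(r):
    # Models if(flag > 0, a, b) as a function call.
    return if_function(r.get("flag") > 0, r.get("a"), r.get("b"))


def eval_with_construct(r):
    # Models "if (flag > 0) then a else b": only one branch is evaluated.
    return r.get("a") if r.get("flag") > 0 else r.get("b")


data = {"flag": 1, "a": 10, "b": 20}

r_function = Recorder(data)
eval_with_function(r_function)

r_construct = Recorder(data)
eval_with_construct(r_construct)

print(sorted(r_function.refs))   # ['a', 'b', 'flag']
print(sorted(r_construct.refs))  # ['a', 'flag']
```

In the second case node b is never recorded, so if flag later drops to 0 and b then changes, a dependency list built from this evaluation would not notice without a rebuild.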
Leigh Klotz: So we say evaluate rather than partial evaluate.
Erik Bruchez: We agreed before that we should relax this.
John Boyer: We did have a telecon about this. We have produced the compelling example now; there's also the boolean short-circuit issue and other issues and I want to separate those out.
Erik Bruchez: So the good news is that it's not an XPath 2.0 problem, but an existing problem. Our implementation doesn't do that because we didn't implement the dependency system. We're implementing one based on static analysis. It's not equivalent and doesn't solve the same type of problems, but it does improve performance.
Erik Bruchez: We would be happy if the spec supported that type of implementation instead of forcing a specific algorithm we have now.
John Boyer: Yep.
Erik Bruchez: So anyway we have this problem which is separate from XPath 2.0. There might be some other construct as well.
John Boyer: The loop. If you do something which changes the loop count you get more node dependencies.
Erik Bruchez: Intersection, if the left-hand is empty. So short-circuiting in general.
John Boyer: The issue is, even with XPath location steps with predicates, an earlier location step can change the later steps. It's hopeless: at some point we're down to the fact that the number of XPath references is dynamic, and that's why we ended up with rebuild. So with if() vs if...then that is the kind of thing that will change.

Nick van: There are a couple of functions with dependency problems.
John Boyer: XPath 2.0 has more dynamic dependency issues, but we've always had them. The most we can do is identify them and tell people when to do a rebuild.
Erik Bruchez: It's a very interesting topic, but separate from the upgrade to XPath 2.0.
Nick van: deep-equal introduces the need for a rebuild.
Erik Bruchez: We have this same general problem in XPath 1.0. It's hard for a grammar to figure out when it's going to work. With the spec implementation, we find manual rebuild/recalculate/revalidate is sometimes necessary, and it's frustrating because you're never sure when. You have an expectation of magic dependencies, but in practice you hit cases where it doesn't work and it's hard to figure out. That's due to the loose nature of the dependency graph, because it's hard to figure out when. I don't know if we have a solution.
Steven Pemberton: I link this problem with the structural dependencies problem; the reason it's surprising for programmers is because we do so many of the constraints so well, that when it doesn't work, you're rather shocked. My personal view is that a future XForms should be able to do this. The constraints are there and visible and it should be possible to work it out without rebuild.
Erik Bruchez: I agree. We chose our approach differently. We have expressions in bind and MIPs and the only thing we care about is reducing the number of XPath re-evaluations. If you re-evaluate everything, it's not a problem, but your implementation can be smarter. You can achieve a lot of that by looking at the XPath expressions statically for value- and structural-changes. That allows automatic rebuild skip. We entirely skip some recalculations and revalidations. If you specify it as a performance thing only, instead of a spreadsheet-like feature, you can reduce the number of surprises.
John Boyer: We moved the recalculation engine to the normative appendix; it says it should work out the same way, because we wanted to have enough advice about how to implement this in at least one way.
Leigh Klotz: I think Erik's talking about dropping the requirement for bind/@calculate other than in document order.
John Boyer: You mean re-run until nothing changes.
Erik Bruchez: No, run them once.
John Boyer: Then you have to get them in the right order.
Leigh Klotz: That's what I said.
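Running each calculate once requires a dependency-respecting order. A minimal Python sketch of that idea follows, assuming Python 3.9+ for graphlib. The bind targets, their dependency sets and the formulas are hypothetical, and the dependency sets are written by hand here rather than collected from XPath evaluation as a processor would do.

```python
from graphlib import TopologicalSorter

# Hypothetical calculate binds: target node -> (nodes it reads, formula).
binds = {
    "total": ({"subtotal", "tax"}, lambda v: v["subtotal"] + v["tax"]),
    "tax": ({"subtotal"}, lambda v: v["subtotal"] // 10),
    "subtotal": ({"price", "qty"}, lambda v: v["price"] * v["qty"]),
}


def run_once_in_order(values, binds):
    """Evaluate every bind exactly once, dependencies first."""
    graph = {target: deps for target, (deps, _) in binds.items()}
    for node in TopologicalSorter(graph).static_order():
        if node in binds:  # plain input nodes such as price have no formula
            values[node] = binds[node][1](values)
    return values


values = run_once_in_order({"price": 5, "qty": 4}, binds)
print(values["subtotal"], values["tax"], values["total"])  # 20 2 22
```

The sort only helps while the dependency sets stay valid. If a value change alters which nodes an expression reads, the graph has to be recomputed first, which is the role rebuild plays.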
Steven Pemberton: By static analysis?
Erik Bruchez: Assume more than one item price; it matters whether you point to the first or second item. Without the data, you don't know how many items there are. You don't know the actual path. Without a schema, you can know even less than if you have data. With static analysis, you can do less than you can do with the current algorithm, but you can do more in a way because you can look at every expression and determine which instances are touched and which elements are touched. You can determine when no rebuild is necessary.
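A rough Python sketch of the kind of static check described here: look at each expression's text, list the element names it mentions, and skip work when a change touches none of them. This is a deliberately crude illustration, not Orbeon's implementation. The regular expressions, the helper names and the sample binds are all assumptions, and the name extraction ignores namespaces, wildcards and many other XPath features.

```python
import re

# XPath 1.0 operator names that look like element names.
XPATH_OPERATORS = {"and", "or", "div", "mod"}


def element_names(xpath):
    """Crudely list the node names an XPath 1.0 expression mentions."""
    stripped = re.sub(r"'[^']*'|\"[^\"]*\"", "", xpath)  # drop string literals
    names = set()
    for m in re.finditer(r"[A-Za-z_][\w.-]*", stripped):
        rest = stripped[m.end():].lstrip()
        if rest.startswith("(") or rest.startswith("::"):
            continue  # function name or axis name, not a node name
        if m.group() in XPATH_OPERATORS:
            continue
        names.add(m.group())
    return names


def binds_affected(binds, changed_element):
    """Return the bind targets whose expression mentions the changed element."""
    return [t for t, xpath in binds.items() if changed_element in element_names(xpath)]


# Hypothetical binds: target -> calculate expression.
binds = {
    "total": "sum(item/price) * ../taxrate",
    "label": "concat(firstname, ' ', lastname)",
}

print(sorted(element_names(binds["total"])))  # ['item', 'price', 'taxrate']
print(binds_affected(binds, "price"))         # ['total']
print(binds_affected(binds, "phone"))         # []
```

As noted in the discussion, this kind of analysis cannot say which particular item or price node is read, only that some node with that name might be; it is useful for ruling work out, not for ordering calculations.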
John Boyer: Why would you need a rebuild if you don't need the dependency graph?
Erik Bruchez: For the binds, we have a runtime tree structure that points to instance nodes, similar to the control tree. We can't rebuild that incrementally. So when there is a structural change we rebuild the nodesets from the XPath expressions.
Erik Bruchez: I was talking about something different, not an alternative version of the spec algorithm. I'm talking about something that doesn't achieve as much in theory as the current algorithm. It can't determine order of evaluation. ... It can determine more about performance. The spec says you should behave as if, so this idea is in no-man's land.
John Boyer: The spec says that the order doesn't matter.
Leigh Klotz: Only for calculate.
John Boyer: Yes. If we expose MIP values though then it matters generally.
Erik Bruchez: It's a very strong requirement. It's good if you have that to implement. Here it seems like we're specifying something that causes surprises to authors. The issues with if...then shows we don't quite cover it and surprises still occur and require manual intervention.
John Boyer: Because of dynamic changes, the dependency graph becomes stale. It might be particularly surprising that the recalculate makes the dependency graph stale again. In order to fully make declarative expressions work, where you aren't dependent on the order of the binds, the XForms processor has to detect and run indeterminate number of rebuilds. For us, it's not that different than the brute force algorithm (run all binds until quiescence).
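The brute-force approach (run all binds until quiescence) can be sketched in a few lines of Python. The bind list, node names and formulas are hypothetical, and the pass limit is an arbitrary safeguard assumed for this sketch rather than anything the specification defines.

```python
def recalculate_until_quiescent(values, binds, max_passes=20):
    """Re-run every bind in document order until no value changes.

    binds is a list of (target, formula) pairs; returns the number of passes used.
    """
    for passes in range(1, max_passes + 1):
        changed = False
        for target, formula in binds:
            new_value = formula(values)
            if values.get(target) != new_value:
                values[target] = new_value
                changed = True
        if not changed:
            return passes
    raise RuntimeError("no quiescence reached; the binds may be circular")


# Binds deliberately listed in the "wrong" order: total before tax before subtotal.
binds = [
    ("total", lambda v: v["subtotal"] + v["tax"]),
    ("tax", lambda v: v["subtotal"] // 10),
    ("subtotal", lambda v: v["price"] * v["qty"]),
]
values = {"price": 5, "qty": 4, "subtotal": 0, "tax": 0, "total": 0}

passes = recalculate_until_quiescent(values, binds)
print(values["total"], passes)  # 22 4
```

No dependency graph is needed, so nothing can become stale, but the worst-case order costs one extra pass per level of dependency plus a final pass to confirm that nothing changed.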
Erik Bruchez: That's a way. Everybody has a slightly different perspective. We almost never needed the spreadsheet-like dependency. If you take the famous PO example, you can run everything in order.
John Boyer: That size of form isn't a problem. Most of our forms have a lot more binds in them.
Erik Bruchez: How often does the order matter?
John Boyer: I can't actually say but I do know that our forms tend to model processes that have already been done with spreadsheets, and by the time you get to a spreadsheet-size it matters.
Steven Pemberton: I recall Leigh saying he had a form with 10,000 binds and I'm sure the order matters there.
Erik Bruchez: There are few cases where we have calculations. readonly is a leaf node; nobody at this point depends on it. (We have a function, but it doesn't affect the dependencies so you have to use it appropriately.) As long as you do it in a reasonable order, recalculate, relevant, required, readonly, pretty much everything turns out fine.
John Boyer: So, a lot of your forms don't have surprising rebuild issues anyway. We are hitting situations where people are surprised by the need to do a rebuild, because that means order matters. If there's a way to sort it out in advance, that's what the rebuild does. The problem we hit is what happens when the order is wrong.
Erik Bruchez: I'm not saying you can always do that, but in our scenarios it's not a problem, at least not today. We haven't encountered it yet. On the other hand, the issue that if you want to achieve a system with proper ordering, you hit some technical issues (e.g. if...then), it sounds hard to implement, unless we're able to tell the user that we have an automatic mode with expression limitations. Can an implementor do anything they want?
John Boyer: The problem with the spec right now is that we try to say that people can implement alternatives, but at the F2F, we went overboard and tried to define how references are collected in the non-appendix. I think the reference collection mechanism is useful for UI bindings, but creates a scenario with false circular references in the model bindings and we've hit that.
Steven Pemberton: My feeling is that we haven't learnt our own lessons about the value of declarative specification. We've defined certain things not by their results, but by the method used to achieve the results.
John Boyer: I quite agree; the challenge though is to decide how much to say. We had interoperability problems. At the F2F we had a 45%-55% split (and I was on the losing side) where we changed the reference calculation. I couldn't think of the counterexample but now we have it. It's now water under the bridge. We'll explain the problem and the alternative. Steven points out we need to find a way to get rid of rebuild() entirely.
Steven Pemberton: Exactly
John Boyer: Whatever that means. But that can be indeterminate and take a long time. It should be possible to detect when a set of references changed during a refresh. The XForms processor could detect and do a rebuild, then a recalculate, then a rebuild, but not infinite.
Leigh Klotz: Are you sure it can't be infinite?
John Boyer: I think it can't be infinite if you don't have circular references.
Erik Bruchez: ...
John Boyer: We could say to do it at most 10 times. Or 20. If you're more than that deep in logic then it's almost certainly circular. Perhaps we throw an exception.
Steven Pemberton: That sounds a bit kludgey. In our system, we discovered not circularity but math precision errors; you'd get ping-ponging between two real values.

Erik Bruchez: I'd like to get back to the "how" and the "what". If the "what" is what it is now, which is "determine order and detect rebuilds automagically," I feel that is too ambitious. If there is an algorithm that works, we may fall into the too-specific trap. Given that there's a lot of history and forms benefit from determining order...if we drop that requirement, things become simpler. You can then specify the "what" very simply. First, just do brute force re-evaluation; then you can make things go faster with optimizations (static analysis, runtime analysis, etc., and you don't need an algorithm). If you keep the ordering part, I don't know if you can specify something that is a "what" that has multiple (or even any) ways to implement.
John Boyer: You push the form work off to the form author. We say we can't figure out automatically what order to put them in, but the only places where we can't figure it out are the places where it's dynamic and the form author can't put them in a static order anyway.

Steven Pemberton: This has been a useful discussion. There's grist for the mill. I liked that we've found some direction and have details. We've identified issues and possible ways to solve them.

Nick van: I won't be here next week.

* IRC Minutes

missing due to snow