VoiceXML Review by DOM WG

Introduction

Members of the DOM Working Group have read through the Last Call draft of the VoiceXML specification, looking for issues related to the DOM specification.  No member of the group has actively followed the development of VoiceXML, so we read the draft with no prior knowledge of it.

The issues below are the ones we noticed and felt should be documented.

VoiceXML Events as DOM Events

Section 5.2 on event handling claims that "An interpreter may implement VoiceXML event handling using a DOM 2 event processor".  It is difficult to see how this can be true; the following sub-issues illustrate why.

Handler Order

Section 5.2.4 describes the event delivery algorithm as a constrained version of XML Events and DOM 2 event processing in which the catch handlers are explicitly ordered by document order.  DOM 2 does not specify the order in which listeners registered on the same target are invoked, so a handler order tied to document order cannot be relied upon.  This makes it impossible to implement VoiceXML event handling using a normal DOM 2 event processor in any reasonable fashion.
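
The ordering requirement can be illustrated with a small Python model.  This is only a sketch under stated assumptions, not the algorithm from section 5.2.4: the Catch record, its doc_position field, and the select_catch function are hypothetical names, event names are matched exactly, and the count and cond attributes of catch handlers are ignored.  It shows that selection by document order depends on where a handler sits in the document, not on the order in which handlers happen to be registered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Catch:
    """Hypothetical stand-in for a VoiceXML catch handler."""
    event: str         # event name the handler catches
    doc_position: int  # position of the handler in document order
    label: str         # used only to identify the handler in the output


def select_catch(catches: List[Catch], event: str) -> Optional[Catch]:
    """Return the first matching handler in document order, or None."""
    for candidate in sorted(catches, key=lambda c: c.doc_position):
        if candidate.event == event:
            return candidate
    return None


# Handlers listed in an arbitrary order, as a listener registry might hold them.
catches = [
    Catch("nomatch", doc_position=7, label="second-in-document"),
    Catch("noinput", doc_position=5, label="other-event"),
    Catch("nomatch", doc_position=2, label="first-in-document"),
]

chosen = select_catch(catches, "nomatch")
print(chosen.label if chosen else None)  # first-in-document
```

A DOM 2 event processor keeps no such document-position key for its listeners, so an implementation would have to maintain the ordering outside the event processor.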

Canceling on Current Level

Section 5.2.4 also states that an event handler which handles an event stops propagation of the event, and implies that other event handlers declared on the same element will not be called.  DOM event handling can stop delivery to handlers declared on ancestor nodes, but once any handler on a node is called, all handlers registered on that node are called, regardless of any canceling that occurs during delivery.
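
The DOM 2 behavior described above can be modeled in a few lines of Python.  This is a simplified sketch rather than a conforming implementation: the Node, Event, and dispatch names are hypothetical, only the target and bubbling phases are modeled (no capture phase), and listeners on a node are invoked in registration order purely for the sake of a deterministic example, since DOM 2 does not specify that order.

```python
from typing import Callable, Dict, List, Optional


class Event:
    def __init__(self, name: str) -> None:
        self.name = name
        self.propagation_stopped = False

    def stop_propagation(self) -> None:
        self.propagation_stopped = True


Listener = Callable[[Event], None]


class Node:
    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent
        self.listeners: Dict[str, List[Listener]] = {}

    def add_event_listener(self, event_name: str, listener: Listener) -> None:
        self.listeners.setdefault(event_name, []).append(listener)


def dispatch(target: Node, event: Event) -> None:
    """Deliver the event at the target, then bubble it to ancestors."""
    node: Optional[Node] = target
    while node is not None:
        # Every listener on the current node runs, even if one of them
        # stops propagation part-way through.
        for listener in list(node.listeners.get(event.name, [])):
            listener(event)
        # Stopping propagation only prevents delivery to further nodes.
        if event.propagation_stopped:
            break
        node = node.parent


calls: List[str] = []
form = Node("form")
field = Node("field", parent=form)


def first_handler(event: Event) -> None:
    calls.append("field:first")
    event.stop_propagation()


def second_handler(event: Event) -> None:
    calls.append("field:second")


def form_handler(event: Event) -> None:
    calls.append("form")


field.add_event_listener("nomatch", first_handler)
field.add_event_listener("nomatch", second_handler)
form.add_event_listener("nomatch", form_handler)

dispatch(field, Event("nomatch"))
print(calls)  # ['field:first', 'field:second']
```

The second handler on the field still runs after the first one stops propagation; only the handler on the ancestor form is skipped.  VoiceXML's rule, as we read it, would have skipped the second handler as well.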

Interoperable ECMAScript in Compound Documents

We expect VoiceXML to be combined with other markup, such as XHTML, SVG, and SSML, when defining multimodal presentations.  In such cases, ECMAScript should behave consistently and interoperably throughout the document.  We would expect content authors to call functions in the global scope throughout the document, access all parts of the document through the DOM, register event handlers, and so on.

The intertwining of ECMAScript scopes and VoiceXML-based declaration of variables visible to ECMAScript, as described in section 5.1, is unusual.  Implementation issues aside, it seems likely to cause usage problems.  For example, if a script uses the DOM to add an event handler, how does the event handler script get access to the field values it needs to get or set to respond to the event?  If a script tries to access or modify a field value through the DOM, how does that relate to the in-scope variable?