- From: Sebastian Feuerstack <Sebastian@Feuerstack.org>
- Date: Tue, 28 Aug 2012 10:51:53 -0300
- To: www-multimodal@w3.org
Hey there,

With the announcement of the W3C Proposed Recommendation status of the MMI architecture I became aware of the recent changes to the specification, which I would like to comment on. First of all, I would like to express my appreciation for the progress that has been made. Every step towards standardizing interchangeable components of a multimodal system will help to tackle the complexity of current and future multimodal architectures and will hopefully ease further advancements.

My background is in research on model-based development of multimodal systems, and more specifically in using gestures, hand poses, and body movements to control web applications. My current research project is a web platform to design and run multimodal interfaces - http://www.multi-access.de - and earlier I was involved in the MASP/Sercho project (http://masp.dai-labor.de).

After reading through the document I would like to propose rethinking its overall structure, and specifically the overview section: it is really hard to capture the overall focus of the specification. I think it would help if the document started with an introduction that is clearer about the general idea and content of the spec: the definition of component lifecycles and their coordination through a set of standardized events. An abstract sequence diagram (similar to the ones at the end) would also improve understanding at the beginning of the document. The link to the mentioned Galaxy architecture seems to be broken.

Regarding the lifecycle events/protocol, I had trouble figuring out whether any assumptions are made about the interaction-relevant data of the modality components that needs to be fused, processed, and thereafter distributed to the media. While reading the spec I had the feeling that the modalities considered are all of a discrete nature - but what about direct manipulation and pointing, multi-touch, or continuous gestures?
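To make the discrete-vs-continuous question concrete, here is a minimal, purely illustrative Python sketch (not taken from the spec) of what a modality component might have to do to press a continuous pointing stream into discrete MMI-style ExtensionNotification events. Only the event name ExtensionNotification comes from the spec; the helper `batch_pointer_samples`, the field names, and the payload layout are my own assumptions:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ExtensionNotification:
    """Simplified stand-in for an MMI ExtensionNotification event."""
    context: str                       # the context/session the event belongs to
    request_id: str
    data: List[Tuple[float, float]]    # batched (x, y) pointer samples

def batch_pointer_samples(context: str,
                          samples: List[Tuple[float, float]],
                          batch_size: int = 3) -> List[ExtensionNotification]:
    """Group a continuous sample stream into discrete notification events."""
    events = []
    for n, i in enumerate(range(0, len(samples), batch_size), start=1):
        events.append(ExtensionNotification(
            context=context,
            request_id=f"req-{n}",     # hypothetical request-ID scheme
            data=samples[i:i + batch_size]))
    return events

# A seven-sample drag gesture collapses into three discrete events:
stream = [(0.1, 0.1), (0.2, 0.15), (0.3, 0.2), (0.4, 0.3),
          (0.5, 0.4), (0.6, 0.45), (0.7, 0.5)]
events = batch_pointer_samples("ctx-1", stream)
```

Whether such batching is the intended pattern, or whether the IM is instead supposed to negotiate a separate media stream between the components, is exactly what I could not determine from the document.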
Can streams between components be established and managed by the IM, or is this out of scope for this spec?

Regarding the following terms and references I have some specific remarks:

MVC - I neither agree that this is a "recent" approach (it was suggested in 1979 for Smalltalk) nor that MVC is really related to this spec. MVC is often referred to because of its "separation of concerns", but it also defines dependencies between its components that do not match the proposed architecture and that are often misunderstood (see e.g. Martin Fowler). In practice an MVC-based system architecture easily "fragments" into a whole bunch of interrelated MVC triples that are complex to maintain. MVC also strictly separates input from output, which has been identified as a problem for multimodal systems. Maybe it is worth taking a look at the ideas of Presentation-Abstraction-Control (PAC) by Joëlle Coutaz et al.; PAC and PAC-Amodeus implement ideas like the "russian doll" and the "nested IM".

Context - As an "outsider" I have not followed the discussion, but isn't what is being specified really a "session" that can be joined by users and transferred between modalities? The word "context" has been stressed a lot (at least in science), and in my opinion it complicates understanding this spec, since multimodal systems already rely on concepts such as the "context-of-use" (Gaëlle Calvary et al.), which could be confusing.

Transport Protocols - I understood that one basic idea of the specification is to remain abstract and adaptable to different transport systems, but I have not understood why HTTP request/response has been chosen (or proposed?) as one suitable solution. Wouldn't it be easier to use a stateful protocol for such an approach? Otherwise each participating component has to manage and recover the communication state internally. Why are SIP, XMPP, or to some extent WebSockets not mentioned?
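To illustrate the state-management point: with a stateless request/response transport such as HTTP, every component ends up keeping a per-context table like the one in the Python sketch below and recovering its lifecycle state on each incoming request, whereas a stateful connection (e.g. a WebSocket) could keep that state implicit in the session. This is only a sketch under my own assumptions - the state machine is heavily simplified, and only the event names PrepareRequest, StartRequest, and CancelRequest come from the spec:

```python
class ModalityComponent:
    """Bookkeeping a stateless transport forces onto each component."""

    def __init__(self):
        # context ID -> lifecycle state; must be rebuilt after a restart
        self._contexts = {}

    def handle_request(self, context_id: str, event: str) -> str:
        # 1. Recover the state for this context (nothing is carried by
        #    the transport itself, so an unknown context starts as idle).
        state = self._contexts.get(context_id, "idle")
        # 2. Apply a (heavily simplified) lifecycle transition.
        if event == "PrepareRequest" and state == "idle":
            state = "prepared"
        elif event == "StartRequest" and state in ("idle", "prepared"):
            state = "running"
        elif event == "CancelRequest":
            state = "idle"
        # 3. Persist the state again for the next stateless request.
        self._contexts[context_id] = state
        return state

mc = ModalityComponent()
mc.handle_request("ctx-1", "PrepareRequest")
state = mc.handle_request("ctx-1", "StartRequest")
```

With a stateful protocol, steps 1 and 3 would disappear from every component, which is why I would have expected SIP, XMPP, or WebSockets to be discussed as at least equally suitable candidates.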
Kind regards,
Sebastian

--
Sebastian Feuerstack
Department of Computer Science
Federal University of Sao Carlos - Brazil
http://www.feuerstack.org

Check out MINT 2010 - the Multimodal INTeraction Framework
http://www.multi-access.de
Received on Tuesday, 28 August 2012 13:52:24 UTC