
RE: Multimodal Interaction WG questions for WebApps (especially WebAPI)

From: Deborah Dahl <dahl@conversational-technologies.com>
Date: Fri, 23 Oct 2009 14:17:08 -0400
To: <Ingmar.Kliche@telekom.de>, <Olli.Pettay@helsinki.fi>
Cc: <public-webapps@w3.org>, <w3c-mmi-wg@w3.org>
Message-ID: <03b001ca540d$09667ca0$6801a8c0@chimaera>
Just a quick follow-up about WebSockets -- do you have
any sense of when implementations might start to
be available in browsers?

> -----Original Message-----
> From: w3c-mmi-wg-request@w3.org 
> [mailto:w3c-mmi-wg-request@w3.org] On Behalf Of 
> Ingmar.Kliche@telekom.de
> Sent: Friday, October 23, 2009 10:08 AM
> To: Olli.Pettay@helsinki.fi
> Cc: public-webapps@w3.org; w3c-mmi-wg@w3.org
> Subject: Re: Multimodal Interaction WG questions for WebApps 
> (especially WebAPI)
> 
> Olli,
> 
> thanks for pointing this out. The Multimodal WG has looked into what's
> available on WebSockets, and indeed it seems to be a good candidate to
> be used as a transport mechanism for distributed multimodal
> applications.
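> 
> For example (just a sketch, not a proposal): each MMI life-cycle
> event could travel as one WebSocket message. The URL and the JSON
> payload below are invented for illustration; the real messages would
> be the life-cycle events defined in the MMI Architecture document.
> 
>   // Sketch: a modality component (e.g. a browser-based GUI)
>   // connects to the Interaction Manager and announces a new
>   // interaction context.
>   const socket = new WebSocket("ws://im.example.com/mmi");
>   socket.onopen = () => {
>     socket.send(JSON.stringify({
>       event: "NewContextRequest", // MMI life-cycle event name
>       source: "gui",              // invented identifiers
>       target: "im"
>     }));
>   };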
> 
> -- Ingmar. 
> 
> > -----Original Message-----
> > From: Olli Pettay [mailto:Olli.Pettay@helsinki.fi] 
> > Sent: Thursday, September 24, 2009 10:19 AM
> > To: Deborah Dahl
> > Cc: public-webapps@w3.org; 'Kazuyuki Ashimura'
> > Subject: Re: Multimodal Interaction WG questions for WebApps 
> > (especially WebAPI)
> > 
> > On 9/24/09 4:51 PM, Deborah Dahl wrote:
> > > Hello WebApps WG,
> > >
> > > The Multimodal Interaction Working Group is working on
> > > specifications that will support distributed applications that
> > > include inputs from different modalities, such as speech,
> > > graphics and handwriting. We believe there's some applicability
> > > of specific WebAPI specs such as XMLHttpRequest and Server-Sent
> > > Events to our use cases, and we're hoping to get some
> > > comments/feedback/suggestions from you.
> > >
> > > Here's a brief overview of how Multimodal Interaction and WebAPI
> > > specs might interact.
> > >
> > > The Multimodal Architecture [1] is a loosely coupled architecture
> > > for multimodal user interfaces, which allows for co-resident and
> > > distributed implementations. The aim of this design is to provide
> > > a general and flexible framework providing interoperability among
> > > modality-specific components from different vendors - for
> > > example, speech recognition from one vendor and handwriting
> > > recognition from another. This framework focuses on providing a
> > > general means for allowing these components to communicate with
> > > each other, plus basic infrastructure for application control and
> > > platform services.
> > >
> > > The basic components of an application conforming to the
> > > Multimodal Architecture are (1) a set of components which provide
> > > modality-related services, such as GUI interaction, speech
> > > recognition and handwriting recognition, as well as more
> > > specialized modalities such as biometric input, and (2) an
> > > Interaction Manager which coordinates inputs from different
> > > modalities with the goal of providing a seamless and
> > > well-integrated multimodal user experience. One use case of
> > > particular interest is a distributed one, in which a server-based
> > > Interaction Manager (using, for example, SCXML [2]) controls a
> > > GUI component based on a (mobile or desktop) web browser, along
> > > with a distributed speech recognition component. "Authoring
> > > Applications for the Multimodal Architecture" [3] describes this
> > > type of application in more detail. If, for example, speech
> > > recognition is distributed, the Interaction Manager receives
> > > results from the recognizer and will need to inform the browser
> > > of a spoken user input so that the graphical user interface can
> > > reflect that information. For example, the user might say
> > > "November 2, 2009" and that information would be displayed in a
> > > text field in the browser. However, this requires that the server
> > > be able to send an event to the browser to tell it to update the
> > > display. Current implementations do this by having the browser
> > > poll the server for possible updates on a frequent basis, but we
> > > believe that a better approach would be for the browser to
> > > actually be able to receive events from the server.
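> > >
> > > As a rough sketch of the push approach (assuming the EventSource
> > > interface from the Server-Sent Events draft; the URL and the
> > > payload format are invented for illustration), the browser side
> > > might look like this:
> > >
> > >   // Subscribe to an event stream exposed by the Interaction
> > >   // Manager; the server pushes an event whenever a recognition
> > >   // result arrives.
> > >   const updates = new EventSource("http://im.example.com/events");
> > >   updates.onmessage = (event) => {
> > >     const result = JSON.parse(event.data); // e.g. { field, value }
> > >     const input =
> > >       document.getElementById(result.field) as HTMLInputElement;
> > >     input.value = result.value; // e.g. "November 2, 2009"
> > >   };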
> > >
> > > So our main question is: what mechanisms are or will be available
> > > to support efficient communication among distributed components
> > > (for example, speech recognizers, interaction managers, and web
> > > browsers) that interact to create a multimodal application (hence
> > > our interest in Server-Sent Events and XMLHttpRequest)?
> > 
> > I believe WebSockets could work a lot better than XHR or
> > server-sent events. The IM would be a WebSocket server, and it
> > would have a bi-directional connection to the modality components.
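> > 
> > For instance (a sketch only; the WebSocket API is still being
> > drafted, and the URL and message format here are invented), the
> > browser-based GUI component could do something like:
> > 
> >   // Open a persistent, bi-directional connection to the IM.
> >   const socket = new WebSocket("ws://im.example.com/session");
> > 
> >   // Events pushed by the IM (e.g. recognition results) arrive
> >   // here; no polling is needed.
> >   socket.onmessage = (event) => {
> >     const update = JSON.parse(event.data);
> >     // ...apply the update to the GUI...
> >   };
> > 
> >   // The browser can send its own events (e.g. GUI input) back to
> >   // the IM over the same connection.
> >   socket.onopen = () => {
> >     socket.send(JSON.stringify({ event: "click", target: "ok" }));
> >   };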
> > 
> > -Olli
> > 
> > 
> > 
> > >
> > > [1] MMI Architecture: http://www.w3.org/TR/mmi-arch/
> > > [2] SCXML: http://www.w3.org/TR/scxml/
> > > [3] MMI Example: http://www.w3.org/TR/mmi-auth/
> > >
> > > Regards,
> > >
> > > Debbie Dahl
> > > MMIWG Chair
Received on Friday, 23 October 2009 18:15:53 GMT
