- From: Dan Burnett <dburnett@voxeo.com>
- Date: Thu, 15 Sep 2011 13:46:35 -0400
- To: public-xg-htmlspeech@w3.org
Group,
The minutes from today's call are available at http://www.w3.org/2011/09/15-htmlspeech-minutes.html.
For convenience, a text version is embedded below.
Thanks to Glen Shires (again) for taking the minutes.
-- dan
**********************************************************************************
           HTML Speech Incubator Group Teleconference
15 Sep 2011
   [2]Agenda
      [2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0025.html
   See also: [3]IRC log
      [3] http://www.w3.org/2011/09/15-htmlspeech-irc
Attendees
   Present
          Dan_Burnett, Michael_Bodell, Olli_Pettay, Debbie_Dahl,
          Bjorn_Bringert, Satish_Sampath, Robert_Brown,
          Michael_Johnston
   Regrets
   Chair
          Michael_Bodell
   Scribe
          Satish_Sampath, Glen_Shires
Contents
     * [4]Topics
         1. [5]Section 7
         2. [6]Media Stream Input
         3. [7]new meeting
     * [8]Summary of Action Items
     _________________________________________________________
   <mbodell>
   [10]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0026.html
     [10] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0026.html
   Discussing draft
   [11]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html
     [11] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html
   Robert: in writing code, found using constructors wherever possible
   simplifies it
   ... see code in
   [12]http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0026.html
   ... service query object, etc is overblown
   ... if remote service, then need to create a service object
   ... to pass into constructor of SpeechInput request
   ... (read the email for better description)
     [12] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0026.html
   bringert: typically just want to use default ... or if want specific
   speech service
   ... in either case, fundamentally same API
   ... two cases: default service or specific service. In either case,
   ask for a feature and it either succeeds or fails.
   burn: security/authentication is easier in this model
   ... not required for default
   Olli: how to handle permissions (old api had error callback)
   bringert: not getting permission is just another type of failure
   ... (permissions from user)
   Robert: rather than overload constructor, use a 2-stage approach:
   constructor + open function
   ... in open function: use this URI, use default, etc
   ... async callback "I'm ready"
   ... on SpeechInputRequest
   bringert: create, open, start/initialize (3 steps)
   Robert: can stack up language, grammar
   bringert: create, open/initialize, start (3 steps)
   ... setup for remote servers
   satish: start to set qualities
   ... open callback indicates success/fail
   <mbodell> I'm not sure createFoo is that unstandard. Check out
   [13]https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#AudioContext-section
   for an example of an object with a lot of createFoos
     [13] https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#AudioContext-section
   satish: whether supports capabilities
   mbodell: SpeechService
   bringert: new SpeechService, SpeechService.Open
   Robert: old days: classes to create objects, had twice as many
   classes as needed
   ??1: what params in 3 steps?
   bringert: create is constructor (no params or URL)
   satish: or SpeechServiceObject
   Robert: now I'm thinking a flat model, one object
   bringert: open has success and error callbacks
   ... attributes, URI
   ... create, initialize, start - no arguments, only attributes
   satish: same or different handlers?
   ... if tied to function call, should be params. if lifetime of
   object, then attributes
   <bringert> new SpeechInputRequest(), .init(success, error),
   .start(success, error)
   bringert: args should be: create(void), init(successcallback,
   errorcallback), start(successCallback, errorCallback)
   ... set all params at once and check once if works
   Olli: as Robert said, simplify common use case
   Olli: can we re-use a SpeechInputRequest object for a different
   language?
   bringert: should be .setlanguage then .init()
   Olli: so no longer binding SpeechInputRequest to ??? -- permission
   handling may be more ugly
   Robert: permissions problem hasn't yet been solved in doc
   ... oh that permission, retract
   mbodell: reco should have success/error callbacks - exact semantics
   of .init/.start and what can change between them not clear yet
   ... old model: before .init, SpeechService completes before set
   individual properties
   bringert: we don't need .init, just let .start do it, and have
   .willItWork()
   satish: .start creates connection
   mbodell: I think satish suggests start/reco and then state-change
   callback
   bringert: use case: can I show mic button - based on .willItWork()
   mbodell: does .willItWork consider user permissions?
   ... don't want to prompt user to ask
   bringert: yes, .willItWork() could contact server (and not send
   data) and check if it technically will work (independent of user
   permissions)
   satish: service is third-party. Can browser inquire "do you
   support?" Maybe if don't identify domain.
   bringert: don't need .init, just .start and .canStart
   ... .init is not mandatory (because .start can call .init)
   satish: .init is still there, developer can call it, but not
   required to
   bringert: so fewer callbacks need to be implemented
   <bringert> new SpeechInputRequest(), .init(), .start() (calls init()
   if not called already)
   <bringert> callbacks are attributes (event handlers) on
   SpeechInputRequest
   <mbodell> and some results and errors are DOM events (which may have
   onevent handlers)
   bringert: .start() may call .init()
   burn: simplifies what the developer is required to do
   Robert: someone should volunteer to write sample code and then IDL
   mbodell: about to ask for a volunteer
   ... at least to level of other proposals
   satish: I volunteer
   Robert: first a code sample, then IDL
   mbodell: so this changes sections 3 and 4
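
   A minimal sketch of the flow discussed above, offered as one reading
   of the minutes and not as the group's IDL. It assumes callbacks are
   attributes on SpeechInputRequest, that .init() is optional, and that
   .start() calls .init() if it has not been called yet. The
   SpeechServiceStub interface, the attribute names (onready, onstart,
   onerror, serviceURI, language) and the constructor argument are
   hypothetical; the stub exists only so the sketch runs without a
   browser, and callbacks fire synchronously here although a real
   implementation would be asynchronous.

```typescript
// Hypothetical sketch: names follow the discussion, not any published IDL.
type Handler = () => void;
type ErrorHandler = (reason: string) => void;

// Stand-in for a speech service connection (assumed for illustration).
interface SpeechServiceStub {
  // Returns true if the service accepted the setup and the user consented.
  connect(uri: string | null): boolean;
}

class SpeechInputRequest {
  // Callbacks are attributes (event handlers), not method arguments.
  onready: Handler | null = null;
  onstart: Handler | null = null;
  onerror: ErrorHandler | null = null;

  // Parameters are attributes too; a null URI means "use the default service".
  serviceURI: string | null = null;
  language: string = "en-US";

  private initialized = false;
  private service: SpeechServiceStub;

  constructor(service: SpeechServiceStub) {
    this.service = service;
  }

  // Optional step: set up the service ahead of time.
  init(): boolean {
    if (!this.service.connect(this.serviceURI)) {
      // Missing permission is treated as just another kind of failure.
      this.onerror?.("service unavailable or permission denied");
      return false;
    }
    this.initialized = true;
    this.onready?.();
    return true;
  }

  // start() calls init() itself if the developer has not done so.
  start(): void {
    if (!this.initialized && !this.init()) return;
    this.onstart?.();
  }
}
```

   With this shape the common case needs only the constructor, one or
   two handlers, and .start(); a developer who cares about latency can
   still call .init() early.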
Section 7
   satish: attributes for grammars
   ddahl: what if URI changes after set grammar object
   ??3: service may load at call time, so if changes in meantime,
   service may not load until next invocation
   ddahl: what if content of URI changes after set grammar object
   bringert: URI may point to a dynamic resource, so for every reco
   request, the service should check to see if changed
   mbodell: setting it doesn't freeze the content. service does HTTP
   for fetching/caching/etc
   satish: service may cache when setGrammar
   burn: service is responsible for fetching/caching -- not the browser
   bringert: use semantics of URI to fetch and cache (for example, HTTP
   has semantics)
   burn: author may have a way to update, but not part of browser's job
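
   The fetching/caching point can be illustrated with a small sketch of
   the service side, assuming the grammar URI is served over HTTP with
   ETag validation (If-None-Match, 304 Not Modified). The GrammarCache
   class, the GrammarResponse shape and the ConditionalGet transport
   are hypothetical names introduced here; nothing in the minutes
   prescribes them.

```typescript
// Hypothetical names throughout; this models the service, not the browser.
interface GrammarResponse {
  status: 200 | 304; // 304 means "not modified since the ETag you sent"
  etag: string;
  body?: string;     // present only on a 200 response
}

// Assumed transport: a conditional GET that sends If-None-Match
// when an ETag is already known for the URI.
type ConditionalGet = (uri: string, etag?: string) => GrammarResponse;

class GrammarCache {
  private entries = new Map<string, { etag: string; body: string }>();
  private get: ConditionalGet;

  constructor(get: ConditionalGet) {
    this.get = get;
  }

  // Called by the service for every recognition request, so a grammar
  // whose content changed behind the same URI is picked up.
  load(uri: string): string {
    const cached = this.entries.get(uri);
    const res = this.get(uri, cached?.etag);
    if (res.status === 304 && cached) {
      return cached.body; // unchanged: reuse the cached copy
    }
    const body = res.body ?? "";
    this.entries.set(uri, { etag: res.etag, body });
    return body;
  }
}
```

   Setting the grammar in the browser does not freeze its content under
   this reading; the revalidation happens in the service on each
   request.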
   mbodell: that covers grammars, moving on to other parameters
   satish: object.set...
   mbodell: name/value string, can be more than one
   <Charles> Consider JSON: [14]http://en.wikipedia.org/wiki/JSON
     [14] http://en.wikipedia.org/wiki/JSON
   satish: is that more useful than one custom field
   <mbodell> change setParameter to setCustomParameter
   MJ: create new SpeechInputRequest then .setGrammar -- does this
   require communication with speech service to set up
   mbodell: semantics are collect info then talk to speech service
   MJ: request grammar to be activated before start audio
   mbodell: .init may do this
   ... so when call .start(), it's ready
   ... main use case: grammars can take a long time, so improves
   latency when call .start()
   ... MRCP (and others) allow defining grammars early
   bringert: .addGrammar, .addGrammar .init(), then .start()
   burn: change grammars?
   mbodell: reInit?
   bringert: .addGrammar and .activateGrammar
   ... specifying grammar, and starting to use grammar
   mbodell: example .addGrammar, .init, .addGrammar, .start
   satish: why allow that?
   mbodell: some initialized, not others
   bringert: so .start could call .init again
   mbodell: some params shouldn't change after .init (like URI), but
   (in my mind) some should (like timeout)
   ... .init connects to service (handles those issues), but shouldn't
   tie down other params
   bringert: close (inert), pause, resume
   ... hard to remember what can change when, so my preference if
   anything gets changed between .init and .start, then .init gets
   called again
   ... in close state can change anything, in pause state can change
   some things and .init gets called again, in run state things don't
   change immediately
   Robert: not everything can be changed
   bringert: some changes may affect user-consent
   ... perhaps split params (service URI and everything else) or just
   one bunch, would be hard to remember more categories
   ... service URI could be only in constructor object
   MJ: set up service in beginning, then user interaction later
   ... if set up a bunch of params, then change service
   mbodell: changing URI can change permission model
   <smaug> Argh, I run out of skype credits
   mbodell: a separate SpeechInputRequest may be cleaner
   <mbodell> make clear in doc that only init and start/reco actually
   communicate with server
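
   A minimal sketch of the semantics discussed in this section,
   assuming bringert's preference that any change made after .init()
   causes .start() to run .init() again, and mbodell's note that only
   init and start/reco contact the server. The class name
   GrammarRequest and the serviceCalls log are hypothetical and exist
   only to make the behavior observable.

```typescript
// Hypothetical sketch; method names come from the discussion,
// the behavior is one reading of it.
class GrammarRequest {
  private grammars: string[] = [];
  private params = new Map<string, string>();
  private dirty = true;         // true when settings changed since the last init()
  serviceCalls: string[] = [];  // records every contact with the service

  addGrammar(uri: string): void {
    this.grammars.push(uri);    // collected locally, nothing is sent yet
    this.dirty = true;
  }

  setCustomParameter(name: string, value: string): void {
    this.params.set(name, value);
    this.dirty = true;
  }

  // Sends the collected settings so grammars can be loaded early.
  init(): void {
    this.serviceCalls.push("init:" + this.grammars.join(","));
    this.dirty = false;
  }

  start(): void {
    if (this.dirty) this.init(); // re-init if anything changed after init()
    this.serviceCalls.push("start");
  }
}
```

   Under this model the sequence .addGrammar, .init, .addGrammar,
   .start contacts the service three times: two inits and one start.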
Media Stream Input
   mbodell: does .start mean start at this point, or is it buffered?
   satish: permissions model may handle media capture differently than
   contacting speech service
   mbodell: media capture permissions might encompass both
   bringert: separate issue of sending data to a third party
   burn: that's the point at which contact the speech service
   mbodell: any agreement on if set up streaming, but haven't set up
   reco, does it buffer?
   <mbodell>
   [15]http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html#mediastream-extensions
     [15] http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html#mediastream-extensions
   bringert: if file, always beginning of file, if live (real-time like
   microphone) then no buffering
   ... if want to buffer, developer can write code to do that
   mbodell: so agree: no buffer
   <mbodell> no buffer for media stream, we start the reco when the
   start/reco is called, and it starts listening to the stream only
   then
   bringert: yes, audio api may have own buffering, or could even use
   javascript
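
   The no-buffer agreement can be sketched as follows, assuming a live
   stream delivers audio in chunks. LiveRecognizer and onChunk are
   hypothetical names; audio that arrives before start() is dropped,
   and a developer who wants buffering would add it in their own code.

```typescript
// Hypothetical sketch of "no buffering for live media streams".
class LiveRecognizer {
  private listening = false;
  received: string[] = []; // chunks actually passed on to recognition

  // Recognition begins listening to the stream only from this point.
  start(): void {
    this.listening = true;
  }

  // Called for each chunk of a live (real-time) stream.
  onChunk(chunk: string): void {
    if (!this.listening) return; // earlier audio is dropped, not buffered
    this.received.push(chunk);
  }
}
```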
new meeting
   burn: Olli asked me at beginning if a meeting at TPAC
   ... even if wrapped up this report, we may have more to discuss
   ... unlikely a formal meeting/discussion, but may be
   useful/relevant/informal discussions
   mbodell: and maybe formal if needed
   burn: may have to slip call schedule out a week
   mbodell: and document
   burn: that's it, next call is next week, thanks everyone
Received on Thursday, 15 September 2011 17:47:15 UTC