- From: Dan Burnett <dburnett@voxeo.com>
- Date: Tue, 28 Jun 2011 18:13:44 -0400
- To: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
- Message-Id: <A083F622-523A-42C6-A4FA-C3B62C1A27B7@voxeo.com>
Group,

The minutes from the last call are available at http://www.w3.org/2011/06/16-htmlspeech-minutes.html.

For convenience, a text version is embedded below.

Thanks to Patrick Ehlen for taking the minutes!

-- dan

**********************************************************************************

[1]W3C

[1] http://www.w3.org/

- DRAFT -

HTML Speech Incubator Group Teleconference

16 Jun 2011

[2]Agenda

[2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Jun/0033.html

See also: [3]IRC log

[3] http://www.w3.org/2011/06/16-htmlspeech-irc

Attendees

Present
    Milan_Young, Michael_Johnston, Dan_Burnett, Michael_Bodell, Olli_Pettay, Dan_Druta, Charles_Hemphill, Patrick_Ehlen, Robert_Brown

Regrets
    Raj_Tumuluri, Bjorn_Bringert

Chair
    Dan_Burnett

Scribe
    Patrick_Ehlen

Contents

* [4]Topics
    1. [5]New design decisions?
    2. [6]markup binding
    3. [7]discussion time
    4. [8]do we need to support audio recording with recognition?
    5. [9]what are the built-ins, and what does that mean?
* [10]Summary of Action Items
_________________________________________________________

New design decisions?

robert: should audio recording without recognition be supported? are there important scenarios for supporting recording without recognition?

<burn> satish, any update on markup binding?

markup binding

<satish> burn: None, Bjorn was collecting input from the chrome team and since he has gone on leave I have no contact on what the status was.

<burn> satish, can you please check? we are not waiting on the answer, but it would be nice to have the input

robert: google issue on whether there should be a button to press

<satish> burn: yes, I can take an action to get a definitive answer in the next few days.

burn: satish will take this on w/ the chrome team

discussion time

do we need to support audio recording with recognition?

burn: an advantage could be endpointing.
... is that an important criterion in this case as well?

charles: another question is how real-time is the reco response?
...
a recording may result in reco later
... an identifier might later associate the recording with a reco transcription

burn: brings up question of whether we support reco on recorded audio

robert: garbage models could be used to make recording in edge cases
... "overloading" recognition
... or will recording be a more common task
... Do we think recording with endpointing is important?

milan: channel adaptation, sharing headers in same structure, parameters could be reused; sharing the same network paths -- convenient to use same

Charles: Also, the on-line vs. off-line cases

milan: would most recording be associated with an attempt to understand the text in the recording?

burn: Most significant feature is the endpointing

milan: in that case, why not just use dict model, do reco, and save the waveform as backup?
... and how common would that be. If not so common, could use a garbage model (even a "first-class" one)

burn: seems strange to call recording a weird special case of reco
... in favor of using the recording resource as described in mrcp

robert: though endpointing may be valuable, would we support a "record" object in the API? how would this go all the way to the developer?

burn: does not seem to be in our scope

olli: there are other proposals that would handle recording

charles: channel adaptation

burn: channel normalization is not a valid reason for recording support

charles: should probably also include built-in record grammar

(milan above)

milan: use case: may want to do dictation in parallel with c&c
... e.g., provide a c&c followed immediately by dictation

burn: but does that really belong as a built-in type in a grammar?
...
sounds like there is not real consensus today vis-a-vis supporting a recording capability

robert: have not heard a compelling reason to support recording

burn: consensus not to do it now

milan: would like a standard way to do it, should the need arise

burn: we could state that we reserve this for the future

milan: there should be some consistent and portable way to do this across engines

robert: could be done as a proprietary extension

milan: at least provide a consistent hack, like builtin:record

robert: that's what the garbage model recording would be

milan: that's fine, as long as all engines support this type of garbage model

burn: to summarize, can't agree on specific recording scenarios

(robert above)

scribe: should agree on supporting garbage-recording scenario

burn: as a group, agree not to define an explicit recording capability at this time.
... can be supported using a garbage model, or capabilities defined outside this group

what are the built-ins, and what does that mean?

milan: existing builtins: dictation, search, address, numbers

robert: already agreed there should be a certain set of predefined grammars
... so how do we refer to those?

burn: 2 things make builtins interesting: (1) parameterization; (2) no language is required

milan: markup already has certain defined types, parameters, etc, as native to HTML5. Would make sense to pay attention to that here

burn: an unconstrained text box should naturally bind to a dictation model

milan: should we remap the names of the builtins?

burn: argue strongly for using html as a starting point

robert: These should be builtins, not re-used vxml grammars

<smaug> could someone paste a link to voicexml's builtin grammars ?
charles: they've become a de facto standard; not supporting them is awkward

<Robert> these are the HTML input types: [11]http://www.w3.org/TR/html5/the-input-element.html#attr-input-type

[11] http://www.w3.org/TR/html5/the-input-element.html#attr-input-type

burn: if someone wants to support legacy builtins in a way that doesn't break existing builtins, that's not a problem

<Robert> perhaps have builtins that match these

charles: there needs to be some way to include these

(milan above)

scribe: is there something about this that can't be represented by a query string?

michael: do you want to reference, for example, an html number type, or some arbitrary number?

milan: easier to use old builtins & augment them

charles: need to look at greater good of using html vs vxml

<mbodell> Widely implemented? See [12]http://en.wikipedia.org/wiki/URI_scheme

[12] http://en.wikipedia.org/wiki/URI_scheme

burn: michael, how would you reference grammars that are assoc. with html input types?

michael: an html ruleref, with various attributes; or don't specify URI and ref them by markup AP...
... most important is associating grammars with individual input elements
... not a strong use case to have URIs for these things, or ability for user to write their own that reference these

burn: when people want to hack something up quickly, common input types should lend themselves to being included as part of a larger utterance

michael: may be other ways to specify input for that type of scenario

burn: maybe reference not the grammar but the input type itself

charles: similar input types do not always require the same grammar

burn: but the app author may want a way to link these different types of builtin grammars together

milan: perhaps just do the proposal

burn: who on the call is interested in builtin models?
charles: interested in it; this group seems focused on web search and dictation, as opposed to broader html cases

<mbodell> <input type="search" name="q" speech required onspeechchange="startSearch">

michael: there will probably be a standard set of grammar libraries, though perhaps the market will provide those

johnston: can't see us requiring something like a "zip code" lib, for internationalization reasons

michael: HTML has already handled a lot of these issues

(milan above)

(michael, above, actually)

milan: should there be an html binding?

michael: would be better if you could speech enable certain input types with little work

robert: if no builtins were specified, what are the consequences?

burn: if you want broad adoptability and usage, it needs to be as easy to create simple apps as vxml

robert: we need it to do the html binding.
... so how much do we need the html binding part?

milan: definitely need the capability to specify search, dictation, etc.

robert: that's different from looking at html input types, etc. that's a complex problem

milan: would like to have a notion of how to solve binding problem before we do dictation

robert: does anyone have a proposal to volunteer?

milan: perhaps can do it after I get the dictation stuff out

michael: there is a topic in the API about markup bindings.

burn: true that it's a binding issue
... without a proposal, it doesn't happen.
... so it will be up to someone to write a proposal

milan: perhaps sending a message to google on this

robert: or to satish

burn: action item for milan to talk with satish and ask for help on structuring a proposal
... reminder: no call next week

robert: but there will be a protocol meeting
Received on Tuesday, 28 June 2011 22:10:58 UTC