- From: Dan Burnett <dburnett@voxeo.com>
- Date: Tue, 28 Jun 2011 18:13:44 -0400
- To: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
- Message-Id: <A083F622-523A-42C6-A4FA-C3B62C1A27B7@voxeo.com>
Group,
The minutes from the last call are available at http://www.w3.org/2011/06/16-htmlspeech-minutes.html.
For convenience, a text version is embedded below.
Thanks to Patrick Ehlen for taking the minutes!
-- dan
**********************************************************************************
[1]W3C
[1] http://www.w3.org/
- DRAFT -
HTML Speech Incubator Group Teleconference
16 Jun 2011
[2]Agenda
[2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Jun/0033.html
See also: [3]IRC log
[3] http://www.w3.org/2011/06/16-htmlspeech-irc
Attendees
Present
Milan_Young, Michael_Johnston, Dan_Burnett, Michael_Bodell,
Olli_Pettay, Dan_Druta, Charles_Hemphill, Patrick_Ehlen,
Robert_Brown
Regrets
Raj_Tumuluri, Bjorn_Bringert
Chair
Dan_Burnett
Scribe
Patrick_Ehlen
Contents
* [4]Topics
1. [5]New design decisions?
2. [6]markup binding
3. [7]discussion time
4. [8]do we need to support audio recording with recognition?
5. [9]what are the built-ins, and what does that mean?
* [10]Summary of Action Items
_________________________________________________________
New design decisions?
robert: should audio recording without recognition be supported?
are there important scenarios for supporting recording without
recognition?
<burn> satish, any update on markup binding?
markup binding
<satish> burn: None, Bjorn was collecting input from the chrome team
and since he has gone on leave I have no contact on what the status
was.
<burn> satish, can you please check? we are not waiting on the
answer, but it would be nice to have the input
robert: google issue on whether there should be a button to press
<satish> burn: yes, I can take an action to get a definitive answer
in the next few days.
burn: satish will take this on w/ the chrome team
discussion time
do we need to support audio recording with recognition?
burn: an advantage could be endpointing.
... is that an important criterion in this case as well?
charles: another question is how real-time is the reco response?
... a recording may result in reco later
... an identifier might later associate the recording with a reco
transcription
burn: brings up question of whether we support reco on recorded
audio
robert: garbage models could be used to make recording in edge cases
... "overloading" recognition
... or will recording be a more common task
... Do we think recording with endpointing is important?
milan: channel adaptation, sharing headers in same structure,
parameters could be reused; sharing the same network paths --
convenient to use same
Charles: Also, the on-line vs. off-line cases
milan: would most recording be associated with an attempt to
understand the text in the recording?
burn: Most significant feature is the endpointing
milan: in that case, why not just use dict model, do reco, and save
the waveform as backup?
... and how common would that be. If not so common, could use a
garbage model (even a "first-class" one)
burn: seems strange to call recording a weird special case of reco
... in favor of using the recording resource as described in mrcp
robert: though endpointing may be valuable, would we support a
"record" object in the API? how would this go all the way to the
developer?
burn: does not seem to be in our scope
olli: there are other proposals that would handle recording
charles: channel adaptation
burn: channel normalization is not a valid reason for recording
support
charles: should probably also include built-in record grammar
(milan above)
milan: use case: may want to do dictation in parallel with c&c
... e.g., provide a c&c followed immediately by dictation
burn: but does that really belong as a built-in type in a grammar?
... sounds like there is not real consensus today vis-a-vis
supporting a recording capability
robert: have not heard a compelling reason to support recording
burn: consensus not to do it now
milan: would like a standard way to do it, should the need arise
burn: we could state that we reserve this for the future
milan: there should be some consistent and portable way to do this
across engines
robert: could be done as a proprietary extension
milan: at least provide a consistent hack, like builtin:record
robert: that's what the garbage model recording would be
milan: that's fine, as long as all engines support this type of
garbage model
burn: to summarize, can't agree on specific recording scenarios
(robert above)
scribe: should agree on supporting garbage-recording scenario
burn: as a group, agree not to define an explicit recording
capability at this time.
... can be supported using a garbage model, or capabilities defined
outside this group
what are the built-ins, and what does that mean?
milan: existing builtins: dictation, search, address, numbers
robert: already agreed there should be a certain set of predefined
grammars
... so how do we refer to those?
burn: 2 things make builtins interesting: (1) parameterization; (2)
no language is required
milan: markup already has certain defined types, parameters, etc, as
native to HTML5. Would make sense to pay attention to that here
burn: an unconstrained text box should naturally bind to a dictation
model
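To illustrate the kind of default binding being suggested here, the sketch below maps a few HTML input types onto the builtin names listed above (dictation, search, numbers). The mapping itself and the function name builtin_for are hypothetical, used only for illustration; the group did not agree on any such table.

```python
# Hypothetical mapping from HTML input types to builtin model names.
# Only "text" -> "dictation" is suggested in the discussion; the other
# entries are assumptions made for the sake of the example.
INPUT_TYPE_TO_BUILTIN = {
    "text": "dictation",
    "search": "search",
    "number": "numbers",
}


def builtin_for(input_type):
    """Return the assumed builtin name for an input type, or None if unmapped."""
    return INPUT_TYPE_TO_BUILTIN.get(input_type.lower())
```

Under these assumptions, builtin_for("text") gives "dictation", and an unmapped type such as "color" gives None, leaving the choice to the application author.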
milan: should we remap the names of the builtins?
burn: argue strongly for using html as a starting point
robert: These should be builtins, not re-used vxml grammars
<smaug> could someone paste a link to voicexml's builtin grammars ?
charles: they've become a de facto standard; not supporting them is
awkward
<Robert> these are the HTML input types:
[11]http://www.w3.org/TR/html5/the-input-element.html#attr-input-type
[11] http://www.w3.org/TR/html5/the-input-element.html#attr-input-type
burn: if someone wants to support legacy builtins in a way that
doesn't break existing builtins, that's not a problem
<Robert> perhaps have builtins that match these
charles: there needs to be some way to include these
(milan above)
scribe: is there something about this that can't be represented by a
query string?
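As a rough illustration of the query-string idea, a parameterized builtin reference could be split into a name and a parameter map as sketched below. The "builtin:" prefix follows the builtin:record example mentioned earlier in the call; the function name parse_builtin and any parameter names passed to it are hypothetical and were not defined by the group.

```python
from urllib.parse import urlsplit, parse_qs


def parse_builtin(uri):
    """Split a hypothetical 'builtin:' grammar reference into (name, params)."""
    parts = urlsplit(uri)
    if parts.scheme != "builtin":
        raise ValueError("not a builtin reference: " + uri)
    # parse_qs returns a list of values per key; keep the last one given.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return parts.path, params
```

With this sketch, a reference such as builtin:number?minlength=3 (an invented parameter) would yield the name "number" and a one-entry parameter map, while builtin:record would yield "record" with no parameters.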
michael: do you want to reference, for example, an html number type,
or some arbitrary number?
milan: easier to use old builtins & augment them
charles: need to look at greater good of using html vs vxml
<mbodell> Widely implemented? See
[12]http://en.wikipedia.org/wiki/URI_scheme
[12] http://en.wikipedia.org/wiki/URI_scheme
burn: michael, how would you reference grammars that are assoc. with
html input types?
michael: an html ruleref, with various attributes; or don't specify
URI and ref them by markup AP...
... most important is associating grammars with individual input
elements
... not a strong use case to have URIs for these things, or ability
for user to write their own that reference these
burn: when people want to hack something up quickly, common input
types should lend themselves to being included as part of a larger
utterance
michael: may be other ways to specify input for that type of
scenario
burn: maybe reference not the grammar but the input type itself
charles: similar input types do not always require the same grammar
burn: but the app author may want a way to link these different
types of builtin grammars together
milan: perhaps just do the proposal
burn: who on the call is interested in builtin models?
charles: interested in it; this group seems focused on web search
and dictation, as opposed to broader html cases
<mbodell> <input type="search" name="q" speech required
onspeechchange="startSearch">
michael: there will probably be a standard set of grammar libraries,
though perhaps the market will provide those
johnston: can't see us requiring something like a "zip code" lib,
for internationalization reasons
michael: HTML has already handled a lot of these issues
(milan above)
(michael, above, actually)
milan: should there be an html binding?
michael: would be better if you could speech enable certain input
types with little work
robert: if no builtins were specified, what are the consequences?
burn: if you want broad adoptability and usage, it needs to be as
easy to create simple apps as vxml
robert: we need it to do the html binding.
... so how much do we need the html binding part?
milan: definitely need the capability to specify search, dictation,
etc.
robert: that's different from looking at html input types, etc.
that's a complex problem
milan: would like to have a notion of how to solve binding problem
before we do dictation
robert: does anyone have a proposal to volunteer?
milan: perhaps can do it after I get the dictation stuff out
michael: there is a topic in the API about markup bindings.
burn: true that it's a binding issue
... without a proposal, it doesn't happen.
... so it will be up to someone to write a proposal
milan: perhaps sending a message to google on this
robert: or to satish
burn: action item for milan to talk with satish and ask for help on
structuring a proposal
... reminder: no call next week
robert: but there will be a protocol meeting
Received on Tuesday, 28 June 2011 22:10:58 UTC