- From: Satish S <satish@google.com>
- Date: Thu, 5 Jan 2012 11:49:10 +0000
- To: Peter Beverloo <peter@chromium.org>
- Cc: Glen Shires <gshires@google.com>, public-webapps@w3.org, public-xg-htmlspeech@w3.org, Arthur Barstow <art.barstow@nokia.com>, Dan Burnett <dburnett@voxeo.com>
- Message-ID: <CAHZf7R=hQsqmE5QQbkc8W6qd3Wr0VRZ2J64daBTnhcWE7H=nbw@mail.gmail.com>
> 2) How does the draft incorporate with the existing <input speech>
> API[1]? It seems to me as if it'd be best to define both the attribute
> as the DOM APIs in a single specification, also because they share
> several events (yet don't seem to be interchangeable) and the
> attribute already has an implementation.

The <input speech> API proposal was implemented as <input x-webkit-speech>
in Chromium a while ago. A lot of the developer feedback we received was
about finer grained control including a javascript API and letting the web
application decide how to present the user interface rather than tying it
to the <input> element.

The HTML Speech Incubator Group's final report [1] includes a <reco>
element which addresses both these concerns and provides automatic binding
of speech recognition results to existing HTML elements. We are not sure if
the WebApps WG is a good place to work on standardising such markup
elements, hence did not include in the simplified Javascript API [2]. If
there is sufficient interest and scope in the WebApps WG charter for the
Javascript API and markup, we are happy to combine them both in the
proposal.

[1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
[2] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html

>
> Thanks,
> Peter
>
> [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html
>
> On Thu, Jan 5, 2012 at 07:15, Glen Shires <gshires@google.com> wrote:
> > As Dan Burnett wrote below: The HTML Speech Incubator Group [1] has recently
> > wrapped up its work on use cases, requirements, and proposals for adding
> > automatic speech recognition (ASR) and text-to-speech (TTS) capabilities to
> > HTML. The work of the group is documented in the group's Final Report. [2]
> > The members of the group intend this work to be input to one or more
> > working groups, in W3C and/or other standards development organizations such
> > as the IETF, as an aid to developing full standards in this space.
> >
> > Because that work was so broad, Art Barstow asked (below) for a relatively
> > specific proposal. We at Google are proposing that a subset of it be
> > accepted as a work item by the Web Applications WG. Specifically, we are
> > proposing this Javascript API [3], which enables web developers to
> > incorporate speech recognition and synthesis into their web pages.
> > This simplified subset enables developers to use scripting to generate
> > text-to-speech output and to use speech recognition as an input for forms,
> > continuous dictation and control, and it supports the majority of use-cases
> > in the Incubator Group's Final Report.
> >
> > We welcome your feedback and ask that the Web Applications WG
> > consider accepting this Javascript API [3] as a work item.
> >
> > [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter
> > [2] report: http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
> > [3] API: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html
> >
> > Bjorn Bringert
> > Satish Sampath
> > Glen Shires
> >
> > On Thu, Dec 22, 2011 at 11:38 AM, Glen Shires <gshires@google.com> wrote:
> >>
> >> Milan,
> >> The IDLs contained in both documents are in the same format and order, so
> >> it's relatively easy to compare the two side-by-side. The semantics of the
> >> attributes, methods and events have not changed, and both IDLs link directly
> >> to the definitions contained in the Speech XG Final Report.
> >>
> >> As you mention, we agree that the protocol portions of the Speech XG Final
> >> Report are most appropriate for consideration by a group such as IETF, and
> >> believe such work can proceed independently, particularly because the Speech
> >> XG Final Report has provided a roadmap for these to remain compatible.
> >> Also, as shown in the Speech XG Final Report - Overview, the "Speech Web
> >> API" is not dependent on the "Speech Protocol" and a "Default Speech"
> >> service can be used for local or remote speech recognition and synthesis.
> >>
> >> Glen Shires
> >>
> >> On Thu, Dec 22, 2011 at 10:32 AM, Young, Milan <Milan.Young@nuance.com>
> >> wrote:
> >>>
> >>> Hello Glen,
> >>>
> >>> The proposal says that it contains a “simplified subset of the JavaScript
> >>> API”. Could you please clarify which elements of the HTMLSpeech
> >>> recommendation’s JavaScript API were omitted? I think this would be the
> >>> most efficient way for those of us familiar with the XG recommendation to
> >>> evaluate the new proposal.
> >>>
> >>> I’d also appreciate clarification on how you see the protocol being
> >>> handled. In the HTMLSpeech group we were thinking about this as a
> >>> hand-in-hand relationship between W3C and IETF like WebSockets. Is this
> >>> still your (and Google’s) vision?
> >>>
> >>> Thanks
> >>>
> >>> From: Glen Shires [mailto:gshires@google.com]
> >>> Sent: Thursday, December 22, 2011 11:14 AM
> >>> To: public-webapps@w3.org; Arthur Barstow
> >>> Cc: public-xg-htmlspeech@w3.org; Dan Burnett
> >>> Subject: Re: HTML Speech XG Completes, seeks feedback for eventual
> >>> standardization
> >>>
> >>> We at Google believe that a scripting-only (Javascript) subset of the API
> >>> defined in the Speech XG Incubator Group Final Report is of appropriate
> >>> scope for consideration by the WebApps WG.
> >>>
> >>> The enclosed scripting-only subset supports the majority of the use-cases
> >>> and samples in the XG proposal. Specifically, it enables web-pages to
> >>> generate speech output and to use speech recognition as an input for forms,
> >>> continuous dictation and control. The Javascript API will allow web pages to
> >>> control activation and timing and to handle results and alternatives.
> >>>
> >>> We welcome your feedback and ask that the Web Applications WG consider
> >>> accepting this as a work item.
> >>>
> >>> Bjorn Bringert
> >>> Satish Sampath
> >>> Glen Shires
> >>>
> >>> On Tue, Dec 13, 2011 at 11:39 AM, Glen Shires <gshires@google.com> wrote:
> >>>
> >>> We at Google believe that a scripting-only (Javascript) subset of the API
> >>> defined in the Speech XG Incubator Group Final Report [1] is of appropriate
> >>> scope for consideration by the WebApps WG.
> >>>
> >>> A scripting-only subset supports the majority of the use-cases and
> >>> samples in the XG proposal. Specifically, it enables web-pages to generate
> >>> speech output and to use speech recognition as an input for forms,
> >>> continuous dictation and control. The Javascript API will allow web pages to
> >>> control activation and timing and to handle results and alternatives
> >>>
> >>> As Dan points out above, we envision that different portions of the
> >>> Incubator Group Final Report are applicable to different working groups "in
> >>> W3C and/or other standards development organizations such as the IETF".
> >>> This scripting API subset does not preclude other groups from pursuing
> >>> standardization of relevant HTML markup or underlying transport protocols,
> >>> and indeed the Incubator Group Final Report defines a potential roadmap such
> >>> that such additions can be compatible.
> >>>
> >>> To make this more concrete, Google will provide to this mailing list a
> >>> specific proposal extracted from the Incubator Group Final Report, that
> >>> includes only those portions we believe are relevant to WebApps, with links
> >>> back to the Incubator Report as appropriate.
> >>>
> >>> Bjorn Bringert
> >>> Satish Sampath
> >>> Glen Shires
> >>>
> >>> [1] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
> >>>
> >>> On Tue, Dec 13, 2011 at 5:32 AM, Dan Burnett <dburnett@voxeo.com> wrote:
> >>>
> >>> Thanks for the info, Art. To be clear, I personally am *NOT* proposing
> >>> adding any specs to WebApps, although others might. My email below as a
> >>> Chair of the group is merely to inform people of this work and ask for
> >>> feedback.
> >>> I expect that your information will be useful for others who might wish
> >>> for some of this work to continue in WebApps.
> >>>
> >>> -- dan
> >>>
> >>> On Dec 13, 2011, at 7:06 AM, Arthur Barstow wrote:
> >>>
> >>> > Hi Dan,
> >>> >
> >>> > WebApps already has a relatively large number of specs in progress (see
> >>> > [PubStatus]) and the group has agreed to add some additional specs (see
> >>> > [CharterChanges]). As such, please provide a relatively specific proposal
> >>> > about the features/specs you and other proponents would like to add to
> >>> > WebApps.
> >>> >
> >>> > Regarding the level of detail for your proposal, I think a reasonable
> >>> > precedence is something like the Gamepad and Pointer/MouseLock proposals
> >>> > (see [CharterChanges]). (Perhaps this could be achieved by identifying
> >>> > specific sections in the XG's Final Report?)
> >>> >
> >>> > -Art Barstow
> >>> >
> >>> > [PubStatus]
> >>> > http://www.w3.org/2008/webapps/wiki/PubStatus#API_Specifications
> >>> > [CharterChanges]
> >>> > http://www.w3.org/2008/webapps/wiki/CharterChanges#Additions_Agreed
> >>> >
> >>> > On 12/12/11 5:25 PM, ext Dan Burnett wrote:
> >>> >> Dear WebApps people,
> >>> >>
> >>> >> The HTML Speech Incubator Group [1] has recently wrapped up its work
> >>> >> on use cases, requirements, and proposals for adding automatic speech
> >>> >> recognition (ASR) and text-to-speech (TTS) capabilities to HTML. The work
> >>> >> of the group is documented in the group's Final Report. [2]
> >>> >>
> >>> >> The members of the group intend this work to be input to one or more
> >>> >> working groups, in W3C and/or other standards development organizations such
> >>> >> as the IETF, as an aid to developing full standards in this space.
> >>> >> Whether the W3C work happens in a new Working Group or an existing
> >>> >> one, we are interested in collecting feedback on the Incubator Group's work.
> >>> >> We are specifically interested in input from the members of the WebApps
> >>> >> Working Group.
> >>> >>
> >>> >> If you have any feedback to share, please send it to, or cc, the
> >>> >> group's mailing list (public-xg-htmlspeech@w3.org). This will allow
> >>> >> comments to be archived in one consistent location for use by whatever group
> >>> >> takes up this work.
> >>> >>
> >>> >>
> >>> >> Dan Burnett, Co-Chair
> >>> >> HTML Speech Incubator Group
> >>> >>
> >>> >>
> >>> >> [1] charter: http://www.w3.org/2005/Incubator/htmlspeech/charter
> >>> >> [2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/
> >>> >>
> >>> >> p.s. This feedback request is being sent to the following groups:
> >>> >> WebApps, HTML, Audio, DAP, Voice Browser, Multimodal Interaction
Received on Thursday, 5 January 2012 11:49:43 UTC