Re: Speech API Community Group from Glen Shires on 2012-04-04 (public-xg-htmlspeech@w3.org from April 2012)

From: Glen Shires <gshires@google.com>
Date: Wed, 4 Apr 2012 09:01:33 -0700
To: "Young, Milan" <Milan.Young@nuance.com>, public-speech-api-contrib@w3.org
Cc: Charles Pritchard <chuck@jumis.com>, Michael Bodell <mbodell@microsoft.com>, Jerry Carter <jerry@jerrycarter.org>, "Raj (Openstream)" <raj@openstream.com>, Jim <Jim@haynes-barnett.net>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <CAEE5bcjycOLuZpjhm=gGrQKz236+b=D0qfKNLMpYLQ62gAYCuA@mail.gmail.com>
One of the key goals of this Speech API CG is to expedite the
implementation of speech in browsers. As Olli outlined [1], web APIs tend
to evolve, beginning with a first iteration specification and adding
additional features later.

I believe that a web-author-facing JavaScript API for specifying network
speech services (FPR7 and FPR12) is within the scope of this CG, and can
be included in the first iteration of this specification, so long as doing
so does not delay its completion. For example, a serviceUri [2] method can
provide a JavaScript API for specifying such resources. Parameters in that
URI string could be used for resource selection.
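
As a rough illustration of that idea, here is a minimal sketch of how a
web author might encode resource-selection criteria as parameters in a
service URI string. The helper name, the endpoint, and the parameter names
("lang", "multilingual") are all hypothetical; nothing here is defined by
the XG report.

```typescript
// Hypothetical helper: appends resource-selection criteria to a speech
// service URI as query parameters. Not part of any specification.
function buildServiceUri(base: string, params: Record<string, string>): string {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(params[key]))
    .join("&");
  if (query === "") {
    return base; // nothing to select on, use the service as-is
  }
  // Reuse an existing query string if the base URI already has one.
  const separator = base.indexOf("?") >= 0 ? "&" : "?";
  return base + separator + query;
}

// Assumed endpoint and parameter names, for illustration only.
const serviceUri = buildServiceUri("https://speech.example.com/recognize", {
  lang: "es,en,pt",
  multilingual: "true",
});
```

A page could then hand the resulting string to whatever serviceUri
mechanism the specification ends up defining; how the service interprets
the parameters would be up to that service.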

As Jerry wrote [3], the challenge is "to define what characteristics may be
specified for resource selection or, alternatively, to determine that such
definition is external to the immediate API: for instance, there might be a
separate spec which is referenced by the Speech JavaScript API."  In any
case, a comprehensive solution, including security and privacy, is complex,
so this CG should start with a realistic goal for the first specification
and iterate in future specifications.

/Glen Shires

FPR7. Web apps should be able to request speech service different from
default.
FPR12. Speech services that can be specified by web apps must include
network speech services.

[1]
http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2012Jan/0009.html
[2]
http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#dfn-uri
[3]
http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2012Apr/0008.html


On Tue, Apr 3, 2012 at 2:08 PM, Young, Milan <Milan.Young@nuance.com> wrote:

>  The problem is that the community group has an ambiguous “charter”, and
> at least some folks would like this clarified before joining.  Being that
> Speech-XG and Webapps are the two most relevant lists, I don’t know where
> else the discussion would take place.
>
> I believe that all of this could be cleared up by a simple statement from
> the CG chair (Glen Shires) that FPR7 and FPR12 are in scope.  There is
> “STRONG” interest in this domain and an editor has already volunteered (Jim
> Barnett).  Seems like a simple decision.
>
> Thanks
>
> *From:* Charles Pritchard [mailto:chuck@jumis.com]
> *Sent:* Tuesday, April 03, 2012 1:29 PM
> *To:* Michael Bodell
> *Cc:* Jerry Carter; Raj (Openstream); Young, Milan; Jim; Glen Shires;
> public-xg-htmlspeech@w3.org; public-webapps@w3.org
>
> *Subject:* Re: Speech API Community Group
>
> I'd like to encourage everyone interested in the Speech API to join the
> mailing list:
> http://lists.w3.org/Archives/Public/public-speech-api/
>
> For those interested in more hands-on interaction, there's the CG:
> http://www.w3.org/community/speech-api/
>
> For some archived mailing list discussion, browse the old XG list:
> http://lists.w3.org/Archives/Public/public-xg-htmlspeech/
>
> It seems like we can move this chatter over to public-speech-api and off
> of the webapps list.
>
> -Charles
>
>
> On 4/3/2012 1:08 PM, Michael Bodell wrote:
>
> A little bit of historical context and resource references might be
> helpful for some on the email thread.
>
> While this is still an early stage for a community group, if one will
> happen, it actually isn’t early for the community as a group to talk about
> this.  In many ways we’ve already done the initial incubation and community
> discussion and investigation for this space in the HTML Speech XG.  This
> led to the XG’s use case and requirements document:
>
> http://www.w3.org/2005/Incubator/htmlspeech/live/requirements.html
>
> which were then refined to a prioritized requirement list after soliciting
> community input:
>
> http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/#prioritized
>
> As I read it, Milan and Jim and Raj’s requirements discussed are part of
> FPR7 [Web apps should be able to request speech service different from
> default] and FPR12 [Speech services that can be specified by web apps must
> include network speech services], both of which were voted to have “Strong
> Interest” by the community.
>
> Further work from these requirements led to the community coming up with a
> proposal, which is ready now to be taken to a standards track process, that
> was published in the XG final report:
>
> http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/
>
> Hopefully we can all properly leverage the work the community has already
> done.
>
> Michael Bodell
> Co-chair HTML Speech XG
>
> *From:* Jerry Carter [mailto:jerry@jerrycarter.org]
> *Sent:* Tuesday, April 03, 2012 12:50 PM
> *To:* Raj (Openstream); Milan Young; Jim
> *Cc:* Glen Shires; public-xg-htmlspeech@w3.org; public-webapps@w3.org
> *Subject:* Re: Speech API Community Group
>
> We can discuss this in terms of generalities without any resolution, so
> let me offer two more concrete use cases:
>
>  My friend Jóse is working on a personal site to track teams and player
> statistics at the Brazil 2014 World Cup.  He recognizes that the browser
> will define a default language through the HTTP Accept-Language header, but
> knows that speakers may code switch in their requests (e.g. Spanish +
> English or Portuguese + English) or be better served by using native
> pronunciations (Jesus = /heːzus/ vs. /ˈdʒiːzəs/).  Hence, he requires a
> resource that can provide support for Spanish, English, and Portuguese and
> that can also support multiple simultaneous languages.
>
> These are two solid requirements.  A browser encountering the page might
> (1) be able to satisfy these requirements, (2) require user permission
> before accessing such a resource, or (3) be unable to meet the request.
>
>  My colleague Jim has another application for which hundreds of hours
> have been invested to optimize the performance for a specific recognition
> resource.  Security considerations further restrict the physical location
> of conforming resources.  His page requires a very specific resource.
>
> These are two solid requirements.  A browser encountering the page might
> (1) be able to satisfy these requirements, (2) require user permission
> before accessing such a resource, or (3) be unable to meet the request.
>
> There are indeed commercial requirements around the capabilities of
> resources.  We are in full agreement.  It is important to be able to list
> requirements for conforming resources and to ensure that the browser is
> enforcing those requirements.  That stated, the application author does not
> care where such a conforming resource resides so long as it is available to
> the targeted user population.  The user does not care where the resource
> resides so long as it works well and does not cost too much to use.
>
> The trick within a Speech JavaScript API is to define what characteristics
> may be specified for resource selection or, alternatively, to determine
> that such definition is external to the immediate API: for instance,  there
> might be a separate spec which is referenced by the Speech JavaScript API.
>  It is too early to tell what direction the group might go.  It is already
> clear that there are strong opinions as to what criteria may be necessary
> for resource selection.  *Refusing to participate unless one's specific
> criteria are addressed strikes me as quite inappropriate at this early
> stage.*
>
> -=- Jerry
>
> On Apr 3, 2012, at 3:15 PM, Raj (Openstream) wrote:
>
> Perhaps true for users of the applications. But authors would need
> resource specification (location),
> hence clearly specifying how network/local services can be used (even if
> protocols are out of scope),
> outside of browser defaults, will be of interest to many, including
> Openstream.
>
> Raj
>
>
>
> On Tue, 3 Apr 2012 14:45:45 -0400
> Jerry Carter <jerry@jerrycarter.org> wrote:
>
> On Apr 3, 2012, at 11:48 AM, Young, Milan wrote:
>
>  The proposal mentions that the specification of a network speech
> protocol is out of scope. This makes sense given that protocols are the
> domain of the IETF.
>
>   But I’d like to confirm that the use of network speech services is in
> scope for this CG.  Would you mind amending the proposal to make this
> explicit?
>
>  I don't see why any such declaration is necessary.  From the perspective
> of the application author or of the application user, it matters very
> little where the speech-to-text operation occurs so long as the result is
> delivered promptly.  There is no reason that local, network-based, or
> hybrid solutions would be unable to provide adequate performance.  I
> believe the current language in the proposal is appropriate.
>
>  -=- Jerry
>
>
>



-- 
Thanks!
Glen Shires
Received on Wednesday, 4 April 2012 16:02:52 UTC