RE: [HTML Speech] speech resource specification requirement

>> If we are successful, the overwhelming majority of application developers will not have their own speech engines.

True, most developers won't host their own speech technology.  But they will use technology hosted by a vendor that they select based on capabilities, service level, price, etc.  Some vendors may offer free lowest-common-denominator services.  But many apps will want something more specific, reliable, customizable, etc.

>> Moreover, most won't know much about speech technology and won't be building their own language models... They'll just want something that "works" and is easy to develop against.

While I'd like to believe that, I think the reality is that we're a long way off from even common domains being commoditized to the extent that all service providers are equivalent (and there certainly isn't a spec for how that might work).  And there will always be niches, proprietary information, etc.


From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Dave Burke
Sent: Thursday, September 09, 2010 3:51 PM
To: Robert Brown
Cc: JOHNSTON, MICHAEL J (MICHAEL J); public-xg-htmlspeech@w3.org
Subject: Re: [HTML Speech] speech resource specification requirement

If we are successful, the overwhelming majority of application developers will not have their own speech engines. Moreover, most won't know much about speech technology and won't be building their own language models... They'll just want something that "works" and is easy to develop against.

Dave

On Thu, Sep 9, 2010 at 11:40 PM, Robert Brown <Robert.Brown@microsoft.com> wrote:
I think the argument makes sense for geo-location.  But that may not be the right analogy here.

There are considerable differences between speech engine capabilities.  Even those engines that purport to model the same scenarios produce different results.  Not all engines implement all scenarios, new capabilities are being invented all the time, and applications are tuned to the nuances of specific engines.  In addition, some applications will rely on purpose-built models that are too large to load on demand to arbitrary locations across the Internet, and some models may contain IP that the owner wants kept private.  In either case, the result is that some apps are tightly coupled to a specific speech service provider.

To me, this is an argument for application-selection of services rather than user-selection.  I agree, security and privacy are both concerns.  But they don't negate the requirement, just add to it.

Cheers,

/Rob

From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Dave Burke
Sent: Thursday, September 09, 2010 3:10 PM
To: JOHNSTON, MICHAEL J (MICHAEL J)
Cc: public-xg-htmlspeech@w3.org
Subject: Re: [HTML Speech] speech resource specification requirement

I think selection of the speech engine should be a user-setting in the browser, not a Web developer setting. We had a similar conversation in Geolocation where some folks wanted a URI to point to a specific location server. We rightly removed it. Also, the security bar is much higher for an audio recording solution that can be pointed at an arbitrary destination for obvious reasons.

Dave

On Wed, Sep 8, 2010 at 8:50 PM, JOHNSTON, MICHAEL J (MICHAEL J) <johnston@research.att.com> wrote:

Here is one of the specific requirements we have for adding speech to HTML:

Requirement:

The HTML+Speech standard must allow specification of the speech resource
(e.g. speech recognizer) to be used for processing of the audio
collected from the user. For example, this could be specified
as a URI-valued attribute on the element supporting speech recognition.
When audio is captured from the user, it will then be streamed over HTTP
to the specified URI.
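
A minimal sketch of how a browser might resolve such a URI-valued attribute.
Everything named here is hypothetical: the attribute name "recognizer", the
SpeechElementLike interface, and the fallback to a browser-level default
recognizer are assumptions for illustration only and are not part of any
proposal on this list. The sketch only selects the destination; capturing
and streaming the audio are out of scope.

```typescript
// Hypothetical sketch: no name below comes from an HTML Speech draft.

// Stand-in for whatever element ends up supporting speech recognition.
interface SpeechElementLike {
  getAttribute(name: string): string | null;
}

// Assumed attribute name carrying the speech resource URI.
const RECOGNIZER_ATTR = "recognizer";

// Returns the URI that captured audio would be streamed to.
// Assumption: when the page does not name a recognizer, the browser's
// own default recognizer is used instead.
function resolveRecognizerUri(
  el: SpeechElementLike,
  browserDefault: string
): string {
  const raw = el.getAttribute(RECOGNIZER_ATTR);
  const value = raw === null ? "" : raw.trim();
  if (value === "") {
    return browserDefault;
  }
  // The requirement speaks of streaming over HTTP, so this sketch
  // accepts only absolute http(s) URIs and rejects everything else.
  if (!/^https?:\/\//i.test(value)) {
    throw new Error(`unsupported recognizer URI: ${value}`);
  }
  return value;
}

// Usage with a plain object standing in for a DOM element.
const exampleElement: SpeechElementLike = {
  getAttribute: (name: string) =>
    name === RECOGNIZER_ATTR ? "https://asr.example.com/recognize" : null,
};
const exampleTarget = resolveRecognizerUri(
  exampleElement,
  "https://default.example.org/asr"
);
```

Rejecting non-HTTP schemes up front is one way to keep the set of possible
audio destinations narrow.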

best
Michael



>
> =======================================
> REQUIREMENTS, USE CASES, and PROPOSALS
> =======================================
> I think the best way to begin is to ask right up front for the items we are interested in:  requirements, use cases, and proposals for changes to HTML.
>
> If you have requirements, use cases, or proposals for changes to HTML, please send them in to this list.  When the trickle slows we'll look at what we have and decide on next steps.  For expediency, please plan to send in any such materials by Monday, September 13.

Received on Friday, 10 September 2010 07:19:49 UTC