RE: [HTML Speech] speech resource specification requirement from Robert Brown on 2010-09-09 (public-xg-htmlspeech@w3.org from September 2010)

From: Robert Brown <Robert.Brown@microsoft.com>
Date: Thu, 9 Sep 2010 22:40:36 +0000
To: Dave Burke <daveburke@google.com>, "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>
CC: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <113BCF28740AF44989BE7D3F84AE18DD15237BB5@TK5EX14MBXC118.redmond.corp.microsoft.>

I think the argument makes sense for geo-location.  But that may not be the right analogy here.

Speech engines differ considerably in their capabilities.  Even engines that purport to model the same scenarios produce different results.  Not all engines implement all scenarios, new capabilities are being invented all the time, and applications are tuned to the nuances of specific engines.  In addition, some applications rely on large purpose-built models that are too large to load on demand to arbitrary locations across the Internet, and some models contain IP that the owner wants kept private.  In either case, the result is that some apps are tightly coupled to a specific speech service provider.

To me, this is an argument for application-selection of services rather than user-selection.  I agree that security and privacy are both concerns, but they don't negate the requirement; they just add to it.

Cheers,

/Rob

From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Dave Burke
Sent: Thursday, September 09, 2010 3:10 PM
To: JOHNSTON, MICHAEL J (MICHAEL J)
Cc: public-xg-htmlspeech@w3.org
Subject: Re: [HTML Speech] speech resource specification requirement

I think selection of the speech engine should be a user-setting in the browser, not a Web developer setting. We had a similar conversation in Geolocation where some folks wanted a URI to point to a specific location server. We rightly removed it. Also, the security bar is much higher for an audio recording solution that can be pointed at an arbitrary destination for obvious reasons.

Dave

On Wed, Sep 8, 2010 at 8:50 PM, JOHNSTON, MICHAEL J (MICHAEL J) <johnston@research.att.com> wrote:

Here is one of the specific requirements we have for adding speech to HTML:

Requirement:

The HTML+Speech standard must allow specification of the speech resource
(e.g. speech recognizer) to be used for processing of the audio
collected from the user. For example, this could be specified
as a URI-valued attribute on the element supporting speech recognition.
When audio is captured from the user, it will then be streamed over HTTP
to the specified URI.
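
To make the requirement concrete, here is a rough sketch of how a user agent might pick the recognizer URI and split captured audio into chunks for streaming. Nothing in it comes from a real browser or a proposed API: `SpeechInput`, `recognizer_uri`, `resolve_recognizer`, and `chunk_audio` are hypothetical names, the example.com and example.org URLs are placeholders, and the fallback to a browser default and the scheme check are assumptions added only for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Optional
from urllib.parse import urlparse


@dataclass
class SpeechInput:
    """Hypothetical stand-in for an element supporting speech recognition."""
    # Hypothetical URI-valued attribute naming the speech resource.
    recognizer_uri: Optional[str] = None


def resolve_recognizer(element: SpeechInput, browser_default: str) -> str:
    """Return the URI that captured audio would be streamed to.

    Assumption: the page-specified URI wins, otherwise the browser's
    default is used. Only http(s) URIs with a host are accepted.
    """
    uri = element.recognizer_uri or browser_default
    parts = urlparse(uri)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"unsupported speech resource URI: {uri!r}")
    return uri


def chunk_audio(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks of captured audio, suitable for streaming."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


if __name__ == "__main__":
    element = SpeechInput(recognizer_uri="https://asr.example.com/recognize")
    target = resolve_recognizer(element, "https://default.example.org/asr")
    chunks = list(chunk_audio(b"\x00" * 10000))
    print(target, len(chunks))
```

The chunk generator is kept separate from the URI resolution so that the same captured audio could be sent to whichever resource is selected, whether the selection is made by the application or by the user.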

best
Michael

>
> =======================================
> REQUIREMENTS, USE CASES, and PROPOSALS
> =======================================
> I think the best way to begin is to ask right up front for the items we are interested in:  requirements, use cases, and proposals for changes to HTML.
>
> If you have requirements, use cases, or proposals for changes to HTML, please send them in to this list.  When the trickle slows we'll look at what we have and decide on next steps.  For expediency, please plan to send in any such materials by Monday, September 13.

Received on Friday, 10 September 2010 07:19:49 UTC