Re: [HTML Speech] speech resource specification requirement from Chaitanya Gharpure on 2010-09-10 (public-xg-htmlspeech@w3.org from September 2010)

From: Chaitanya Gharpure <chaitanyag@google.com>
Date: Thu, 9 Sep 2010 22:02:46 -0700
To: Dave Burke <daveburke@google.com>
Cc: Robert Brown <Robert.Brown@microsoft.com>, "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <AANLkTi=SQ234DWSTQRwgcUtdwreyucad14GR4ij2vdjF@mail.gmail.com>

While we are talking about the flexibility of having multiple speech engines,
I'd like to bring up a point about the pluggability of text-to-speech engines.
It should be possible to specify a target TTS engine not only via the "URI"
attribute, but also via a more generic "source" attribute, which can point to a
local TTS engine as well. To achieve this, it'd be useful to think about the
extensibility and flexibility of the framework, so that it is easy for third
parties to provide high-quality TTS engines.
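
As a rough illustration only, here is how a generic "source" value might be
resolved to either a remote or a local TTS engine. This is a TypeScript sketch
under stated assumptions: the "local:" prefix, the TtsTarget type and the
resolveTtsSource function are all hypothetical and do not come from any
proposal on this list.

```typescript
// Hypothetical sketch: none of these names come from a spec or a browser API.
type TtsTarget =
  | { kind: "remote"; url: string }      // network TTS service
  | { kind: "local"; engineId: string }  // engine installed on the device
  | { kind: "default" };                 // whatever the user agent provides

// Assumed convention: "local:<engine-id>" names an installed engine,
// http(s) URLs name a network service, and anything else falls back
// to the user agent's default engine.
function resolveTtsSource(source: string | null): TtsTarget {
  if (source === null || source.trim() === "") {
    return { kind: "default" };
  }
  const value = source.trim();
  if (value.startsWith("local:")) {
    const engineId = value.slice("local:".length);
    return engineId ? { kind: "local", engineId } : { kind: "default" };
  }
  if (/^https?:\/\//i.test(value)) {
    return { kind: "remote", url: value };
  }
  return { kind: "default" };
}

// Example values (both the engine id and the URL are made up).
console.log(resolveTtsSource("local:acme-voices"));
console.log(resolveTtsSource("https://tts.example.org/synthesize"));
console.log(resolveTtsSource(null));
```

Falling back to the default engine for anything unrecognized is one possible
design choice; it keeps pages working when a third-party engine is absent.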

On Thu, Sep 9, 2010 at 3:50 PM, Dave Burke <daveburke@google.com> wrote:

> If we are successful, the overwhelming majority of application developers
> will not have their own speech engines. Moreover, most won't know much about
> speech technology and won't be building their own language models... They'll
> just want something that "works" and is easy to develop against.
>
> Dave
>
> On Thu, Sep 9, 2010 at 11:40 PM, Robert Brown <Robert.Brown@microsoft.com> wrote:
>
>> I think the argument makes sense for geo-location.  But that may not be
>> the right analogy here.
>>
>> There are considerable differences between speech engine capabilities.
>> Even those engines that purport to model the same scenarios produce
>> different results.  Not all engines implement all scenarios, new
>> capabilities are being invented all the time, and applications are tuned to
>> the nuances of specific engines.  In addition, some applications will rely
>> on large purpose-built models that are too large to load on demand to
>> arbitrary locations across the Internet, and some may contain IP that the
>> owner wants kept private, with the result in either case being that some
>> apps are tightly coupled to a specific speech service provider.
>>
>> To me, this is an argument for application-selection of services rather
>> than user-selection.  I agree, security and privacy are both concerns.  But
>> they don’t negate the requirement, just add to it.
>>
>> Cheers,
>>
>> /Rob
>>
>> From: public-xg-htmlspeech-request@w3.org
>> [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Dave Burke
>> Sent: Thursday, September 09, 2010 3:10 PM
>> To: JOHNSTON, MICHAEL J (MICHAEL J)
>> Cc: public-xg-htmlspeech@w3.org
>> Subject: Re: [HTML Speech] speech resource specification requirement
>>
>> I think selection of the speech engine should be a user-setting in the
>> browser, not a Web developer setting. We had a similar conversation in
>> Geolocation where some folks wanted a URI to point to a specific location
>> server. We rightly removed it. Also, the security bar is much higher for an
>> audio recording solution that can be pointed at an arbitrary destination for
>> obvious reasons.
>>
>> Dave
>>
>> On Wed, Sep 8, 2010 at 8:50 PM, JOHNSTON, MICHAEL J (MICHAEL J) <
>> johnston@research.att.com> wrote:
>>
>>
>> Here is one of the specific requirements we have for adding speech to
>> HTML:
>>
>> Requirement:
>>
>> The HTML+Speech standard must allow specification of the speech resource
>> (e.g. speech recognizer) to be used for processing of the audio
>> collected from the user. For example, this could be specified
>> as a URI-valued attribute on the element supporting speech recognition.
>> When audio is captured from the user, it will then be streamed over HTTP
>> to the specified URI.
>>
>> best
>> Michael
>>
>>
>>
>> >
>> > =======================================
>> > REQUIREMENTS, USE CASES, and PROPOSALS
>> > =======================================
>> > I think the best way to begin is to ask right up front for the items we
>> > are interested in:  requirements, use cases, and proposals for changes to
>> > HTML.
>> >
>> > If you have requirements, use cases, or proposals for changes to HTML,
>> > please send them in to this list.  When the trickle slows we'll look at what
>> > we have and decide on next steps.  For expediency, please plan to send in
>> > any such materials by Monday, September 13.
>>
>
>

Received on Friday, 10 September 2010 07:19:49 UTC