Re: [HTML Speech] speech resource specification requirement from Satish Sampath on 2010-09-09 (public-xg-htmlspeech@w3.org from September 2010)

From: Satish Sampath <satish@google.com>
Date: Thu, 9 Sep 2010 10:53:05 +0100
To: "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>
Cc: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <AANLkTin=ydqh0=SVbTTo1QN6WFbt1zi6F_ugHXvF0mW_@mail.gmail.com>

> The HTML+Speech standard must allow specification of the speech resource
> (e.g. speech recognizer) to be used for processing of the audio
> collected from the user. For example, this could be specified
> as a URI-valued attribute on the element supporting speech recognition.
> When audio is captured from the user it will then be streamed over HTTP
> to the specified URI.

Specifying the speech recognizer would also require standardising the
protocol between the UA and the recognizer. I like how many of the
existing APIs such as Geolocation
(http://dev.w3.org/geo/api/spec-source.html) are agnostic to which
resource/server is used and let the UA make the choice. That keeps the
spec simple and focused on the web developer.
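
To make the contrast concrete, here is a rough sketch of the two API
shapes. It is only an illustration under my own assumptions: every name
below (AgnosticUserAgent, UriSpecifiedUserAgent, stubRecognizer, the
recognizeSpeech method) is hypothetical and not taken from any spec or
browser, and a real recognizer would be remote and asynchronous rather
than an in-memory object.

```typescript
// Hypothetical sketch: none of these names come from a spec or a browser API.

interface RecognitionResult {
  transcript: string;
  confidence: number; // between 0 and 1
}

// Anything that can turn captured audio into text.
interface Recognizer {
  recognize(audio: Uint8Array): RecognitionResult;
}

// Option A (Geolocation-style): the page only asks for speech input and the
// UA decides which recognizer to use. The page never names a server.
class AgnosticUserAgent {
  private readonly chosen: Recognizer;

  constructor(chosen: Recognizer) {
    this.chosen = chosen;
  }

  recognizeSpeech(audio: Uint8Array): RecognitionResult {
    return this.chosen.recognize(audio);
  }
}

// Option B (page-specified resource): the page names the recognizer by URI.
// The map stands in for "servers that speak the agreed UA-to-recognizer
// protocol"; a URI outside it cannot be used at all.
class UriSpecifiedUserAgent {
  private readonly reachable: Map<string, Recognizer>;

  constructor(reachable: Map<string, Recognizer>) {
    this.reachable = reachable;
  }

  recognizeSpeech(audio: Uint8Array, recognizerUri: string): RecognitionResult {
    const recognizer = this.reachable.get(recognizerUri);
    if (recognizer === undefined) {
      throw new Error(`no recognizer reachable at ${recognizerUri}`);
    }
    return recognizer.recognize(audio);
  }
}

// Stand-in recognizer so the sketch runs without a network or a microphone.
const stubRecognizer: Recognizer = {
  recognize: (audio) => ({
    transcript: `${audio.length} bytes heard`,
    confidence: 1,
  }),
};
```

In option A the page's call carries no recognizer at all, so nothing
about the UA-to-recognizer exchange needs to be standardised. In option B
the URI is part of the page-facing API, so every UA and every recognizer
has to agree on what happens behind recognizeSpeech.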

> A web app might want to process the microphone input data
> somehow before pushing it to the recognizer.
> https://wiki.mozilla.org/Audio_Data_API
....
> If the speech input can be captured as data by the web page, it
> can stream the data using XMLHttpRequest or WebSockets to a server.

These seem more applicable to the <device> specification, which allows
capturing arbitrary audio and processing/streaming it. Giving the
recorded audio to the web app also raises interesting security/privacy
concerns, which are again being addressed in the <device>
specification. I think we should look at speech-related use cases and
requirements here rather than general-purpose audio manipulation.

Cheers
Satish



On Wed, Sep 8, 2010 at 8:50 PM, JOHNSTON, MICHAEL J (MICHAEL J)
<johnston@research.att.com> wrote:
>
> Here is one of the specific requirements we have for adding speech to HTML:
>
> Requirement:
>
> The HTML+Speech standard must allow specification of the speech resource
> (e.g. speech recognizer) to be used for processing of the audio
> collected from the user. For example, this could be specified
> as a URI-valued attribute on the element supporting speech recognition.
> When audio is captured from the user it will then be streamed over HTTP
> to the specified URI.
>
> best
> Michael
>
>
>
>>
>> =======================================
>> REQUIREMENTS, USE CASES, and PROPOSALS
>> =======================================
>> I think the best way to begin is to ask right up front for the items we are interested in:  requirements, use cases, and proposals for changes to HTML.
>>
>> If you have requirements, use cases, or proposals for changes to HTML, please send them in to this list.  When the trickle slows we'll look at what we have and decide on next steps.  For expediency, please plan to send in any such materials by Monday, September 13.
>
>
>
>

Received on Thursday, 9 September 2010 09:53:36 UTC