- From: Satish Sampath <satish@google.com>
- Date: Thu, 9 Sep 2010 10:53:05 +0100
- To: "JOHNSTON, MICHAEL J (MICHAEL J)" <johnston@research.att.com>
- Cc: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
> The HTML+Speech standard must allow specification of the speech resource
> (e.g. speech recognizer) to be used for processing of the audio
> collected from the user. For example, this could be specified
> as URI valued attribute on the element supporting speech recognition.
> When audio is captured from the user it will then be streamed over http
> to the specified URI.

Specifying the speech recognizer would also require standardising the
protocol between the UA and the recognizer. I like how many of the existing
APIs, such as Geolocation (http://dev.w3.org/geo/api/spec-source.html), are
agnostic to which resource/server is used and let the UA make the choice.
That keeps the spec simple and focused on the web developer.

> Web app might want to process the microphone input data
> somehow before pushing it to recognizer.
> https://wiki.mozilla.org/Audio_Data_API ....
> If the speech input can be captured as data by the web page, it
> can stream the data using XMLHttpRequest or WebSockets to server.

These seem more applicable to the <device> specification, which allows
capturing arbitrary audio and processing/streaming it. It also brings up
interesting security/privacy concerns if the recorded audio is given to the
web app, which is again being addressed in the <device> specification. I
think we should look at speech-related use cases and requirements here
rather than general-purpose audio manipulation.

Cheers
Satish

On Wed, Sep 8, 2010 at 8:50 PM, JOHNSTON, MICHAEL J (MICHAEL J) <johnston@research.att.com> wrote:
>
> Here is one of the specific requirements we have for adding speech to HTML:
>
> Requirement:
>
> The HTML+Speech standard must allow specification of the speech resource
> (e.g. speech recognizer) to be used for processing of the audio
> collected from the user. For example, this could be specified
> as URI valued attribute on the element supporting speech recognition.
> When audio is captured from the user it will then be streamed over http
> to the specified URI.
>
> best
> Michael
>
>>
>> =======================================
>> REQUIREMENTS, USE CASES, and PROPOSALS
>> =======================================
>> I think the best way to begin is to ask right up front for the items we are interested in: requirements, use cases, and proposals for changes to HTML.
>>
>> If you have requirements, use cases, or proposals for changes to HTML, please send them in to this list. When the trickle slows we'll look at what we have and decide on next steps. For expediency, please plan to send in any such materials by Monday, September 13.
Received on Thursday, 9 September 2010 09:53:36 UTC