Re: Aborting speech recognition when web page looses focus from Olli Pettay on 2012-08-16 (public-speech-api@w3.org from August 2012)

From: Olli Pettay <Olli.Pettay@helsinki.fi>
Date: Thu, 16 Aug 2012 18:53:29 +0300
To: Satish S <satish@google.com>
CC: Hans Wennborg <hwennborg@google.com>, public-speech-api@w3.org
Message-ID: <502D1779.2040205@helsinki.fi>

On 08/16/2012 06:47 PM, Satish S wrote:
> Keeping it more strict means the speech reco API can't enable use cases such as recorders where you take notes while doing some activity, e.g.
> reading a thesis and jotting down points in a web based recorder app. The idea that (2) can be hard to implement is a UA design issue and there can be
> UAs where this is built into the device (e.g. an LED indicator when recording like we have in laptop webcams). I don't think we should mandate
> aborting speech recognition when losing focus for these reasons.
>
> In principle recording audio for speech input should share the same privacy concerns as getUserMedia in webrtc which covers recording audio and video.



That is true, there should be the same handling as with WebRTC.
I'm not sure security and privacy have been handled yet in WebRTC.
(The API hasn't been reviewed too well, AFAIK.)

>
> Cheers
> Satish
>
>
> On Thu, Aug 16, 2012 at 4:32 PM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:
>
>     On 08/16/2012 05:57 PM, Hans Wennborg wrote:
>
>         Hi all,
>
>         The current spec draft lists four security and privacy considerations
>         [1], summarized as:
>
>         1. The UA must ask for explicit informed user consent before starting
>         any recording
>         2. The UA must clearly indicate when it's recording
>         3. The UA may give a longer explanation the first time speech
>         recognition is used
>         4. The UA must abort any active speech input session if focus moves
>         away from the web page.
>
>         Points one and two seem to me to be the critical points to ensure the
>         user's privacy. They also line up nicely with the requirements for
>         accessing a user's microphone or webcam through the GetUserMedia API
>         [2].
>
>         I propose that we remove the last point. I think it unnecessarily
>         reduces the usefulness of the speech recognition API. For example, a
>         user wouldn't be allowed to use a speech-enabled application in one
>         window, and at the same time interact with another window next to it.
>
>         What do you think?
>
>
>
>     Well, if 2. can be implemented so that it is always clear which site is recording
>     the user's speech, then 4. might not be too important.
>     But doing 2. in such a strict way can be very hard from the UI perspective.
>
>     So, as for the v1 API, it would be better to be more strict and keep 4.
>
>     (This is a somewhat similar case to pointerlock, which is now enabled only in fullscreen, and
>     browser vendors are trying to figure out a safe way to enable it in non-fullscreen mode too.
>     IIRC Chrome has some pref to enable it for non-fullscreen, but I'm not convinced the approach is safe enough yet.)
>
>
>
>     -Olli
>
>
>
>
>
>         Thanks,
>         Hans
>
>         [1]. http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#security
>         [2]. http://dev.w3.org/2011/webrtc/editor/getusermedia.html
>
>
>
>

Received on Thursday, 16 August 2012 15:53:59 UTC