Re: Aborting speech recognition when web page looses focus from Satish S on 2012-08-16 (public-speech-api@w3.org from August 2012)

From: Satish S <satish@google.com>
Date: Thu, 16 Aug 2012 16:47:59 +0100
To: olli@pettay.fi
Cc: Hans Wennborg <hwennborg@google.com>, public-speech-api@w3.org
Message-ID: <CAHZf7R=0Ve3zig6L56GmqriRQ1QxnXf8GRPXcEZ+hk8XpTaTtw@mail.gmail.com>

Keeping it stricter means the speech reco API can't enable use cases
such as recorders where you take notes while doing some other activity, e.g.
reading a thesis and jotting down points in a web-based recorder app.
Whether (2) is hard to implement is a UA design issue, and there can
be UAs where this is built into the device (e.g. an LED indicator when
recording, like we have in laptop webcams). For these reasons, I don't think
we should mandate aborting speech recognition when losing focus.

In principle, recording audio for speech input should share the same privacy
concerns as getUserMedia in WebRTC, which covers recording audio and video.
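
To make the difference between keeping and dropping point 4 concrete, here is
a rough sketch of the two policies as a decision function. None of these names
(FocusPolicy, SessionState, shouldAbort) come from the spec draft; they are
hypothetical and only model the four considerations listed in the quoted
message below.

```typescript
// Hypothetical model, not part of any spec: which policy the UA applies
// when the page that started recognition loses focus.
type FocusPolicy = "abort-on-blur" | "continue-with-indicator";

// Hypothetical snapshot of what the UA knows about an active session.
interface SessionState {
  consentGiven: boolean;     // point 1: explicit informed user consent
  indicatorVisible: boolean; // point 2: UA clearly indicates recording
  pageHasFocus: boolean;     // only matters under point 4
}

// Returns true when the UA should abort the active speech input session.
function shouldAbort(state: SessionState, policy: FocusPolicy): boolean {
  // Points 1 and 2 apply under either policy.
  if (!state.consentGiven || !state.indicatorVisible) {
    return true;
  }
  // Point 4 applies only under the stricter policy.
  return policy === "abort-on-blur" && !state.pageHasFocus;
}

// A note-taking app left running in a background window.
const background: SessionState = {
  consentGiven: true,
  indicatorVisible: true,
  pageHasFocus: false,
};
const strictResult = shouldAbort(background, "abort-on-blur");
const relaxedResult = shouldAbort(background, "continue-with-indicator");
```

Under this model the note-taking case only survives a focus change with the
relaxed policy, and only while consent and the recording indicator still hold.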

Cheers
Satish


On Thu, Aug 16, 2012 at 4:32 PM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:

> On 08/16/2012 05:57 PM, Hans Wennborg wrote:
>
>> Hi all,
>>
>> The current spec draft lists four security and privacy considerations
>> [1], summarized as:
>>
>> 1. The UA must ask for explicit informed user consent before starting
>> any recording
>> 2. The UA must clearly indicate when it's recording
>> 3. The UA may give a longer explanation the first time speech
>> recognition is used
>> 4. The UA must abort any active speech input session if focus moves
>> away from the web page.
>>
>> Points one and two seem to me to be the critical points to ensure the
>> user's privacy. They also line up nicely with the requirements for
>> accessing a user's microphone or webcam through the GetUserMedia API
>> [2].
>>
>> I propose that we remove the last point. I think it unnecessarily
>> reduces the usefulness of the speech recognition API. For example, a
>> user wouldn't be allowed to use a speech-enabled application in one
>> window, and at the same time interact with another window next to it.
>>
>> What do you think?
>>
>
>
> Well, if 2. can be implemented so that it is always clear which site is
> recording the user's speech, then 4. might not be too important.
> But doing 2. in such a strict way can be very hard from the UI perspective.
>
> So, for the v1 API, it would be better to be more strict and keep 4.
>
> (This is somewhat similar to the pointerlock case, which is now enabled only
> in fullscreen, and browser vendors are trying to figure out a safe way to
> enable it in non-fullscreen mode too.
> IIRC Chrome has some pref to enable it for non-fullscreen, but I'm not
> convinced the approach is safe enough yet.)
>
> -Olli
>
>> Thanks,
>> Hans
>>
>> [1]. http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#security
>> [2]. http://dev.w3.org/2011/webrtc/editor/getusermedia.html
>>
>>
>
>

Received on Thursday, 16 August 2012 15:48:29 UTC