About speech request initiation and reco element etc from Olli Pettay on 2011-07-04 (public-xg-htmlspeech@w3.org from July 2011)

From: Olli Pettay <Olli.Pettay@helsinki.fi>
Date: Mon, 04 Jul 2011 13:40:39 +0300
To: public-xg-htmlspeech@w3.org
CC: Bjorn Bringert <bringert@google.com>
Message-ID: <4E1198A7.20708@helsinki.fi>

Hi all,

(I started to write this when I thought I could have some reasonable
compromise between the privacy issues and the usability that Google
wants. But I ended up with just more issues :/ But I'm sending this
anyway.)

So far it hasn't become clear to me why we need the <reco> element,
or special UI in <input> (like in current Chrome).
Because of the click-jacking problem, the speech UI doesn't give us any
better security or privacy handling than using pure scripting.
Also, I'm pretty sure web devs want to be able to have their own UI anyway.

So, for most cases Speech.getRequest()/getRequestFor() approach should 
work just fine.
The problematic case is the Google Translate example.
(IMHO, it should ask permission from the user before enabling the
speech UI, similar to Google Maps. How is, for example, gender
recognition less privacy-related than location?)

But perhaps for the default speech service, or for other speech services
to which the user *has* somehow *granted* permissions, permission
management could be more flexible. What if, while handling user
interaction - say a trusted click event - the implementation could
immediately call the successcallback passed to Speech.getRequest()? The
implementation should still show UI indicating that recognition is on,
and that UI should have some way to abort the recognition without giving
any data to the web page.
Also, a user who is concerned about privacy would never grant any
automatic permissions to speech services, and would always have to give
the permission the first time a page tries to use speech services after
(re-)loading.
Effectively, in the Chrome case, this might mean that at some point the
browser would ask permission to use the default speech service, and
after that any click on a web page could start recognition.
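
A minimal sketch of the flow described above, written as a standalone
model rather than real browser code. Everything here is an assumption
made for illustration: the UserAgent and SpeechRequest shapes, the
autoGranted set, the promptUser stand-in for a permission dialog, and
the signature of getRequest, which only loosely mirrors the proposed
Speech.getRequest().

```typescript
// Hypothetical model of the flow above. None of these names are a real
// browser API; getRequest only loosely mirrors Speech.getRequest().
type Decision = "granted" | "denied";

interface SpeechRequest {
  service: string;
  aborted: boolean;
  abort(): void; // user-facing abort: stops recognition, page gets no data
}

interface UserAgent {
  autoGranted: Set<string>;              // services the user pre-approved
  promptUser(service: string): Decision; // stands in for a permission prompt
  indicatorVisible: boolean;             // the "recognition is on" UI
}

function getRequest(
  ua: UserAgent,
  service: string,
  isTrustedEvent: boolean,
  successCallback: (req: SpeechRequest) => void,
  errorCallback: (reason: string) => void,
): void {
  // Fast path: trusted user interaction plus an earlier automatic grant.
  const fastPath = isTrustedEvent && ua.autoGranted.has(service);
  if (!fastPath && ua.promptUser(service) !== "granted") {
    errorCallback("permission denied");
    return;
  }
  ua.indicatorVisible = true; // always show that recognition is running
  const req: SpeechRequest = {
    service,
    aborted: false,
    abort: () => {
      req.aborted = true;
      ua.indicatorVisible = false;
    },
  };
  successCallback(req);
}
```

In this model a user who never adds anything to autoGranted is prompted
on every call, while a user who has pre-approved the default service
gets recognition started by any trusted click, which is exactly the
case that looks scary.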

Hmmm... this is still pretty scary. And even wrong. We're dealing with
several different permissions. At least a) is it ok to send the user's
speech data to service X, b) is it ok that web app Y uses speech
services, c) is it ok that web app Y uses service X?


a) allows service X to do at least gender recognition, so there is a
clear leak of private data to X.

b) is close to the issues related to the current implementation in
Chrome. Is it ok that whenever the user clicks something in a page (any
web page!), the page may get some recognition results?

c) if I need to give my social security number to web site Y, is
it ok to use speech service X to recognize the number?
Usually it may be ok for the user to give some data to service X, but
perhaps an SSN is not such data.
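
One way to picture a), b) and c) as separate checks is the small sketch
below. It is only an illustration under assumed names: PermissionStore,
its three fields, the "Y|X" pair encoding and missingPermissions are all
invented here and are not part of any proposal on this list.

```typescript
// Hypothetical permission store; field names are invented for illustration.
interface PermissionStore {
  serviceMayReceiveSpeech: Set<string>; // a) user's speech may go to service X
  appMayUseSpeech: Set<string>;         // b) web app Y may use speech at all
  appMayUseService: Set<string>;        // c) "Y|X" pairs: app Y may use service X
}

// Returns the letters of the permissions still missing for (app, service).
function missingPermissions(
  store: PermissionStore,
  app: string,
  service: string,
): string[] {
  const missing: string[] = [];
  if (!store.serviceMayReceiveSpeech.has(service)) missing.push("a");
  if (!store.appMayUseSpeech.has(app)) missing.push("b");
  if (!store.appMayUseService.has(`${app}|${service}`)) missing.push("c");
  return missing;
}

// Example: the user trusts service X in general and lets the bank page use
// speech, but has not approved sending what is said on the bank page to X.
const store: PermissionStore = {
  serviceMayReceiveSpeech: new Set<string>(["service-x"]),
  appMayUseSpeech: new Set<string>(["bank.example"]),
  appMayUseService: new Set<string>(),
};
const missing = missingPermissions(store, "bank.example", "service-x");
const allowed = missing.length === 0; // recognition proceeds only if nothing is missing
```

The point of keeping the three sets apart is that a single "speech is
allowed" flag cannot express the SSN case: a) and b) can both hold while
c) does not.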


...so, my attempt to come up with a solution for privacy handling which
would be ok for Google hasn't yet succeeded.


(It is not quite clear to me why the privacy handling of the capturing
API or the Geolocation API is ok for Google, but for speech handling
something else is needed.)


-Olli

Received on Monday, 4 July 2011 10:41:17 UTC