- From: Olli Pettay <Olli.Pettay@helsinki.fi>
- Date: Tue, 05 Jul 2011 14:30:38 +0300
- To: Satish S <satish@google.com>
- CC: public-xg-htmlspeech@w3.org, Bjorn Bringert <bringert@google.com>
On 07/05/2011 02:16 PM, Satish S wrote:
> Agreed on the "opacity:0" and dropdown menu points, in both these cases
> it is the secondary UI (drop down or popup) which acts as a defense
> against clickjacking. But it is clear to the user that these actions
> happen when they interact with the page as opposed to a set of infobars
> popping up as soon as the page loads.

An infobar doesn't pop up as soon as the page loads. It pops up when the
feature is needed. But I agree that some kind of context-dependent,
automatically opened menu over the page would be like a better version
of the infobar. Both have the same functionality, but the menu could pop
up close to the place where the permission is required.

> (Another wild idea is that the UA could even have the <reco> element
> appear in the window chrome and not within the page.. in this case the
> <reco> markup just acts as an indicator to the UA that the page allows
> user initiated speech input. anyway, that's just an example of the things
> a UA could experiment)
>
> I see a strong and coherent story with a markup element as it provides
> both user initiated & webpage initiated models,

So does the JS API.

> allows all JS APIs to
> hang off it in a clean fashion (with precedent being HTML5 audio and
> other elements) and UAs can use it to come up with more robust and
> secure models for such sensitive user information. It does not take
> anything away from the JS API but merely adds to it.

As I said, "Note, I'm not against <reco>, if we can find a reasonable
security model when it is used."
So if Google is ok with the user having to give explicit permission to
the page before speech recognition is activated, then this one problem
is solved :)

-Olli

> Cheers
> Satish
>
> On Tue, Jul 5, 2011 at 11:31 AM, Olli Pettay <Olli.Pettay@helsinki.fi>
> wrote:
>
>     On 07/05/2011 01:26 PM, Olli Pettay wrote:
>
>         On 07/05/2011 01:06 PM, Satish S wrote:
>
>             Hi Olli,
>
>             Here are the reasons I feel we should use a markup
>             element for recognition:
>
>             1. Even though click jacking is a problem, the UAs are
>             in control of the element's presentation and can
>             implement it in a secure fashion. The file input dialog
>             tackles this with an additional popup window
>
>         The additional popup window has nothing to do with the
>         <input type="file"> presentation on the page. You can easily
>         just add style="opacity: 0" and the presentation is hidden.
>
>             and for speech input UAs may tackle it in different
>             ways. For example:
>             * instead of a simple button which starts recording it
>             could open a dropdown menu from which the user selects
>             an option (e.g. "start speaking", "select language",
>             "enable hotkey" and so on).
>
>         This sounds already better, since this requires explicit
>         permission from the user before the recognition is started.
>         But it still doesn't require any explicit element in the DOM
>         tree. The dropdown menu approach would work with or without
>         <reco>.
>
>             * render as a simple button but on top of everything
>             else, so click jacking is impossible
>
>         This would be very strange. How would you define such a
>         button which doesn't follow the CSS rules, when everything
>         else in the page is styled based on CSS? Especially when the
>         button is in an iframe, and the main page paints something
>         over the iframe. The iframe would have some way to paint
>         over its parent?
>
>             * a naive implementation could also just bring up an
>             infobar similar to what the JS API would do.
>             But the key thing is that UAs can find what interface
>             works best for them. And for trusted sites (e.g. those
>             which the user or domain administrator has white listed)
>             it could skip all of the above and start reco on click.
>
>             2. A markup element allows all the JS APIs to hang off
>             it. This is similar to how HTML5 does with the <audio>
>             tag and web sites that want to play audio without a UI
>             just create the <audio> tag in javascript and call
>             methods on it. For speech input if we have a <reco>
>             element then the recognition JS API could all be methods
>             of this element and it presents a consistent picture to
>             developers.
>
>         I can see some, though quite weak, use cases for <reco>, for
>         example "(un)mute microphone". Quite often such things are
>         done on the OS level. And the microphone level could be
>         shown in the browser chrome.
>
>         Note, I'm not against <reco>, if we can find a reasonable
>         security model when it is used.
>         Perhaps the dropdown menu could work well enough.
>         On mobile devices the UI could be different -
>         a push-to-talk approach might work there.
>         In both cases the user would give explicit permission to the
>         web page to start the recognition.
>
>     Of course, the dropdown menu is effectively just a slightly
>     different UI for the common infobar.
>
>         -Olli
>
>             Cheers
>             Satish
>
>             On Mon, Jul 4, 2011 at 11:40 AM, Olli Pettay
>             <Olli.Pettay@helsinki.fi> wrote:
>
>                 Hi all,
>
>                 (I started to write this when I thought I could have
>                 some reasonable compromise between the privacy
>                 issues and the usability that Google wants. But I
>                 ended up with just more issues :/ But I'm sending
>                 this anyway.)
>
>                 So far it hasn't become clear to me why we need the
>                 <reco> element, or special UI in <input> (like in
>                 current Chrome).
>                 Because of the click-jacking problem, the speech UI
>                 doesn't give us any better security or privacy
>                 handling than using pure scripting.
>                 Also, I'm pretty sure web devs want to be able to
>                 have their own UI anyway.
>
>                 So, for most cases the
>                 Speech.getRequest()/getRequestFor() approach should
>                 work just fine.
>                 The problematic case is the Google Translate
>                 example. (IMHO, it should ask permission from the
>                 user before enabling speech UI, similar to Google
>                 Maps. How is, for example, gender recognition less
>                 privacy related than location?)
>
>                 But perhaps for the default speech service, or other
>                 speech services to which the user *has* somehow
>                 *granted* permissions, permission management could
>                 be more flexible. What if, while handling user
>                 interaction - say a trusted click event - the
>                 implementation could immediately call the
>                 successcallback passed to Speech.getRequest()?
>                 The implementation should still show the UI that
>                 recognition is on, and the UI should have some way
>                 to abort the recognition without giving any data to
>                 the web page.
>                 Also, if the user is concerned about privacy, (s)he
>                 would never grant any automatic permissions to
>                 speech services, and would always have to give the
>                 permission the first time a page tries to use speech
>                 services after (re-)loading.
>                 Effectively, in the Chrome case this might mean that
>                 at some point the browser would ask permission to
>                 use the default speech service, and after that any
>                 click on a web page could start recognition.
>
>                 Hmmm... this is still pretty scary. And even wrong.
>                 We're dealing with several different permissions. At
>                 least a) is it ok to send the user's speech data to
>                 service X, b) is it ok that web app Y uses speech
>                 services, c) is it ok that web app Y uses service X.
>
>                 a) allows service X to do at least gender
>                 recognition, so there is a clear privacy data leak
>                 to X.
>
>                 b) is close to the issues related to the current
>                 implementation in Chrome.
>                 Is it ok that whenever the user clicks something in
>                 a page (any web page!), the page may get some
>                 recognition results?
>
>                 c) if I need to give my social security number to
>                 web site Y, is it ok to use speech service X to
>                 recognize the number?
>                 Usually it may be ok for the user to give some data
>                 to service X, but perhaps an SSN is not such data.
>
>                 ...so, my attempt to come up with a solution for
>                 privacy handling which would be ok to Google hasn't
>                 yet succeeded.
>
>                 (It is not quite clear to me why the privacy
>                 handling of the capturing API or Geolocation API is
>                 ok to Google, but for speech handling something else
>                 is needed.)
>
>                 -Olli
Received on Tuesday, 5 July 2011 11:31:18 UTC