- From: Dan Burnett <dburnett@voxeo.com>
- Date: Thu, 28 Oct 2010 20:41:09 -0400
- To: Olli Pettay <olli.pettay@gmail.com>
- Cc: Dave Burke <daveburke@google.com>, Bjorn Bringert <bringert@google.com>, Robert Brown <Robert.Brown@microsoft.com>, Michael Bodell <mbodell@microsoft.com>, Deborah Dahl <dahl@conversational-technologies.com>, Satish Sampath <satish@google.com>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
It sounds like we are in violent agreement that this needs to be discussed. We will address it at the face-to-face meeting.

-- dan

On Oct 26, 2010, at 6:18 PM, Olli Pettay wrote:

> On 10/27/2010 01:00 AM, Dave Burke wrote:
>> I think this is a really important topic
> Indeed.
>
>> and worthy of some F2F discussion (appropriately minuted for folks who can't attend, of course).
>
> agree
>
>> I no more want evil.com <http://evil.com> starting to listen to audio in my environment by itself
> To me R29 is quite close to what geolocation is.
> The sites which want to use ASR, need to get my permission to do so.
> Currently geolocation uses usually non-modal dialogs, IIRC.
>
>> than I would want random site y to be able to script a file <input> element to read my local files. And popping up permission dialogs (modal or otherwise) doesn't scale either as eventually surfing the Web would become an exercise in popup spam.
> Yes, this is a problem when we're starting to get more and more dialogs.
> I guess many people just ignore non-modal notificationbar-style dialogs, so the feature is disabled for them.
> Modal dialogs are just annoying and people press whatever they need to to get rid of them.
>
> I wonder if DAP WG will fix the access control for us
> http://dev.w3.org/2009/dap/policy-reqs/
>
> -Olli
>
>>
>> If a user-agent wants to provide some kind of override (e.g. download webapp/extension with grouped permission models built into some kind of application installation metaphor), sure. But let's not break safe Web browsing for the majority of users by increasing the surface area of attack or annoyance.
>>
>> Dave
>>
>> On Mon, Oct 25, 2010 at 5:57 PM, Bjorn Bringert <bringert@google.com> wrote:
>>
>> I don't think that requiring user action necessarily rules out hands-free usage. The user action itself could be spoken input. The key is to not give the *web app* access to recognition results (or audio) without user action. We should perhaps reword the requirement to reflect that.
>>
>> /Bjorn
>>
>> On Fri, Oct 22, 2010 at 8:54 PM, Robert Brown <Robert.Brown@microsoft.com> wrote:
>> > I don't have data, but I suspect that presenting a list of choices, some of which are sticky (like Michael lists below) may also mitigate the tendency for users to blindly click-through. (That's been my personal experience with some of the popup-blocker UIs I've used, and there may be data for those)
>> >
>> > An earcon may also be an appropriate indicator that the system is listening, especially on smart phones.
>> >
>> > -----Original Message-----
>> > From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Michael Bodell
>> > Sent: Friday, October 22, 2010 12:29 PM
>> > To: Deborah Dahl; 'Satish Sampath'
>> > Cc: 'Bjorn Bringert'; 'Dan Burnett'; public-xg-htmlspeech@w3.org
>> > Subject: RE: R29. Web application may only listen in response to user action
>> >
>> > I agree that this requirement is problematic for hands-free and other usage scenarios. IMO explicit user action should not be required before speech recognition occurs on each and every page load and/or speech recognition. The privacy and security concerns that I think we all share is that speech recognition should not happen without user consent (in general) and should not happen without the user being aware that the speech recognition is happening (in this particular instance). Requirements along those lines are the "what" requirement that we must follow to support user privacy and security concerns.
>> > But those two requirements do *NOT* mean that the "web application may only listen in response to user action" which is a "how" requirement (I.e., how we protect the user). IMO it may be the case that recognition occurs as a result of any of (not an exhaustive list):
>> >
>> > - page load (not covered by this requirement)
>> > - focus event driven by explicit user action (covered by this requirement)
>> > - focus event driven by natural page flow (not covered by this requirement)
>> > - scripting by the application author (not covered by this requirement)
>> >
>> > This requirement is too restrictive for these and other use cases.
>> >
>> > A different "how" might be that the user agent, for instance, prompts the user when a page wants to do speech for the first time and gives them a set of choices such as, for example:
>> >
>> > - Always allow any page to do speech without prompting
>> > - Always allow any page on this domain to do speech without prompting
>> > - Allow just this one page this session to do speech (and prompt in the future)
>> > - Don't allow this page this session to do speech (and prompt in the future)
>> > - Don't ever allow any page on this domain to do speech
>> > - Don't ever allow any page ever to do speech
>> >
>> > If the user chooses one of the "always allow" then in the future the web application would be able to listen to the user definitely without any user action. Or it could be that the user agent has the equivalent of these prompts instead in some configuration settings depending on the security/privacy settings of the user.
>> >
>> > Maybe people feel that a user agent configuration setting is sufficient "user action" to count in this requirement and all of the use cases above would meet this requirement, but that isn't how I interpreted this requirement and I think if that is the case we should reword it as the requirement implies there needs to be explicit user action each time recognition occurs.
>> >
>> > As for the user being aware recognition is happening a different "how" approach is more reasonable IMO: the chrome of the user agent can provide clues. There is already a well-established pattern that many user agents provide visual clues when certain things occur, for example:
>> >
>> > - a spinning browser icon when content is being loaded in the background
>> > - some sort of secure lock image when the page is loaded over a secure channel
>> >
>> > So a similar idea could occur when the user is being recorded or when there is speech recognition going on. Something like a microphone icon or a red light or some sort of clue *in the user agent's chrome* that doesn't interfere with the visual display of the web application. You wouldn't want this indication to occur in the visual display of the web application itself (I.e., a microphone icon in the input field) because different web applications may want different user interface options and also because anything like that in the visual display of the web application could be spoofed by the web application and isn't as trusted as icons, images, different color/highlighted text in the user agent's chrome.
>> >
>> > -----Original Message-----
>> > From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Deborah Dahl
>> > Sent: Friday, October 22, 2010 8:08 AM
>> > To: 'Satish Sampath'
>> > Cc: 'Bjorn Bringert'; 'Dan Burnett'; public-xg-htmlspeech@w3.org
>> > Subject: RE: R29. Web application may only listen in response to user action
>> >
>> > Yes, you could do that, but then the application wouldn't be hands-free.
>> > Now probably isn't the time to start talking about approaches that would enable us to address both requirements, I'm just pointing out that we should be aware of a potential conflict. I think we should actually classify both requirements as "should address", but note that there's an issue in our requirements document.
>> >
>> >> -----Original Message-----
>> >> From: Satish Sampath [mailto:satish@google.com]
>> >> Sent: Friday, October 22, 2010 9:43 AM
>> >> To: Deborah Dahl
>> >> Cc: Bjorn Bringert; Dan Burnett; public-xg-htmlspeech@w3.org
>> >> Subject: Re: R29. Web application may only listen in response to user action
>> >>
>> >> One possibility for R24 is that the end user performs an action on page load and from then on using continuous speech input they can interact with the application in a hands-free mode. This could be a click on a button or some other accessibility-friendly gesture.
>> >>
>> >> Cheers
>> >> Satish
>> >>
>> >> On Fri, Oct 22, 2010 at 2:39 PM, Deborah Dahl <dahl@conversational-technologies.com> wrote:
>> >>
>> >> I see a possible conflict between requiring user action to enable speech recognition and R24. "End user should be able to use speech in a hands-free mode" if "user action" means doing something that requires use of the hands. I think both requirements are important but satisfying them both might require some thought.
>> >>
>> >> From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Satish Sampath
>> >> Sent: Friday, October 22, 2010 7:24 AM
>> >> To: Bjorn Bringert
>> >> Cc: Dan Burnett; public-xg-htmlspeech@w3.org
>> >> Subject: Re: R29. Web application may only listen in response to user action
>> >>
>> >> User experience studies have also shown that end users have got used to clicking away any popup dialogs that come up when they are browsing the web.. common ones include phishing/malware warnings, download notifications etc. This is one of the reasons why browser vendors are moving towards in-page notifications for some of these where applicable, and requiring explicit user action for others. So I think this is a good requirement to have.
>> >>
>> >> The other side of this is that the web page should not be allowed to automatically initiate speech input/audio capture via an API call.
>> >>
>> >> Cheers
>> >> Satish
>> >>
>> >> On Fri, Oct 22, 2010 at 12:18 PM, Bjorn Bringert <bringert@google.com> wrote:
>> >> This requirement was motivated by privacy concerns. If the web application can start speech recognition at any time, it can eavesdrop on a user.
>> >>
>> >> An alternative to requiring user action would be to have a permission dialog of some kind.
>> >> As far as I understand, browser implementors would not like a proliferation of permission dialogs annoying their users.
>> >>
>> >> /Bjorn
>> >>
>> >> On Fri, Oct 22, 2010 at 1:06 AM, Dan Burnett <dburnett@voxeo.com> wrote:
>> >> > Group,
>> >> >
>> >> > This is the first of the requirements to discuss and prioritize based on our ranking approach [1].
>> >> >
>> >> > This email is the beginning of a thread for questions, discussion, and opinions regarding our first draft of Requirement 29 [2].
>> >> >
>> >> > After our discussion and any modifications to the requirement, our goal is to prioritize this requirement as either "Should Address" or "For Future Consideration".
>> >> >
>> >> > -- dan
>> >> >
>> >> > [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Oct/0024.html
>> >> > [2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Oct/att-0001/speech.html#r29
>> >>
>> >> --
>> >> Bjorn Bringert
>> >> Google UK Limited, Registered Office: Belgrave House, 76 Buckingham Palace Road, London, SW1W 9TQ
>> >> Registered in England Number: 3977902
>>
>> --
>> Bjorn Bringert
>> Google UK Limited, Registered Office: Belgrave House, 76 Buckingham Palace Road, London, SW1W 9TQ
>> Registered in England Number: 3977902
Received on Friday, 29 October 2010 00:41:56 UTC