Re: R28. Web application must not be allowed access to raw audio. from Bjorn Bringert on 2010-11-12 (public-xg-htmlspeech@w3.org from November 2010)

From: Bjorn Bringert <bringert@google.com>
Date: Fri, 12 Nov 2010 21:10:56 +0000
To: Michael Bodell <mbodell@microsoft.com>
Cc: Dan Burnett <dburnett@voxeo.com>, "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <AANLkTimD1cdbnYbRJ1AWgrJ5fvZ8n18GkcHmoiBbrMWQ@mail.gmail.com>

That seems like a fair use case. I think the original idea behind the
requirement was to prevent eavesdropping, but assuming that the speech
recognizer is good, you don't really need raw audio for that. I'm OK
with dropping this requirement altogether.

/Bjorn

On Fri, Nov 12, 2010 at 9:04 PM, Michael Bodell <mbodell@microsoft.com> wrote:
> Hmm, I'm not sure I agree. Even when using the default recognition service, I can easily imagine scenarios that would require access to the audio via a record-and-recognize interaction. For a simple example, consider a web application that allows a user to issue commands and/or dictate an SMS message and/or leave a voicemail. You may want to do all of that in one user interaction. Or, as is often the case, you want to recognize speech and, if recognition fails, still have the audio. For example, you might speak a message that will be sent as text in an SMS, but if the system fails to recognize your message very well, you may opt to send it as an audio voicemail message instead. In this case you want the recognition result (to check with the user whether it was OK, and if so, use it), but you also want the recording in case recognition didn't work.
>
> What is the motivation that we are trying to capture, and could we phrase or address it without disallowing a user agent from using a more robust default speech service that allows a record-and-recognize scenario?
>
> -----Original Message-----
> From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Bjorn Bringert
> Sent: Friday, November 12, 2010 1:22 AM
> To: Dan Burnett
> Cc: public-xg-htmlspeech@w3.org
> Subject: Re: R28. Web application must not be allowed access to raw audio.
>
> Since we already have a requirement that web apps should be allowed to specify their own speech recognizer, this requirement seems impossible to enforce. I think that we should restrict it to:
>
> "When using the user agent's default speech recognizer, web applications must not be given access to the captured audio."
>
> Together with "FPR10. If browser uses speech services other than the default one, it must inform the user which one(s) it is using.", this means that captured audio will never be sent to a web-app-specified location without the user being informed.
>
> /Bjorn
>
> On Fri, Nov 12, 2010 at 5:40 AM, Dan Burnett <dburnett@voxeo.com> wrote:
>> Group,
>>
>> This is the next of the requirements to discuss and prioritize based
>> on our ranking approach [1].
>>
>> This email is the beginning of a thread for questions, discussion, and
>> opinions regarding our first draft of Requirement 28 [2].
>>
>> Please discuss via email as we agreed at the Lyon f2f meeting.
>> Outstanding points of contention will be discussed live at an upcoming teleconference.
>>
>> -- dan
>>
>> [1] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Oct/0024.html
>> [2] http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Oct/att-0001/speech.html#r28
>>
>>



-- 
Bjorn Bringert
Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
Palace Road, London, SW1W 9TQ
Registered in England Number: 3977902

Received on Friday, 12 November 2010 21:11:26 UTC