Re: "Protocol" requirement - Re-recognition from Bjorn Bringert on 2010-12-14 (public-xg-htmlspeech@w3.org from December 2010)

From: Bjorn Bringert <bringert@google.com>
Date: Tue, 14 Dec 2010 20:12:53 +0000
To: "Young, Milan" <Milan.Young@nuance.com>
Cc: Robert Brown <Robert.Brown@microsoft.com>, public-xg-htmlspeech@w3.org
Message-ID: <AANLkTim6UFDx8fS_-OcWwA4QkUAA_a2znmsW4xuVy1v+@mail.gmail.com>

What's wrong with prioritizing based on complexity? In the end we want
to finish some specs, and get browser vendors to ship implementations.
Both of those are made easier by reducing complexity.

Robert's imagined implementation suggestion should be doable with
existing requirements btw, since we have requirements for
implementation-dependent parameters and results.

/Bjorn

On Tue, Dec 14, 2010 at 6:29 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> I suspect even the aggressive folks in the group share your aversion to prioritizing based on complexity.  But it's part of the reality of having a tight deadline.  We want to be careful about putting in requirements if they are going to cause a slip.
>
> Regarding implementation, I think you and I are pretty much aligned.  I think having a session end to mark cleanup is preferred over a 404, but not an important issue in the requirements phase.
>
> Thanks
>
>
> -----Original Message-----
> From: Robert Brown [mailto:Robert.Brown@microsoft.com]
> Sent: Monday, December 13, 2010 5:34 PM
> To: Young, Milan; Bjorn Bringert
> Cc: public-xg-htmlspeech@w3.org
> Subject: RE: "Protocol" requirement - Re-recognition
>
>>>  * Is re-recognition a mainstream feature?  Not sure how we could come to agreement on this one outside a vote.
>
> "Mainstream" is hard to define.  Re-reco is definitely valuable and certainly used in common real-world apps.  But if we didn't define it, a speech service or middle tier could still do it behind the scenes.  So it's not critical.  Just desirable.
>
>>>  * How much additional work in spec and implementation would be required for re-recognition.  I suspect if we can come to agreement on session tracking, re-recognition will look a lot like interpretation.
>
> I'm wary of prioritizing based on complexity.  The first question is more important: do we need this to build real world apps?
>
> FWIW, I can *imagine* a pretty simple solution to this.  E.g., say we use HTTP as a basis: if you want re-reco, put a "rereco" header in the request, with no value.  The response includes a token in the rereco header that can be used in a subsequent request.  The token could be whatever the service wants to return (a key to a lookup table, a URI, whatever).  If the token has timed out, the service returns a 410, or a 404.
>
> -----Original Message-----
> From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Young, Milan
> Sent: Monday, December 13, 2010 11:42 AM
> To: Bjorn Bringert
> Cc: public-xg-htmlspeech@w3.org
> Subject: RE: "Protocol" requirement - Re-recognition
>
> There seem to be two issues at stake here:
>
>  * Is re-recognition a mainstream feature?  Not sure how we could come to agreement on this one outside a vote.
>
>  * How much additional work in spec and implementation would be required for re-recognition.  I suspect if we can come to agreement on session tracking, re-recognition will look a lot like interpretation.
>
> Thanks
>
>
>
> -----Original Message-----
> From: Bjorn Bringert [mailto:bringert@google.com]
> Sent: Monday, December 13, 2010 2:17 AM
> To: Young, Milan
> Cc: public-xg-htmlspeech@w3.org
> Subject: Re: "Protocol" requirement - Re-recognition
>
> While there are use cases for this, I don't think that they are important enough to warrant the increased complexity in managing storage, references, and garbage collection of previously recorded audio. Every feature that we ask browsers to implement makes it harder for them to support our API.
>
> /Bjorn
>
> On Fri, Dec 10, 2010 at 8:03 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>> Summary - Web applications must be able to request recognition based
>> on previously sent audio.
>>
>>
>>
>> Description - It's not always clear which grammars to activate at the
>> start of a dialog.  If the selection was incorrect, the user should not
>> be required to repeat the utterance in order to try a new grammar.  Due
>> to latency and bandwidth considerations, the protocol must not require
>> that audio be resent in order to accomplish this task.
>>
>>
>
>
>
> --
> Bjorn Bringert
> Google UK Limited, Registered Office: Belgrave House, 76 Buckingham Palace Road, London, SW1W 9TQ Registered in England Number: 3977902
>
>
>
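The header-based idea quoted above (an empty "rereco" request header, a token returned in the response, and a 410 or 404 once the token is no longer valid) can be sketched roughly as follows. This is only an illustration under assumptions, not anything the group agreed on: the ReRecoService and Response names, the in-memory dict standing in for an HTTP server, the time-to-live value, and the placeholder recognizer are all hypothetical. The split between 410 (token expired) and 404 (token unknown) is one possible reading of "a 410, or a 404".

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Response:
    """Minimal stand-in for an HTTP response (hypothetical type)."""
    status: int
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


class ReRecoService:
    """Hypothetical service that retains audio only when the client asks."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # token -> (retained audio, expiry time on the injected clock)
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def handle(self, headers: Dict[str, str],
               audio: Optional[bytes] = None, grammar: str = "") -> Response:
        if "rereco" not in headers:
            # Ordinary recognition: nothing is retained.
            if audio is None:
                return Response(400)
            return Response(200, body=self._recognize(audio, grammar))

        token = headers["rereco"]
        if token == "":
            # Empty header value: recognize, keep the audio, hand back a token.
            if audio is None:
                return Response(400)
            token = secrets.token_urlsafe(16)
            self._store[token] = (audio, self._clock() + self._ttl)
            return Response(200, {"rereco": token},
                            self._recognize(audio, grammar))

        # Non-empty value: re-recognize retained audio without a resend.
        entry = self._store.get(token)
        if entry is None:
            return Response(404)          # token never issued or already dropped
        stored_audio, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[token]        # garbage-collect on expiry
            return Response(410)          # token has timed out
        return Response(200, {"rereco": token},
                        self._recognize(stored_audio, grammar))

    @staticmethod
    def _recognize(audio: bytes, grammar: str) -> str:
        # Placeholder for a real recognizer; just echoes its inputs.
        return f"{grammar}:{len(audio)} bytes"
```

A client would send the audio once with an empty "rereco" header, then retry with a different grammar by sending only the returned token. The storage and cleanup that the service takes on here is the complexity cost raised earlier in the thread.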



-- 
Bjorn Bringert
Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
Palace Road, London, SW1W 9TQ
Registered in England Number: 3977902
Received on Tuesday, 14 December 2010 20:13:24 UTC