RE: "Protocol" requirement - Re-recognition from Robert Brown on 2010-12-14 (public-xg-htmlspeech@w3.org from December 2010)

From: Robert Brown <Robert.Brown@microsoft.com>
Date: Tue, 14 Dec 2010 01:33:56 +0000
To: "Young, Milan" <Milan.Young@nuance.com>, Bjorn Bringert <bringert@google.com>
CC: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
Message-ID: <113BCF28740AF44989BE7D3F84AE18DD197E5693@TK5EX14MBXC114.redmond.corp.microsoft.>

>>  * Is re-recognition a mainstream feature?  Not sure how we could come to agreement on this one outside a vote.

"Mainstream" is hard to define.  Re-reco is definitely valuable and certainly used in common real-world apps.  But if we didn't define it, a speech service or middle tier could still do it behind the scenes.  So it's not critical.  Just desirable.

>>  * How much additional work in spec and implementation would be required for re-recognition?  I suspect if we can come to agreement on session tracking, re-recognition will look a lot like interpretation.

I'm wary of prioritizing based on complexity.  The first question is more important: do we need this to build real-world apps?

FWIW, I can *imagine* a pretty simple solution to this.  For example, say we use HTTP as a basis: if you want re-reco, put a "rereco" header in the request, with no value.  The response includes a token in the rereco header that can be used in a subsequent request.  The token could be whatever the service wants to return (a key to a lookup table, a URI, whatever).  If the token has timed out, the service returns a 410 or a 404.
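
To make the idea concrete, here is a rough sketch of the service side, simulated in memory rather than over a real HTTP stack.  Everything in it beyond the "rereco" header and the 410/404 responses is an assumption made for illustration: the class name RerecoService, the recognize() placeholder, the 60-second lifetime, and the choice of 410 for an expired token versus 404 for an unknown one.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 60.0  # assumed lifetime; the service would pick its own


def recognize(audio: bytes, grammar: str) -> str:
    """Placeholder for the real recognizer (hypothetical)."""
    return f"{grammar}:{len(audio)} bytes"


class RerecoService:
    """In-memory stand-in for a speech service that supports re-reco."""

    def __init__(self, ttl=TOKEN_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._audio = {}  # token -> (expiry time, retained audio)

    def handle(self, headers, body=b"", grammar=""):
        """Return (status, response_headers, result) for one request."""
        token = headers.get("rereco")
        response_headers = {}
        if token is None:
            # No rereco header: ordinary recognition, nothing is retained.
            audio = body
        elif token == "":
            # Header present with no value: retain the audio, hand back a token.
            audio = body
            new_token = secrets.token_urlsafe(16)
            self._audio[new_token] = (self._clock() + self._ttl, audio)
            response_headers["rereco"] = new_token
        else:
            # Header carries a token: recognize against the retained audio.
            entry = self._audio.get(token)
            if entry is None:
                return 404, {}, None  # token was never issued (or already purged)
            expiry, audio = entry
            if self._clock() >= expiry:
                del self._audio[token]
                return 410, {}, None  # token has timed out
            response_headers["rereco"] = token
        return 200, response_headers, recognize(audio, grammar)
```

A client would send the audio once with an empty rereco header, keep the returned token, and then send only the token plus a different grammar if the first grammar turned out to be the wrong choice.  Garbage collection here is just the expiry check, which is the storage cost the service would have to bear.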

-----Original Message-----
From: public-xg-htmlspeech-request@w3.org [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Young, Milan
Sent: Monday, December 13, 2010 11:42 AM
To: Bjorn Bringert
Cc: public-xg-htmlspeech@w3.org
Subject: RE: "Protocol" requirement - Re-recognition

There seem to be two issues at stake here:

  * Is re-recognition a mainstream feature?  Not sure how we could come to agreement on this one outside a vote.

  * How much additional work in spec and implementation would be required for re-recognition?  I suspect if we can come to agreement on session tracking, re-recognition will look a lot like interpretation.

Thanks

-----Original Message-----
From: Bjorn Bringert [mailto:bringert@google.com]
Sent: Monday, December 13, 2010 2:17 AM
To: Young, Milan
Cc: public-xg-htmlspeech@w3.org
Subject: Re: "Protocol" requirement - Re-recognition

While there are use cases for this, I don't think that they are important enough to warrant the increased complexity in managing storage, references, and garbage collection of previously recorded audio. Every feature that we ask browsers to implement makes it harder for them to support our API.

/Bjorn

On Fri, Dec 10, 2010 at 8:03 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> Summary - Web applications must be able to request recognition based 
> on previously sent audio.
>
>
>
> Description - It's not always clear which grammars to activate at the 
> start of a dialog.  If selection was incorrect, the user should not be 
> required to repeat in order to try a new grammar.  Due to latency and
> bandwidth considerations, the protocol must not require audio to be
> resent in order to accomplish this task.
>
>

--
Bjorn Bringert
Google UK Limited, Registered Office: Belgrave House, 76 Buckingham Palace Road, London, SW1W 9TQ Registered in England Number: 3977902

Received on Tuesday, 14 December 2010 01:34:31 UTC