RE: Revised SpeechRecognitionResult

If I am the only person that finds this feature useful, I will agree to drop the matter.  Let's give it a few days to see how it sorts out.


From: Glen Shires [mailto:gshires@google.com]
Sent: Thursday, June 07, 2012 8:20 AM
To: Hans Wennborg
Cc: Young, Milan; Satish S; public-speech-api@w3.org
Subject: Re: Revised SpeechRecognitionResult

Yes, reducing the complexity of the spec and mental model makes it easier for developers to learn.
Also, if developers are accessing it more than once in their code, they can simply do [1]

[1] http://lists.w3.org/Archives/Public/public-speech-api/2012May/0045.html
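
For readers who do not want to follow the link, the pattern referred to in [1] amounts to caching the first alternative in a local variable. The following is only a minimal sketch: the `event` object is a hypothetical stand-in whose shape mirrors the "event.result.item[0].utterance" path used in this thread, and the `confidence` field and sample strings are assumptions made for illustration, not text from the spec.

```javascript
// Hypothetical stand-in for a recognition event. The shape mirrors the
// "event.result.item[0].utterance" access path discussed in this thread;
// the confidence field and the sample values are illustrative assumptions.
var event = {
  result: {
    item: [
      { utterance: "hello world", confidence: 0.9 },
      { utterance: "hollow world", confidence: 0.4 }
    ]
  }
};

// Cache the top alternative once instead of repeating the full path.
var top = event.result.item[0];
var utterance = top.utterance;
var confidence = top.confidence;
```

With the local variable in place, repeated reads cost one short name each, which is the argument for not adding a second access path to the API.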

Glen Shires
On Thu, Jun 7, 2012 at 7:12 AM, Hans Wennborg <hwennborg@google.com> wrote:
Adding shortcuts, and thereby providing two ways to get to the same
data, counts as complicating the developer's mental model in my book.

I don't think this is a big deal. Could we leave this out of the spec
for now, and if it turns out to be a pain point we can add it in the
future?

Thanks,
Hans

On Wed, Jun 6, 2012 at 8:47 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> Are there any objections to my statement below?  If not, I'd like to push
> this into the spec.
>
> Thanks
>
>
> From: Young, Milan
> Sent: Wednesday, May 23, 2012 1:39 PM
> To: 'Satish S'; Hans Wennborg
> Cc: public-speech-api@w3.org
> Subject: RE: Revised SpeechRecognitionResult
>
>
>
> We can point to standards on both sides of the fence.  Perhaps it is a
> better use of time to consider our particular use case.
>
>
>
> I'd argue that 90% of developers will not even think about the second item
> on the nbest list.  So why complicate their mental model, let alone syntax,
> with SpeechRecognitionAlternatives?
>
>
>
> For the 10% that do understand an nbest list and its proper use, most will
> be familiar with VoiceXML which shares the same model.
>
>
>
>
>
> From: Satish S [mailto:satish@google.com]
> Sent: Wednesday, May 23, 2012 7:02 AM
> To: Hans Wennborg
> Cc: Young, Milan; public-speech-api@w3.org
> Subject: Re: Revised SpeechRecognitionResult
>
>
>
> I'd prefer not having such shortcuts in the API. As a parallel, see the W3C
> File API's FileList interface
>
>
> http://www.w3.org/TR/FileAPI/#dfn-filelist
>
> To read the size of a file you'd have to do:
>     var size = document.forms['uploadData']['fileChooser'].files[0].size;
> but that hasn't resulted in a shorter version like
>
>     var size = document.forms['uploadData']['fileChooser'].size;
>
> If developers are accessing "item[0].utterance" more than once in their code
> they'd usually do
>   var item = event.result.item[0];
>   var utterance = item.utterance;
>
> Cheers
> Satish
>
>
> On Wed, May 23, 2012 at 12:11 PM, Hans Wennborg <hwennborg@google.com> wrote:
>>
>> On Tue, May 22, 2012 at 7:22 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>> > Hello Hans,
>> >
>> > It's not uncommon for recognition engines to return a guess at what the
>> > user said/meant even for a nomatch result.  So we shouldn't rule this out in
>> > the API.
>>
>> Right. The spec currently says "nomatch event: [...] The result field
>> in the event may contain speech recognition results that are below the
>> confidence threshold or may be null."
>>
>> So that covers both cases.
>>
>> > As far as communicating this with a null vs event, I have a slight
>> > preference for an event.  Two reasons:
>>
>> I'm not sure what you mean by "communicating this with a null vs
>> event". I was talking about returning null or throwing an exception.
>> Is that what you mean?
>>
>> >  * Easier for implementers.  This is a true alias.
>>
>> I'm not sure what you mean by true alias.
>>
>> >  * We may want to allow empty interpretations or utterances, and thus a
>> > null would be ambiguous.
>>
>> Ah, yes. So throwing an exception seems like the better option.
>>
>> Thanks,
>> Hans
>>
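
The null-versus-exception point in the oldest message above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `firstUtterance` is a hypothetical helper rather than anything proposed for the spec, the result shape follows the "item[0].utterance" path used in this thread, and `RangeError` is just one possible exception type.

```javascript
// Hypothetical shortcut helper, not part of any spec draft. It throws when
// there is no alternative to return, so that an empty utterance ("") stays
// distinguishable from "nothing was recognized".
function firstUtterance(result) {
  if (result === null || result.item.length === 0) {
    throw new RangeError("no recognition alternatives available");
  }
  return result.item[0].utterance;
}

// An empty utterance is returned as an ordinary value.
var emptyUtterance = firstUtterance({ item: [{ utterance: "" }] });

// A missing result is reported through the exception instead of null.
var threwOnNull;
try {
  firstUtterance(null);
  threwOnNull = false;
} catch (e) {
  threwOnNull = e instanceof RangeError;
}
```

If the helper returned null instead, a caller testing the return value for falsiness could not separate a legitimately empty utterance from a missing result, which appears to be the ambiguity raised in the quoted exchange.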

Received on Thursday, 7 June 2012 16:15:57 UTC