Re: Revised SpeechRecognitionResult from Hans Wennborg on 2012-06-07 (public-speech-api@w3.org from June 2012)

From: Hans Wennborg <hwennborg@google.com>
Date: Thu, 7 Jun 2012 15:12:35 +0100
To: "Young, Milan" <Milan.Young@nuance.com>
Cc: Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <CAB8jPheBiPdhNbmAKp76Z1NgNQcMq_hyxLQDhfmBqTCbFv7KFg@mail.gmail.com>

Adding shortcuts, and thereby providing two ways to get to the same
data, counts as complicating the developer's mental model in my book.

I don't think this is a big deal. Could we leave this out of the spec
for now, and if it turns out to be a pain point we can add it in the
future?
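
To make the "two ways to get to the same data" concern concrete, here is
a minimal sketch. It assumes a hypothetical result shape: makeResult and
the utterance shortcut on the result are made up for illustration and are
not spec text.

```javascript
// Hypothetical result shape, for illustration only (not the spec'd interface).
function makeResult(alternatives) {
  var result = { item: alternatives };
  // Hypothetical shortcut: an alias for the top alternative's utterance.
  Object.defineProperty(result, 'utterance', {
    get: function () { return this.item[0].utterance; }
  });
  return result;
}

var result = makeResult([
  { utterance: 'hello world', confidence: 0.9 },
  { utterance: 'hollow world', confidence: 0.4 }
]);

var viaList = result.item[0].utterance; // explicit path through the n-best list
var viaShortcut = result.utterance;     // shortcut path to the same data
```

Both variables end up holding the same string, so a developer reading
code that uses the API has to learn that the two spellings are equivalent.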

Thanks,
Hans

On Wed, Jun 6, 2012 at 8:47 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> Are there any objections to my statement below?  If not, I’d like to push
> this into the spec.
>
> Thanks
>
> From: Young, Milan
> Sent: Wednesday, May 23, 2012 1:39 PM
> To: 'Satish S'; Hans Wennborg
> Cc: public-speech-api@w3.org
> Subject: RE: Revised SpeechRecognitionResult
>
> We can point to standards on both sides of the fence.  Perhaps it is a
> better use of time to consider our particular use case.
>
> I’d argue that 90% of developers will not even think about the second item
> on the nbest list.  So why complicate their mental model, let alone syntax,
> with SpeechRecognitionAlternatives?
>
> For the 10% that do understand an nbest list and its proper use, most will
> be familiar with VoiceXML which shares the same model.
>
> From: Satish S [mailto:satish@google.com]
> Sent: Wednesday, May 23, 2012 7:02 AM
> To: Hans Wennborg
> Cc: Young, Milan; public-speech-api@w3.org
> Subject: Re: Revised SpeechRecognitionResult
>
> I'd prefer not having such shortcuts in the API. As a parallel, see the W3C
> File API's FileList interface:
>
> http://www.w3.org/TR/FileAPI/#dfn-filelist
>
> To read the size of a file you'd have to do:
>     var size = document.forms['uploadData']['fileChooser'].files[0].size;
> but that hasn't resulted in a shorter version like
>
>     var size = document.forms['uploadData']['fileChooser'].size;
>
> If developers are accessing "item[0].utterance" more than once in their code
> they'd usually do
>   var item = event.result.item[0];
>   .. = item.utterance
>
> Cheers
> Satish
>
>
> On Wed, May 23, 2012 at 12:11 PM, Hans Wennborg <hwennborg@google.com>
> wrote:
>>
>> On Tue, May 22, 2012 at 7:22 PM, Young, Milan <Milan.Young@nuance.com>
>> wrote:
>> > Hello Hans,
>> >
>> > It's not uncommon for recognition engines to return a guess at what the
>> > user said/meant even for a nomatch result.  So we shouldn't rule this out in
>> > the API.
>>
>> Right. The spec currently says "nomatch event: [...] The result field
>> in the event may contain speech recognition results that are below the
>> confidence threshold or may be null."
>>
>> So that covers both cases.
>>
>> > As far as communicating this with a null vs event, I have a slight
>> > preference for an event.  Two reasons:
>>
>> I'm not sure what you mean by "communicating this with a null vs
>> event". I was talking about returning null or throwing an exception.
>> Is that what you mean?
>>
>> >  * Easier for implementers.  This is a true alias.
>>
>> I'm not sure what you mean by true alias.
>>
>> >  * We may want to allow empty interpretations or utterances, and thus a
>> > null would be ambiguous.
>>
>> Ah, yes. So throwing an exception seems like the better option.
>>
>> Thanks,
>> Hans
>>

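The null-versus-exception point at the end of the quoted thread can be
sketched as follows. This is only an illustration under stated
assumptions: the two helper functions, the interpretation field, and the
choice of RangeError are hypothetical and do not come from the spec.

```javascript
// Hypothetical helpers; neither is part of the spec.

// Option A: return null when there is no top alternative.
function topInterpretationOrNull(result) {
  return result.item.length > 0 ? result.item[0].interpretation : null;
}

// Option B: throw when there is no top alternative.
function topInterpretationOrThrow(result) {
  if (result.item.length === 0) {
    throw new RangeError('result has no alternatives');
  }
  return result.item[0].interpretation;
}

// A result with no alternatives at all.
var noAlternatives = { item: [] };
// A result whose top alternative legitimately has an empty (null) interpretation.
var emptyInterpretation = {
  item: [{ utterance: '', interpretation: null, confidence: 0.1 }]
};
```

With option A both inputs yield null, so the caller cannot tell "no
alternatives" from "an alternative with an empty interpretation". With
option B the first input throws and the second returns null, which keeps
the two cases distinct.
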
Received on Thursday, 7 June 2012 14:13:24 UTC