
Re: Revised SpeechRecognitionResult

From: Hans Wennborg <hwennborg@google.com>
Date: Thu, 7 Jun 2012 15:12:35 +0100
Message-ID: <CAB8jPheBiPdhNbmAKp76Z1NgNQcMq_hyxLQDhfmBqTCbFv7KFg@mail.gmail.com>
To: "Young, Milan" <Milan.Young@nuance.com>
Cc: Satish S <satish@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
Adding shortcuts, and thereby providing two ways to get to the same
data, counts as complicating the developer's mental model in my book.

I don't think this is a big deal. Could we leave this out of the spec
for now, and if it turns out to be a pain point we can add it in the
future?

Thanks,
Hans
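
For readers following the thread, the "two ways to get to the same data" concern can be sketched roughly as below. This is only an illustration: the mock event object, the sample strings and confidence values, and the withShortcut helper are assumptions made up for this sketch, and the shortcut property is hypothetical (it is the thing being argued against, not part of the spec). Only the event.result.item[0].utterance path comes from the snippet quoted further down.

```javascript
// Mock of a recognition event, shaped after the snippet quoted in this thread.
// The object layout and the sample values are assumptions for illustration.
const event = {
  result: {
    item: [
      { utterance: "call home", confidence: 0.92 },
      { utterance: "call Rome", confidence: 0.41 },
    ],
  },
};

// Without a shortcut there is exactly one path to the top alternative.
// Keeping it in a local variable avoids repeating the long expression.
const best = event.result.item[0];
const bestUtterance = best.utterance;
const bestConfidence = best.confidence;

// Hypothetical shortcut (not in the spec): copies the top utterance onto
// the result itself, which creates a second path to the same data.
function withShortcut(result) {
  return Object.assign({}, result, { utterance: result.item[0].utterance });
}

const shortcutResult = withShortcut(event.result);
const sameData =
  shortcutResult.utterance === shortcutResult.item[0].utterance;
```

Both paths yield the same string, so the shortcut saves a few characters at the cost of a second name for one value.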

On Wed, Jun 6, 2012 at 8:47 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> Are there any objections to my statement below? If not, I'd like to push
> this into the spec.
>
> Thanks
>
> From: Young, Milan
> Sent: Wednesday, May 23, 2012 1:39 PM
> To: 'Satish S'; Hans Wennborg
> Cc: public-speech-api@w3.org
> Subject: RE: Revised SpeechRecognitionResult
>
>
>
> We can point to standards on both sides of the fence. Perhaps it is a
> better use of time to consider our particular use case.
>
>
>
> I'd argue that 90% of developers will not even think about the second item
> on the nbest list. So why complicate their mental model, let alone the
> syntax, with SpeechRecognitionAlternatives?
>
>
>
> For the 10% that do understand an nbest list and its proper use, most will
> be familiar with VoiceXML which shares the same model.
>
>
>
>
>
> From: Satish S [mailto:satish@google.com]
> Sent: Wednesday, May 23, 2012 7:02 AM
> To: Hans Wennborg
> Cc: Young, Milan; public-speech-api@w3.org
> Subject: Re: Revised SpeechRecognitionResult
>
>
>
> I'd prefer not having such shortcuts in the API. As a parallel, see the W3C
> File API's FileList interface:
>
>
> http://www.w3.org/TR/FileAPI/#dfn-filelist
>
> To read the size of a file you'd have to do:
>   var size = document.forms['uploadData']['fileChooser'].files[0].size;
> but that hasn't resulted in a shorter version like
>
>   var size = document.forms['uploadData']['fileChooser'].size;
>
> If developers are accessing "item[0].utterance" more than once in their code
> they'd usually do:
>   var item = event.result.item[0];
>   var utterance = item.utterance;
>
> Cheers
> Satish
>
>
> On Wed, May 23, 2012 at 12:11 PM, Hans Wennborg <hwennborg@google.com>
> wrote:
>>
>> On Tue, May 22, 2012 at 7:22 PM, Young, Milan <Milan.Young@nuance.com>
>> wrote:
>> > Hello Hans,
>> >
>> > It's not uncommon for recognition engines to return a guess at what the
>> > user said/meant even for a nomatch result. So we shouldn't rule this out in
>> > the API.
>>
>> Right. The spec currently says "nomatch event: [...] The result field
>> in the event may contain speech recognition results that are below the
>> confidence threshold or may be null."
>>
>> So that covers both cases.
>>
>> > As far as communicating this with a null vs event, I have a slight
>> > preference for an event. Two reasons:
>>
>> I'm not sure what you mean by "communicating this with a null vs
>> event". I was talking about returning null or throwing an exception.
>> Is that what you mean?
>>
>> > * Easier for implementers. This is a true alias.
>>
>> I'm not sure what you mean by true alias.
>>
>> > * We may want to allow empty interpretations or utterances, and thus a
>> > null would be ambiguous.
>>
>> Ah, yes. So throwing an exception seems like the better option.
>>
>> Thanks,
>> Hans
>>
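
The null-versus-exception point at the end of the quoted exchange can be illustrated with a minimal sketch. It assumes a shortcut accessor over a result whose alternatives list may be empty, and it assumes that an "empty interpretation" is represented as null. The function names, the object layout, and that representation are all hypothetical and not taken from the spec.

```javascript
// Two mock results (layout assumed for illustration):
// one with no alternatives at all, one whose top alternative
// carries an empty interpretation, modelled here as null.
const noAlternatives = { item: [] };
const emptyInterpretation = {
  item: [{ utterance: "", interpretation: null, confidence: 0.2 }],
};

// Hypothetical shortcut that returns null when nothing is available.
function interpretationOrNull(result) {
  return result.item.length > 0 ? result.item[0].interpretation : null;
}

// Hypothetical shortcut that throws when nothing is available.
function interpretationOrThrow(result) {
  if (result.item.length === 0) {
    throw new RangeError("result has no alternatives");
  }
  return result.item[0].interpretation;
}

// With the null-returning variant the two cases look identical.
const ambiguous =
  interpretationOrNull(noAlternatives) ===
  interpretationOrNull(emptyInterpretation);

// With the throwing variant they can be told apart.
let threw = false;
try {
  interpretationOrThrow(noAlternatives);
} catch (e) {
  threw = e instanceof RangeError;
}
const distinguishable =
  threw && interpretationOrThrow(emptyInterpretation) === null;
```

Under these assumptions a caller of the null-returning variant cannot tell "no result" from "a result with an empty interpretation", which is the ambiguity raised in the thread.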
Received on Thursday, 7 June 2012 14:13:24 GMT
