Re: SpeechRecognitionAlternative.interpretation when interpretation can't be provided from Satish S on 2012-08-17 (public-speech-api@w3.org from August 2012)

From: Satish S <satish@google.com>
Date: Fri, 17 Aug 2012 15:41:58 +0100
To: Deborah Dahl <dahl@conversational-technologies.com>
Cc: Bjorn Bringert <bringert@google.com>, Hans Wennborg <hwennborg@google.com>, public-speech-api@w3.org
Message-ID: <CAHZf7RkQ6gGvjdiEyV8MQJ9Gccfc6-7J8HDCinuUSaWO0Xk==A@mail.gmail.com>
> I may have missed something, but I don’t see in the spec where it says
> that “interpretation” is optional.

Developers specify the interpretation value with SISR, and if they don't
specify one there is no 'default' interpretation available. In that sense it
is optional, because grammars don't mandate it. So I think this API shouldn't
mandate a default value when the engine did not provide one, and should
return null in such cases.
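
As a rough sketch of what this convention would mean for page script, assuming
'interpretation' is null whenever the engine supplied nothing: the
`Alternative` interface and the `meaningOf` helper below are hypothetical
names used only for illustration; only the 'transcript' and 'interpretation'
attribute names come from this thread.

```typescript
// Hypothetical shape mirroring the two attributes discussed in this thread.
interface Alternative {
  transcript: string;
  // Assumption: null when neither a grammar (e.g. via SISR) nor the engine
  // produced an interpretation.
  interpretation: string | number | object | null;
}

// Prefer the interpretation when one exists; otherwise the page, not the UA,
// decides to fall back to the raw transcript.
function meaningOf(alt: Alternative): string | number | object {
  return alt.interpretation !== null ? alt.interpretation : alt.transcript;
}

// Illustrative values only, not output from a real recognizer.
const withInterpretation: Alternative = { transcript: "three", interpretation: 3 };
const withoutInterpretation: Alternative = { transcript: "three", interpretation: null };
```

With null as the "no value" signal, a page that wants the VoiceXML-style
fallback can still implement it in one line, while a page that needs to know
whether a real interpretation exists can test for null directly.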

Cheers
Satish


On Fri, Aug 17, 2012 at 1:57 PM, Deborah Dahl <
dahl@conversational-technologies.com> wrote:

> I may have missed something, but I don’t see in the spec where it says
> that “interpretation” is optional.
>
> *From:* Satish S [mailto:satish@google.com]
> *Sent:* Thursday, August 16, 2012 7:38 PM
> *To:* Deborah Dahl
> *Cc:* Bjorn Bringert; Hans Wennborg; public-speech-api@w3.org
>
> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
> interpretation can't be provided
>
> 'interpretation' is an optional attribute because engines are not required
> to provide an interpretation on their own (unlike 'transcript'). As such I
> think it should return null when there isn't a value to be returned as that
> is the convention for optional attributes, not 'undefined' or a copy of
> some other attribute.
>
> If an engine chooses to return the same value for 'transcript' and
> 'interpretation' or do textnorm of the value and return in 'interpretation'
> that will be an implementation detail of the engine. But in the absence of
> any such value for 'interpretation' from the engine I think the UA should
> return null.
>
>
> Cheers
> Satish
>
> On Thu, Aug 16, 2012 at 2:52 PM, Deborah Dahl <
> dahl@conversational-technologies.com> wrote:
>
> That's a good point. There are lots of use cases where some simple
> normalization is extremely useful, as in your example, or collapsing all
> the ways that the user might say "yes" or "no". However, you could say that
> once the implementation has modified or normalized the transcript that
> means it has some kind of interpretation, so putting a normalized value in
> the interpretation slot should be fine. Nothing says that the
> "interpretation" has to be a particularly fine-grained interpretation, or
> one with a lot of structure.
>
>
>
> > -----Original Message-----
> > From: Bjorn Bringert [mailto:bringert@google.com]
> > Sent: Thursday, August 16, 2012 9:09 AM
> > To: Hans Wennborg
> > Cc: Conversational; public-speech-api@w3.org
> > Subject: Re: SpeechRecognitionAlternative.interpretation when
> > interpretation can't be provided
> >
> > I'm not sure that it has to be that strict in requiring that the value
> > is the same as the "transcript" attribute. For example, an engine
> > might return the words recognized in "transcript" and apply some extra
> > textnorm to the text that it returns in "interpretation", e.g.
> > converting digit words to digits ("three" -> "3"). Not sure if that's
> > useful though.
> >
> > On Thu, Aug 16, 2012 at 1:58 PM, Hans Wennborg
> > <hwennborg@google.com> wrote:
> > > Yes, the raw text is in the 'transcript' attribute.
> > >
> > > The description of 'interpretation' is currently: "The interpretation
> > > represents the semantic meaning from what the user said. This might be
> > > determined, for instance, through the SISR specification of semantics
> > > in a grammar."
> > >
> > > I propose that we change it to "The interpretation represents the
> > > semantic meaning from what the user said. This might be determined,
> > > for instance, through the SISR specification of semantics in a
> > > grammar. If no semantic meaning can be determined, the attribute must
> > > be a string with the same value as the 'transcript' attribute."
> > >
> > > Does that sound good to everyone? If there are no objections, I'll
> > > make the change to the draft next week.
> > >
> > > Thanks,
> > > Hans
> > >
> > > On Wed, Aug 15, 2012 at 5:29 PM, Conversational
> > > <dahl@conversational-technologies.com> wrote:
> > >> I can't check the spec right now, but I assume there's already an
> > >> attribute that currently is defined to contain the raw text. So I think
> > >> we could say that if there's no interpretation the value of the
> > >> interpretation attribute would be the same as the value of the "raw
> > >> string" attribute.
> > >>
> > >> Sent from my iPhone
> > >>
> > >> On Aug 15, 2012, at 9:57 AM, Hans Wennborg <hwennborg@google.com>
> > >> wrote:
> > >>
> > >>> OK, that would work I suppose.
> > >>>
> > >>> What would the spec text look like? Something like "[...] If no
> > >>> semantic meaning can be determined, the attribute will be a string
> > >>> representing the raw words that the user spoke."?
> > >>>
> > >>> On Wed, Aug 15, 2012 at 2:24 PM, Bjorn Bringert
> > >>> <bringert@google.com> wrote:
> > >>>> Yeah, that would be my preference too.
> > >>>>
> > >>>> On Wed, Aug 15, 2012 at 2:19 PM, Conversational
> > >>>> <dahl@conversational-technologies.com> wrote:
> > >>>>> If there isn't an interpretation I think it would make the most
> > >>>>> sense for the attribute to contain the literal string result. I
> > >>>>> believe this is what happens in VoiceXML.
> > >>>>>
> > >>>>>> My question is: for implementations that cannot provide an
> > >>>>>> interpretation, what should the attribute's value be? null?
> > >>>>>> undefined?
> >
> >
> >
> > --
> > Bjorn Bringert
> > Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
> > Palace Road, London, SW1W 9TQ
> > Registered in England Number: 3977902
>
Received on Friday, 17 August 2012 14:42:27 UTC