Re: SpeechRecognitionAlternative.interpretation when interpretation can't be provided

From: Glen Shires <gshires@google.com>
Date: Tue, 11 Sep 2012 17:32:59 -0700
Message-ID: <CAEE5bciDr4bmmVruPi0q+oiXHiqTr_HZ2ybGAKoCfGcSaJeUpw@mail.gmail.com>
To: Deborah Dahl <dahl@conversational-technologies.com>, Jim Barnett <Jim.Barnett@genesyslab.com>, Hans Wennborg <hwennborg@google.com>, Satish S <satish@google.com>, Bjorn Bringert <bringert@google.com>, public-speech-api@w3.org
The current definition of interpretation in the spec is:

    "The interpretation represents the semantic meaning from what the user
said. This might be determined, for instance, through the SISR
specification of semantics in a grammar."

I propose adding an additional sentence at the end.

    "If no interpretation is available, this attribute MUST return null."

My reasoning (based on this lengthy thread):

   - If an SISR (or similar) interpretation is available, the UA must
   return it.
   - If an alternative string interpretation is available, such as
   a normalization, the UA may return it.
   - If there's no more information available than in the transcript, then
   "null" provides a very simple way for the author to check for this
   condition. The author avoids a clumsy conditional (typeof(interpretation)
   != "string") and can easily distinguish the case where the
   interpretation is a normalization string from the case where it would
   otherwise have just copied the transcript verbatim.
   - "null" is more commonly used than "undefined" in these circumstances.
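
To illustrate the third point, here is a minimal sketch of the check an
author might write. It is not spec text: the helper name chooseDisplayText
is hypothetical, and it assumes a dictation-style page where any non-null
interpretation is a normalization string.

```javascript
// Hypothetical helper (not part of the Speech API draft).
// Assumption: for this page, a non-null interpretation is always a
// normalization string, so no typeof check is needed.
function chooseDisplayText(transcript, interpretation) {
  if (interpretation === null) {
    // Nothing beyond the transcript is available: show the raw words.
    return { text: transcript, normalized: false };
  }
  // The UA supplied something beyond the transcript: prefer it.
  return { text: interpretation, normalized: true };
}
```

Here the "normalized" flag captures exactly the distinction described
above: it is true only when the UA returned something other than null.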

If there's no disagreement, I will add this sentence to the spec on
Thursday.
/Glen Shires


On Tue, Sep 4, 2012 at 11:04 AM, Glen Shires <gshires@google.com> wrote:

> I've updated the spec with this change (moved interpretation and emma
> attributes to SpeechRecognitionEvent):
> https://dvcs.w3.org/hg/speech-api/rev/48a58e558fcc
>
> As always, the current draft spec is at:
> http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
>
> /Glen Shires
>
> On Thu, Aug 30, 2012 at 10:07 AM, Deborah Dahl <
> dahl@conversational-technologies.com> wrote:
>
>> Thanks for the clarification, that makes sense. When each new version of
>> the emma document arrives in a SpeechRecognitionEvent, the author can just
>> repopulate all the earlier form fields, as well as the newest one, with
>> the data from the most recent emma version.
>>
>> *From:* Glen Shires [mailto:gshires@google.com]
>> *Sent:* Thursday, August 30, 2012 12:45 PM
>>
>> *To:* Deborah Dahl
>> *Cc:* Jim Barnett; Hans Wennborg; Satish S; Bjorn Bringert;
>> public-speech-api@w3.org
>> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
>> interpretation can't be provided
>>
>> Debbie,
>>
>> In my proposal, the single emma document is updated with each
>> new SpeechRecognitionEvent. Therefore, in continuous = true mode, the emma
>> document is populated in "real time" as the user speaks each field, without
>> waiting for the user to finish speaking. A JavaScript author could use this
>> to populate a form in "real time".
>>
>> Also, I now realize that SpeechRecognitionEvent.transcript is not
>> useful in continuous = false mode because only one final result is
>> returned, and thus SpeechRecognitionEvent.results[0].transcript always
>> contains the same string (no concatenation needed). I also don't see it as
>> very useful in continuous = true mode because if an author is using this
>> mode, it's presumably because he wants to show continuous final results
>> (and perhaps interim as well). Since the author is already writing code to
>> concatenate results to display them in "real time", there's little or no
>> savings with this new attribute. So I now retract that portion of my
>> proposal.
>>
>> So to clarify, here are my proposed changes to the spec. If there's no
>> disagreement by the end of the week I'll add them to the spec...
>>
>> Delete SpeechRecognitionAlternative.interpretation
>>
>> Delete SpeechRecognitionResult.emma
>>
>> Add interpretation and emma attributes to SpeechRecognitionEvent.
>> Specifically:
>>
>>     interface SpeechRecognitionEvent : Event {
>>         readonly attribute short resultIndex;
>>         readonly attribute SpeechRecognitionResultList results;
>>         readonly attribute any interpretation;
>>         readonly attribute Document emma;
>>     };
>>
>> I do not propose to change the definitions of interpretation and emma at
>> this time (because there is ongoing discussion), but rather to simply move
>> their current definitions to the new heading: "5.1.8 Speech Recognition
>> Event".
>>
>> /Glen Shires
>>
>> On Thu, Aug 30, 2012 at 8:36 AM, Deborah Dahl <
>> dahl@conversational-technologies.com> wrote:
>>
>> Hi Glen,
>>
>> I agree that a single cumulative emma document is preferable to multiple
>> emma documents in general, although I think that there might be use cases
>> where it would be convenient to have both. For example, you want to
>> populate a form in real time as the user speaks each field, without waiting
>> for the user to finish speaking. After the result is final the application
>> could send the cumulative result to the server, but seeing the interim
>> results would be helpful feedback to the user.
>>
>> Debbie
>>
>> *From:* Glen Shires [mailto:gshires@google.com]
>> *Sent:* Wednesday, August 29, 2012 2:57 PM
>> *To:* Deborah Dahl
>> *Cc:* Jim Barnett; Hans Wennborg; Satish S; Bjorn Bringert;
>> public-speech-api@w3.org
>> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
>> interpretation can't be provided
>>
>> I believe the same is true for emma: a single, cumulative emma document
>> is preferable to multiple emma documents.
>>
>> I propose the following changes to the spec:
>>
>> Delete SpeechRecognitionAlternative.interpretation
>>
>> Delete SpeechRecognitionResult.emma
>>
>> Add interpretation and emma attributes to SpeechRecognitionEvent.
>> Specifically:
>>
>>     interface SpeechRecognitionEvent : Event {
>>         readonly attribute short resultIndex;
>>         readonly attribute SpeechRecognitionResultList results;
>>         readonly attribute DOMString transcript;
>>         readonly attribute any interpretation;
>>         readonly attribute Document emma;
>>     };
>>
>> I do not propose to change the definitions of interpretation and emma at
>> this time (because there is ongoing discussion), but rather to simply move
>> their current definitions to the new heading: "5.1.8 Speech Recognition
>> Event".
>>
>> I also propose adding a transcript attribute to SpeechRecognitionEvent (but
>> also retaining SpeechRecognitionAlternative.transcript). This provides a
>> simple option for JavaScript authors to get at the full, cumulative
>> transcript. I propose the definition under "5.1.8 Speech Recognition
>> Event" be:
>>
>> transcript
>>
>> The transcript string represents the raw words that the user spoke. This
>> is a concatenation of the first (highest confidence) alternative of all
>> final SpeechRecognitionAlternative.transcript strings.
>>
>> /Glen Shires
>>
>> On Wed, Aug 29, 2012 at 10:30 AM, Deborah Dahl <
>> dahl@conversational-technologies.com> wrote:
>>
>> I agree with having a single interpretation that represents the
>> cumulative interpretation of the utterance so far.
>>
>> I think an example of what Jim is talking about, when the interpretation
>> wouldn’t be final even if the transcript is, might be the utterance “from
>> Chicago … Midway”. Maybe the grammar has a default of “Chicago O’Hare”, and
>> returns “from: ORD”, because most people don’t bother to say “O’Hare”, but
>> then it hears “Midway” and changes the interpretation to “from: MDW”.
>> However, “from Chicago” is still the transcript.
>>
>> Also the problem that Glen points out is bad enough with two slots, but
>> it gets even worse as the number of slots gets bigger. For example, you
>> might have a pizza-ordering utterance with five or six ingredients (“I want
>> a large pizza with mushrooms…pepperoni…onions…olives…anchovies”). It would
>> be very cumbersome to have to go back through all the results to fill in
>> the slots separately.
>>
>> *From:* Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
>> *Sent:* Wednesday, August 29, 2012 12:37 PM
>> *To:* Glen Shires; Deborah Dahl
>> *Cc:* Hans Wennborg; Satish S; Bjorn Bringert; public-speech-api@w3.org
>> *Subject:* RE: SpeechRecognitionAlternative.interpretation when
>> interpretation can't be provided
>>
>> I agree with the idea of having a single interpretation. There is no
>> guarantee that the different parts of the string have independent
>> interpretations. For example, even if the transcription “from New York” is
>> final, its interpretation may not be, since it may depend on the
>> remaining parts of the utterance (that depends on how complicated the
>> grammar is, of course).
>>
>> - Jim
>>
>> *From:* Glen Shires [mailto:gshires@google.com]
>> *Sent:* Wednesday, August 29, 2012 11:44 AM
>> *To:* Deborah Dahl
>> *Cc:* Hans Wennborg; Satish S; Bjorn Bringert; public-speech-api@w3.org
>> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
>> interpretation can't be provided
>>
>> How should interpretation work with continuous speech?
>>
>> Specifically, as each portion becomes final (each SpeechRecognitionResult
>> with final=true), the corresponding alternative(s) for transcription and
>> interpretation become final.
>>
>> It's easy for the JavaScript author to handle the consecutive list of
>> transcription strings - simply concatenate them.
>>
>> However, if the interpretation returns a semantic structure (such as the
>> depart/arrive example), it's unclear to me how they should be returned.
>> For example, if the first final result was "from New York" and the second
>> "to San Francisco", then:
>>
>> After the first final result, the list is:
>>
>> event.results[0].item(0).transcript = "from New York"
>> event.results[0].item(0).interpretation = {
>>   depart: "New York",
>>   arrive: null
>> };
>>
>> After the second final result, the list is:
>>
>> event.results[0].item(0).transcript = "from New York"
>> event.results[0].item(0).interpretation = {
>>   depart: "New York",
>>   arrive: null
>> };
>>
>> event.results[1].item(0).transcript = "to San Francisco"
>> event.results[1].item(0).interpretation = {
>>   depart: null,
>>   arrive: "San Francisco"
>> };
>>
>> If so, this makes using the interpretation structure very messy for the
>> author because he needs to loop through all the results to find each
>> interpretation slot that he needs.
>>
>> I suggest that we instead consider changing the spec to provide a single
>> interpretation that always represents the most current interpretation.
>>
>> After the first final result, the list is:
>>
>> event.results[0].item(0).transcript = "from New York"
>> event.interpretation = {
>>   depart: "New York",
>>   arrive: null
>> };
>>
>> After the second final result, the list is:
>>
>> event.results[0].item(0).transcript = "from New York"
>> event.results[1].item(0).transcript = "to San Francisco"
>> event.interpretation = {
>>   depart: "New York",
>>   arrive: "San Francisco"
>> };
>>
>> This not only makes it simple for the author to process the
>> interpretation, it also solves the problem that the interpretation may not
>> be available at the same point in time that the transcription becomes
>> final. If alternative interpretations are important, then it's easy to add
>> them to the interpretation structure that is returned, and this format is
>> far easier for the author to process than multiple
>> SpeechRecognitionAlternative.interpretation values. For example:
>>
>> event.interpretation = {
>>   depart: ["New York", "Newark"],
>>   arrive: ["San Francisco", "San Bernardino"]
>> };
>>
>> /Glen Shires
>>
>> On Wed, Aug 29, 2012 at 7:07 AM, Deborah Dahl <
>> dahl@conversational-technologies.com> wrote:
>>
>> I don’t think there’s a big difference in complexity in this use case,
>> but here’s another one that I think might be more common.
>>
>> Suppose the application is something like search or composing email, and
>> the transcript alone would serve the application's purposes. However, some
>> implementations might also provide useful normalizations like converting
>> text numbers to digits or capitalization that would make the dictated text
>> look more like written language, and this normalization fills the
>> "interpretation slot". If the developer can count on the "interpretation"
>> slot being filled by the transcript if there's nothing better, then the
>> developer only has to ask for the interpretation.
>>
>> e.g.
>>
>> document.write(interpretation)
>>
>> vs.
>>
>> if (interpretation)
>>     document.write(interpretation)
>> else
>>     document.write(transcript)
>>
>> I think the first is simpler. The developer doesn’t have to worry about
>> type checking because in this application the “interpretation” will always
>> be a string.
>>
>> *From:* Glen Shires [mailto:gshires@google.com]
>> *Sent:* Tuesday, August 28, 2012 10:44 PM
>> *To:* Deborah Dahl
>> *Cc:* Hans Wennborg; Satish S; Bjorn Bringert; public-speech-api@w3.org
>> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
>> interpretation can't be provided
>>
>> Debbie,
>>
>> Looking at this from the viewpoint of what is easier for the JavaScript
>> author, I believe:
>>
>> SpeechRecognitionAlternative.transcript must return a string (even if an
>> empty string). Thus, an author wishing to use the transcript doesn't need
>> to perform any type checking.
>>
>> SpeechRecognitionAlternative.interpretation must be null if no
>> interpretation is provided. This simplifies the required conditional by
>> eliminating type checking. For example:
>>
>> transcript = "from New York to San Francisco";
>>
>> interpretation = {
>>   depart: "New York",
>>   arrive: "San Francisco"
>> };
>>
>> if (interpretation)  // works whether interpretation is an object or null
>>   document.write("Depart " + interpretation.depart + " and arrive in " +
>>                  interpretation.arrive);
>> else
>>   document.write(transcript);
>>
>> Whereas, if the interpretation contains the transcript string when no
>> interpretation is present, the condition would have to be:
>>
>> if (typeof(interpretation) != "string")
>>
>> Which is more complex, and more prone to errors (e.g. if "string" is
>> misspelled).
>>
>> /Glen Shires
>>
>> On Thu, Aug 23, 2012 at 6:37 AM, Deborah Dahl <
>> dahl@conversational-technologies.com> wrote:****
>>
>> Hi Glen,
>>
>> In the case of an SLM, if there’s a classification, I think the
>> classification would be the interpretation. If the SLM is just used to
>> improve dictation results, without classification, then the interpretation
>> would be whatever we say it is – either the transcript, null, or undefined.
>>
>> My point about stating that the “transcript” attribute is required or
>> optional wasn’t whether or not there was a use case where it would be
>> desirable not to return a transcript. My point was that the spec needs to
>> be explicit about the optional/required status of every feature. It’s
>> fine to postpone that decision if there’s any controversy, but if we all
>> agree we might as well add it to the spec.
>>
>> I can’t think of any cases where it would be bad to return a transcript,
>> although I can think of use cases where the developer wouldn’t choose to do
>> anything with the transcript (like multi-slot form filling – all the end
>> user really needs to see is the correctly filled slots).
>>
>> Debbie
>>
>> *From:* Glen Shires [mailto:gshires@google.com]
>> *Sent:* Thursday, August 23, 2012 3:48 AM
>> *To:* Deborah Dahl
>> *Cc:* Hans Wennborg; Satish S; Bjorn Bringert; public-speech-api@w3.org
>> *Subject:* Re: SpeechRecognitionAlternative.interpretation when
>> interpretation can't be provided
>>
>> Debbie,
>>
>> I agree with the need to support SLMs. This implies that, in some cases,
>> the author may not specify semantic information, and thus there would not
>> be an interpretation.
>>
>> Under what circumstances (except error conditions) do you envision that a
>> transcript would not be returned?
>>
>> /Glen Shires
>>
>> On Wed, Aug 22, 2012 at 6:08 AM, Deborah Dahl <
>> dahl@conversational-technologies.com> wrote:
>>
>> Actually, Satish's comment made me think that we probably have a few other
>> things to agree on before we decide what the default value of
>> "interpretation" should be, because we haven't settled on a lot of issues
>> about what is required and what is optional.
>> Satish's argument is only relevant if we require SRGS/SISR for grammars
>> and semantic interpretation, but we actually don't require either of those
>> right now, so it doesn't matter what they do as far as the current spec
>> goes. (Although it's worth noting that SRGS doesn't require anything to be
>> returned at all, even the transcript:
>> http://www.w3.org/TR/speech-grammar/#S1.10).
>> So I think we first need to decide and explicitly state in the spec:
>>
>> 1. what we want to say about grammar formats (which are allowed/required,
>> or is the grammar format open). It probably needs to be somewhat open
>> because of SLMs.
>> 2. what we want to say about semantic tag formats (are proprietary formats
>> allowed, is SISR required or is the semantic tag format just whatever the
>> grammar format uses)
>> 3. is "transcript" required?
>> 4. is "interpretation" required?
>>
>> Debbie
>>
>>
>> > -----Original Message-----
>> > From: Hans Wennborg [mailto:hwennborg@google.com]
>> > Sent: Tuesday, August 21, 2012 12:50 PM
>> > To: Glen Shires
>> > Cc: Satish S; Deborah Dahl; Bjorn Bringert; public-speech-api@w3.org
>> > Subject: Re: SpeechRecognitionAlternative.interpretation when
>> > interpretation can't be provided
>> >
>> > Björn, Deborah, are you ok with this as well? I.e. that the spec
>> > shouldn't mandate a "default" value for the interpretation attribute,
>> > but rather return null when there is no interpretation?
>> >
>> > On Fri, Aug 17, 2012 at 6:32 PM, Glen Shires <gshires@google.com> wrote:
>> > > I agree, return "null" (not "undefined") in such cases.
>> > >
>> > >
>> > > On Fri, Aug 17, 2012 at 7:41 AM, Satish S <satish@google.com> wrote:
>> > >>
>> > >> > I may have missed something, but I don’t see in the spec where it
>> > >> > says that “interpretation” is optional.
>> > >>
>> > >> Developers specify the interpretation value with SISR, and if they
>> > >> don't specify one there is no 'default' interpretation available. In
>> > >> that sense it is optional because grammars don't mandate it. So I
>> > >> think this API shouldn't mandate providing a default value if the
>> > >> engine did not provide one, and should return null in such cases.
>> > >>
>> > >> Cheers
>> > >> Satish
>> > >>
>> > >>
>> > >>
>> > >> On Fri, Aug 17, 2012 at 1:57 PM, Deborah Dahl
>> > >> <dahl@conversational-technologies.com> wrote:
>> > >>>
>> > >>> I may have missed something, but I don’t see in the spec where it
>> > >>> says that “interpretation” is optional.
>> > >>>
>> > >>> From: Satish S [mailto:satish@google.com]
>> > >>> Sent: Thursday, August 16, 2012 7:38 PM
>> > >>> To: Deborah Dahl
>> > >>> Cc: Bjorn Bringert; Hans Wennborg; public-speech-api@w3.org
>> > >>>
>> > >>>
>> > >>> Subject: Re: SpeechRecognitionAlternative.interpretation when
>> > >>> interpretation can't be provided
>> > >>>
>> > >>>
>> > >>>
>> > >>> 'interpretation' is an optional attribute because engines are not
>> > >>> required to provide an interpretation on their own (unlike
>> > >>> 'transcript'). As such I think it should return null when there isn't
>> > >>> a value to be returned, as that is the convention for optional
>> > >>> attributes, not 'undefined' or a copy of some other attribute.
>> > >>>
>> > >>> If an engine chooses to return the same value for 'transcript' and
>> > >>> 'interpretation', or to do textnorm of the value and return it in
>> > >>> 'interpretation', that will be an implementation detail of the engine.
>> > >>> But in the absence of any such value for 'interpretation' from the
>> > >>> engine I think the UA should return null.
>> > >>>
>> > >>>
>> > >>> Cheers
>> > >>> Satish
>> > >>>
>> > >>> On Thu, Aug 16, 2012 at 2:52 PM, Deborah Dahl
>> > >>> <dahl@conversational-technologies.com> wrote:
>> > >>>
>> > >>> That's a good point. There are lots of use cases where some simple
>> > >>> normalization is extremely useful, as in your example, or collapsing
>> > >>> all the ways that the user might say "yes" or "no". However, you could
>> > >>> say that once the implementation has modified or normalized the
>> > >>> transcript that means it has some kind of interpretation, so putting a
>> > >>> normalized value in the interpretation slot should be fine. Nothing
>> > >>> says that the "interpretation" has to be a particularly fine-grained
>> > >>> interpretation, or one with a lot of structure.
>> > >>>
>> > >>>
>> > >>>
>> > >>> > -----Original Message-----
>> > >>> > From: Bjorn Bringert [mailto:bringert@google.com]
>> > >>> > Sent: Thursday, August 16, 2012 9:09 AM
>> > >>> > To: Hans Wennborg
>> > >>> > Cc: Conversational; public-speech-api@w3.org
>> > >>> > Subject: Re: SpeechRecognitionAlternative.interpretation when
>> > >>> > interpretation can't be provided
>> > >>> >
>> > >>> > I'm not sure that it has to be that strict in requiring that the
>> > >>> > value is the same as the "transcript" attribute. For example, an
>> > >>> > engine might return the words recognized in "transcript" and apply
>> > >>> > some extra textnorm to the text that it returns in "interpretation",
>> > >>> > e.g. converting digit words to digits ("three" -> "3"). Not sure if
>> > >>> > that's useful though.
>> > >>> >
>> > >>> > On Thu, Aug 16, 2012 at 1:58 PM, Hans Wennborg
>> > >>> > <hwennborg@google.com> wrote:
>> > >>> > > Yes, the raw text is in the 'transcript' attribute.
>> > >>> > >
>> > >>> > > The description of 'interpretation' is currently: "The
>> > >>> > > interpretation represents the semantic meaning from what the user
>> > >>> > > said. This might be determined, for instance, through the SISR
>> > >>> > > specification of semantics in a grammar."
>> > >>> > >
>> > >>> > > I propose that we change it to "The interpretation represents the
>> > >>> > > semantic meaning from what the user said. This might be
>> > >>> > > determined, for instance, through the SISR specification of
>> > >>> > > semantics in a grammar. If no semantic meaning can be determined,
>> > >>> > > the attribute must be a string with the same value as the
>> > >>> > > 'transcript' attribute."
>> > >>> > >
>> > >>> > > Does that sound good to everyone? If there are no objections, I'll
>> > >>> > > make the change to the draft next week.
>> > >>> > >
>> > >>> > > Thanks,
>> > >>> > > Hans
>> > >>> > >
>> > >>> > > On Wed, Aug 15, 2012 at 5:29 PM, Conversational
>> > >>> > > <dahl@conversational-technologies.com> wrote:
>> > >>> > >> I can't check the spec right now, but I assume there's already an
>> > >>> > >> attribute that currently is defined to contain the raw text. So I
>> > >>> > >> think we could say that if there's no interpretation the value of
>> > >>> > >> the interpretation attribute would be the same as the value of the
>> > >>> > >> "raw string" attribute.
>> > >>> > >>
>> > >>> > >> Sent from my iPhone
>> > >>> > >>
>> > >>> > >> On Aug 15, 2012, at 9:57 AM, Hans Wennborg <hwennborg@google.com>
>> > >>> > >> wrote:
>> > >>> > >>
>> > >>> > >>> OK, that would work I suppose.
>> > >>> > >>>
>> > >>> > >>> What would the spec text look like? Something like "[...] If no
>> > >>> > >>> semantic meaning can be determined, the attribute will be a string
>> > >>> > >>> representing the raw words that the user spoke."?
>> > >>> > >>>
>> > >>> > >>> On Wed, Aug 15, 2012 at 2:24 PM, Bjorn Bringert
>> > >>> > >>> <bringert@google.com> wrote:
>> > >>> > >>>> Yeah, that would be my preference too.
>> > >>> > >>>>
>> > >>> > >>>> On Wed, Aug 15, 2012 at 2:19 PM, Conversational
>> > >>> > >>>> <dahl@conversational-technologies.com> wrote:
>> > >>> > >>>>> If there isn't an interpretation I think it would make the most
>> > >>> > >>>>> sense for the attribute to contain the literal string result. I
>> > >>> > >>>>> believe this is what happens in VoiceXML.
>> > >>> > >>>>>
>> > >>> > >>>>>> My question is: for implementations that cannot provide an
>> > >>> > >>>>>> interpretation, what should the attribute's value be? null?
>> > >>> > >>>>>> undefined?
>> > >>> >
>> > >>> >
>> > >>> >
>> > >>> > --
>> > >>> > Bjorn Bringert
>> > >>> > Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
>> > >>> > Palace Road, London, SW1W 9TQ
>> > >>> > Registered in England Number: 3977902
>> > >>>
>> > >>>
>> > >>>
>> > >>
>> > >>
>> > >
>>
>
>
Received on Wednesday, 12 September 2012 00:34:11 UTC
