Re: EMMA in Speech API (was RE: Speech API: first editor's draft posted)

On 06/07/2012 07:12 PM, Young, Milan wrote:
> Perhaps only a small percentage of *developers* are interested in this feature, but I believe that a large percentage of *end-users* will be
> impacted by this feature.  That's because enterprise-grade applications are written by few but used by many.
>
> Every argument that I've heard for discarding this feature boils down to implementation.  Given that implementation is trivial, this sounds like an
> abuse of the community structure we are based on.  If we do not have a resolution to add this feature by this weekend, I will escalate to the W3C
> staff.

It is totally OK with me to require that, if the speech service doesn't provide EMMA, the UA wraps the result in a
simple EMMA document. That way the API stays consistent: some kind of EMMA document is always available.
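
To make the wrapping idea concrete, here is a rough sketch in TypeScript of what such a wrapper might produce when the
engine returns only a transcript and a confidence score. The SimpleResult type, the wrapInMinimalEmma and escapeXml
helpers, the "interp1" id and the engine URI are all made up for illustration; only the EMMA 1.0 namespace and the
acoustic/voice values for medium and mode come from the EMMA spec. The attribute set follows the minimum Milan
proposed (quoted below), plus emma:confidence.

```typescript
// Hypothetical shape of a bare recognition result; not taken from any spec.
interface SimpleResult {
  transcript: string;
  confidence: number; // expected to be in the range 0..1
}

// Namespace URI defined by EMMA 1.0.
const EMMA_NS = "http://www.w3.org/2003/04/emma";

// Escape characters that are unsafe in XML text and attribute values.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a minimal EMMA document (as a string) around a plain result.
// engineUri is an assumed identifier for the underlying speech engine.
function wrapInMinimalEmma(result: SimpleResult, engineUri: string): string {
  const text = escapeXml(result.transcript);
  return [
    `<emma:emma version="1.0" xmlns:emma="${EMMA_NS}">`,
    `  <emma:interpretation id="interp1"`,
    `      emma:process="${escapeXml(engineUri)}"`,
    `      emma:tokens="${text}"`,
    `      emma:confidence="${result.confidence}"`,
    `      emma:medium="acoustic" emma:mode="voice">`,
    `    <emma:literal>${text}</emma:literal>`,
    `  </emma:interpretation>`,
    `</emma:emma>`,
  ].join("\n");
}

// Use the engine's own EMMA when it exists, otherwise fall back to the wrapper.
function emmaOrWrapper(
  nativeEmma: string | null,
  result: SimpleResult,
  engineUri: string
): string {
  return nativeEmma ?? wrapInMinimalEmma(result, engineUri);
}
```

With something like this in the UA, a page would never see a null emma attribute; a real implementation would presumably
hand back a parsed Document rather than a string.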

-Olli


>
> -----Original Message----- From: Olli Pettay [mailto:Olli.Pettay@helsinki.fi] Sent: Thursday, June 07, 2012 8:27 AM To: Hans Wennborg Cc: Young,
> Milan; Deborah Dahl; Satish S; Bjorn Bringert; Glen Shires; public-speech-api@w3.org Subject: Re: EMMA in Speech API (was RE: Speech API: first
> editor's draft posted)
>
> On 06/07/2012 04:52 PM, Hans Wennborg wrote:
>> I still don't think UAs that use a speech engine that doesn't support EMMA should be required to provide a non-null emma attribute.
>>
>> I don't think the vast majority of web developers will care about this.
>>
>> For existing applications that rely on EMMA, there would already be significant work involved to port to the web and this API. For those cases,
>> checking for the null-case, and wrapping the results into EMMA using JavaScript shouldn't be a big deal.
>>
>> If there turns out to be a large demand from real web apps for the attribute to always be non-null, it would be easy to change the spec to
>> require that. Doing it the other way around, allowing web apps to rely on it now, and then change it to sometimes return null would be much
>> harder.
>>
>> Thanks, Hans
>
> It makes no sense to have this kind of optional feature. Either EMMA must be there or it must not (either one is OK with me).
>
>
> -Olli
>
>>
>>
>> On Wed, Jun 6, 2012 at 9:14 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>>> Since there are no objections, I suggest the following be added to the spec:
>>>
>>>
>>>
>>> Section 5.1:
>>>
>>> readonly attribute Document emma;
>>>
>>>
>>>
>>> Section 5.1.6 needs
>>>
>>> emma - EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation of this result.  The contents of this result could vary across UAs and
>>> recognition engines, but all implementations MUST at least expose the following:
>>>
>>> *       Valid XML document complete with EMMA namespace
>>>
>>> *       <emma:interpretation> tag(s) populated with the interpretation (e.g. emma:literal or slot values) and the following attributes: id,
>>> emma:process, emma:tokens, emma:medium, emma:mode.
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>> From: Young, Milan Sent: Wednesday, May 30, 2012 10:44 AM To: 'Deborah Dahl'; 'Satish S' Cc: 'Bjorn Bringert'; 'Glen Shires'; 'Hans Wennborg';
>>> public-speech-api@w3.org
>>>
>>>
>>> Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft posted)
>>>
>>>
>>>
>>> Thanks Deborah, that's clear.  The upshot is that we don't need to consider #3 as a use case for this specification.  But #1 and #4 still
>>> apply.
>>>
>>>
>>>
>>> Any disagreements, or can I start drafting this for the spec?
>>>
>>>
>>>
>>>
>>>
>>> From: Deborah Dahl [mailto:dahl@conversational-technologies.com]
>>>
>>> Sent: Wednesday, May 30, 2012 10:10 AM To: Young, Milan; 'Satish S' Cc: 'Bjorn Bringert'; 'Glen Shires'; 'Hans Wennborg';
>>> public-speech-api@w3.org
>>>
>>> Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft posted)
>>>
>>>
>>>
>>> I agree that use case 3 (comparing grammars) would be most easily achieved if the recognizer returned the emma:grammar information. However,
>>> if I were implementing use case 3 without getting emma:grammar from the recognizer, I think I would manually add the "emma:grammar" attribute
>>> to the minimal EMMA provided by the UA (because I know the grammar that I set for the recognizer). Then I would send the augmented EMMA off to
>>> the logging/tuning server for later analysis. Even though there's a manual step involved, it would be convenient to be able to add to existing
>>> EMMA rather than to construct the whole EMMA manually.
>>>
>>>
>>>
>>> From: Young, Milan [mailto:Milan.Young@nuance.com] Sent: Wednesday, May 30, 2012 11:37 AM To: Satish S Cc: Bjorn Bringert; Deborah Dahl; Glen
>>> Shires; Hans Wennborg; public-speech-api@w3.org Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft posted)
>>>
>>>
>>>
>>> I'm suggesting that if the UA doesn't integrate with a speech engine that supports EMMA, that it must provide a wrapper so that basic
>>> interoperability can be achieved.  In use case #1 (comparing speech engines), that means injecting an <emma:process> tag that contains the name
>>> of the underlying speech engine.
>>>
>>>
>>>
>>> I agree that use case #3 could not be achieved without a tight coupling with the engine.  If Deborah is OK with dropping this, so am I.
>>>
>>>
>>>
>>> I don't understand your point about use case #4.  Earlier you were arguing for a null/undefined value if the speech engine didn't natively
>>> support EMMA.  Obviously this would prevent the suggested use case.
>>>
>>>
>>> From: Satish S [mailto:satish@google.com] Sent: Wednesday, May 30, 2012 8:19 AM To: Young, Milan Cc: Bjorn Bringert; Deborah Dahl; Glen Shires;
>>> Hans Wennborg; public-speech-api@w3.org Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft posted)
>>>
>>>
>>>
>>> Satish, please take a look at the use cases below.  Items #1 and #3 cannot be achieved unless EMMA is always present.
>>>
>>>
>>>
>>> To clarify, are you suggesting that speech recognizers must always return EMMA to the UA, or are you suggesting if they don't the UA should
>>> create a wrapper EMMA object with just the utterance(s) and give that to the web page? If it is the latter then #1 and #3 can't be achieved
>>> anyway because the UA doesn't have enough information to create an EMMA wrapper with all possible data that the web app may want (specifically
>>> it wouldn't know about what to put in the emma:process and emma:fields given in those use cases). And if it is the former that seems out of
>>> scope of this CG.
>>>
>>>
>>>
>>> I'd like to add another use case #4.  Application needs to post the recognition result to server before proceeding in the dialog.  The server
>>> might be a traditional application server or it could be the controller in an MMI architecture.  EMMA is a standard serialized representation.
>>>
>>>
>>>
>>> If the server supports EMMA then my proposal should work because the web app would be receiving the EMMA Document as is.
>>>
>>>
>>>
>>> --
>>>
>>> Cheers
>>>
>>> Satish
>>
>

Received on Thursday, 7 June 2012 17:47:24 UTC