RE: EMMA in Speech API (was RE: Speech API: first editor's draft posted) from Deborah Dahl on 2012-05-30 (public-speech-api@w3.org from May 2012)

From: Deborah Dahl <dahl@conversational-technologies.com>
Date: Wed, 30 May 2012 13:50:32 -0400
To: "'Young, Milan'" <Milan.Young@nuance.com>, "'Satish S'" <satish@google.com>
Cc: "'Bjorn Bringert'" <bringert@google.com>, "'Glen Shires'" <gshires@google.com>, "'Hans Wennborg'" <hwennborg@google.com>, <public-speech-api@w3.org>
Message-ID: <00bf01cd3e8c$b5c9b090$215d11b0$@conversational-technologies.com>

I think it's still a use case in the sense that having even partial EMMA
would be convenient for the developer, but I don't feel that strongly about
including it, since there are other stronger use cases.

From: Young, Milan [mailto:Milan.Young@nuance.com] 
Sent: Wednesday, May 30, 2012 1:44 PM
To: Deborah Dahl; 'Satish S'
Cc: 'Bjorn Bringert'; 'Glen Shires'; 'Hans Wennborg';
public-speech-api@w3.org
Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

Thanks Deborah, that's clear.  The upshot is that we don't need to consider
#3 as a use case for this specification.  But #1 and #4 still apply.

Any disagreements, or can I start drafting this for the spec?

From: Deborah Dahl [mailto:dahl@conversational-technologies.com] 
Sent: Wednesday, May 30, 2012 10:10 AM
To: Young, Milan; 'Satish S'
Cc: 'Bjorn Bringert'; 'Glen Shires'; 'Hans Wennborg';
public-speech-api@w3.org
Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

I agree that use case 3 (comparing grammars) would be most easily achieved
if the recognizer returned the emma:grammar information. However, if I were
implementing use case 3 without getting emma:grammar from the recognizer, I
think I would manually add the "emma:grammar" attribute to the minimal EMMA
provided by the UA (because I know the grammar that I set for the
recognizer). Then I would send the augmented EMMA off to the logging/tuning
server for later analysis. Even though there's a manual step involved, it
would be convenient to be able to add to existing EMMA rather than to
construct the whole EMMA manually.
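
The manual step described above might look roughly like the following
Python sketch. It is only an illustration under several assumptions that
do not come from this thread: the sample "minimal EMMA" document, the
grammar URI, and the helper name add_grammar are all made up, and the
grammar is recorded as an emma:grammar element referenced through
emma:grammar-ref, which is one plausible reading of "adding emma:grammar".

```python
import xml.etree.ElementTree as ET

# EMMA 1.0 namespace; registering the prefix keeps "emma:" in the output.
EMMA_NS = "http://www.w3.org/2003/04/emma"
ET.register_namespace("emma", EMMA_NS)

# Hypothetical minimal EMMA, standing in for what a UA might hand to the page.
MINIMAL_EMMA = (
    '<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">'
    '<emma:interpretation id="int1" emma:tokens="flights to boston"/>'
    "</emma:emma>"
)


def add_grammar(emma_xml: str, grammar_uri: str, grammar_id: str = "gram1") -> str:
    """Return a copy of emma_xml annotated with the grammar the app set itself."""
    root = ET.fromstring(emma_xml)
    # Declare the grammar once at the top of the document.
    grammar = ET.Element(f"{{{EMMA_NS}}}grammar", {"id": grammar_id, "ref": grammar_uri})
    root.insert(0, grammar)
    # Point every interpretation at that grammar declaration.
    for interp in root.iter(f"{{{EMMA_NS}}}interpretation"):
        interp.set(f"{{{EMMA_NS}}}grammar-ref", grammar_id)
    return ET.tostring(root, encoding="unicode")


# Hypothetical grammar URI known to the application.
augmented = add_grammar(MINIMAL_EMMA, "http://example.com/grammars/flights.grxml")
```

The augmented string is what would then be sent to the logging/tuning
server; the rest of the UA-provided EMMA is left untouched.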

From: Young, Milan [mailto:Milan.Young@nuance.com] 
Sent: Wednesday, May 30, 2012 11:37 AM
To: Satish S
Cc: Bjorn Bringert; Deborah Dahl; Glen Shires; Hans Wennborg;
public-speech-api@w3.org
Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

I'm suggesting that if the UA doesn't integrate with a speech engine that
supports EMMA, it must provide a wrapper so that basic interoperability
can be achieved.  In use case #1 (comparing speech engines), that means
injecting an <emma:process> tag that contains the name of the underlying
speech engine.
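
As a rough sketch of such a wrapper (in Python for brevity, not as a
proposal for how a UA would implement it), the code below builds a
minimal EMMA document from plain recognition alternatives. The function
name wrap_results, the engine URI, and the sample utterances are
hypothetical. It also assumes that the engine is identified with an
emma:process annotation written as an attribute on each interpretation,
rather than as a separate tag.

```python
import xml.etree.ElementTree as ET

# EMMA 1.0 namespace; registering the prefix keeps "emma:" in the output.
EMMA_NS = "http://www.w3.org/2003/04/emma"
ET.register_namespace("emma", EMMA_NS)


def q(name: str) -> str:
    """Qualify a local name with the EMMA namespace for ElementTree."""
    return f"{{{EMMA_NS}}}{name}"


def wrap_results(alternatives, engine_uri: str) -> str:
    """Wrap (utterance, confidence) pairs in a minimal EMMA N-best list."""
    root = ET.Element(q("emma"), {"version": "1.0"})
    one_of = ET.SubElement(root, q("one-of"), {"id": "nbest"})
    for i, (text, confidence) in enumerate(alternatives, start=1):
        interp = ET.SubElement(
            one_of,
            q("interpretation"),
            {
                "id": f"int{i}",
                q("tokens"): text,
                q("confidence"): str(confidence),
                # Assumption: the engine is named through emma:process.
                q("process"): engine_uri,
            },
        )
        # With no semantic result available, carry the utterance as a literal.
        ET.SubElement(interp, q("literal")).text = text
    return ET.tostring(root, encoding="unicode")


# Hypothetical engine identifier and recognition alternatives.
wrapped = wrap_results(
    [("flights to boston", 0.9), ("lights to boston", 0.4)],
    "http://example.com/engines/acme-asr",
)
```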

I agree that use case #3 could not be achieved without a tight coupling with
the engine.  If Deborah is OK with dropping this, so am I.

I don't understand your point about use case #4.  Earlier you were arguing
for a null/undefined value if the speech engine didn't natively support
EMMA.  Obviously this would prevent the suggested use case.

From: Satish S [mailto:satish@google.com] 
Sent: Wednesday, May 30, 2012 8:19 AM
To: Young, Milan
Cc: Bjorn Bringert; Deborah Dahl; Glen Shires; Hans Wennborg;
public-speech-api@w3.org
Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

> Satish, please take a look at the use cases below.  Items #1 and #3 cannot
> be achieved unless EMMA is always present.

To clarify, are you suggesting that speech recognizers must always return
EMMA to the UA, or are you suggesting if they don't the UA should create a
wrapper EMMA object with just the utterance(s) and give that to the web
page? If it is the latter then #1 and #3 can't be achieved anyway because
the UA doesn't have enough information to create an EMMA wrapper with all
possible data that the web app may want (specifically it wouldn't know about
what to put in the emma:process and emma:fields given in those use cases).
And if it is the former that seems out of scope of this CG.

> I'd like to add another use case #4.  Application needs to post the
> recognition result to server before proceeding in the dialog.  The server
> might be a traditional application server or it could be the controller in
> an MMI architecture.  EMMA is a standard serialized representation.

If the server supports EMMA then my proposal should work because the web app
would be receiving the EMMA Document as is.
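
To make use case #4 concrete, here is a minimal Python sketch of passing
the EMMA document through to a server unchanged. The endpoint URL, the
sample document, and the helper name build_emma_post are hypothetical,
and the application/emma+xml content type is assumed to be what the
server expects. The request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical EMMA document, as received from the recognizer via the UA.
EMMA_DOC = (
    '<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">'
    '<emma:interpretation id="int1" emma:tokens="flights to boston"/>'
    "</emma:emma>"
)


def build_emma_post(url: str, emma_xml: str) -> urllib.request.Request:
    """Build a POST request that forwards the EMMA document as is."""
    return urllib.request.Request(
        url,
        data=emma_xml.encode("utf-8"),  # body is the unmodified EMMA text
        headers={"Content-Type": "application/emma+xml"},  # assumed media type
        method="POST",
    )


# Hypothetical application server or MMI controller endpoint.
request = build_emma_post("https://example.com/dialog/next-turn", EMMA_DOC)
# Sending it would be: urllib.request.urlopen(request)
```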

--

Cheers

Satish

Received on Wednesday, 30 May 2012 17:51:05 UTC