
RE: EMMA in Speech API (was RE: Speech API: first editor's draft posted)

From: Deborah Dahl <dahl@conversational-technologies.com>
Date: Thu, 7 Jun 2012 10:57:55 -0400
To: "'Hans Wennborg'" <hwennborg@google.com>, "'Young, Milan'" <Milan.Young@nuance.com>
Cc: "'Satish S'" <satish@google.com>, "'Bjorn Bringert'" <bringert@google.com>, "'Glen Shires'" <gshires@google.com>, <public-speech-api@w3.org>
Message-ID: <011201cd44bd$ec804770$c580d650$@conversational-technologies.com>


> -----Original Message-----
> From: Hans Wennborg [mailto:hwennborg@google.com]
> Sent: Thursday, June 07, 2012 9:53 AM
> To: Young, Milan
> Cc: Deborah Dahl; Satish S; Bjorn Bringert; Glen Shires;
> public-speech-api@w3.org
> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft
> posted)
> 
> I still don't think UAs that use a speech engine that doesn't support
> EMMA should be required to provide a non-null emma attribute.
> 
> I don't think the vast majority of web developers will care about this.
> 
> For existing applications that rely on EMMA, there would already be
> significant work involved to port to the web and this API. For those
> cases, checking for the null-case, and wrapping the results into EMMA
> using JavaScript shouldn't be a big deal.

If the results were wrapped in EMMA once, in the API implementation,
developers who want EMMA could simply ask for it, instead of checking for
null and wrapping the results themselves every time they get a speech
recognition result. Allowing a null result also introduces an inconsistency
in UA behavior, with no obvious benefit beyond saving some minor effort in
the API implementation.
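
For illustration only, the per-result check-and-wrap that a page would need
when emma can be null might look like the sketch below. The PlainResult
shape, the wrapInEmma and emmaOrWrapped helpers, the id value, and the engine
URI are all hypothetical and not part of any draft; the attribute list
follows the minimal set proposed in the quoted message further down.

```typescript
// Hypothetical shape of a recognition result from an engine without EMMA.
interface PlainResult {
  transcript: string;
  confidence: number; // assumed to be in the range 0..1
}

// Namespace URI defined by the EMMA 1.0 Recommendation.
const EMMA_NS = "http://www.w3.org/2003/04/emma";

// Escape characters that are unsafe in XML text and attribute values.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a minimal EMMA 1.0 document as a string.
// engineUri is an assumed identifier for the underlying speech engine.
function wrapInEmma(result: PlainResult, engineUri: string): string {
  const text = escapeXml(result.transcript);
  return [
    `<emma:emma version="1.0" xmlns:emma="${EMMA_NS}">`,
    `  <emma:interpretation id="interp1"`,
    `      emma:process="${escapeXml(engineUri)}"`,
    `      emma:tokens="${text}"`,
    `      emma:medium="acoustic" emma:mode="voice"`,
    `      emma:confidence="${result.confidence}">`,
    `    <emma:literal>${text}</emma:literal>`,
    `  </emma:interpretation>`,
    `</emma:emma>`,
  ].join("\n");
}

// The fallback every page would repeat if emma were allowed to be null:
// use the engine-provided EMMA when present, otherwise wrap by hand.
function emmaOrWrapped(
  emma: string | null,
  result: PlainResult,
  engineUri: string
): string {
  return emma !== null ? emma : wrapInEmma(result, engineUri);
}
```

A page would call emmaOrWrapped(...) on every result event; if the UA did
the wrapping itself, that step would disappear from application code.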

> 
> If there turns out to be a large demand from real web apps for the
> attribute to always be non-null, it would be easy to change the spec
> to require that. Doing it the other way around, allowing web apps to
> rely on it now, and then change it to sometimes return null would be
> much harder.

I don't know why it would ever be useful to go back to a situation where the
API sometimes returns null. 
> 
> Thanks,
> Hans
> 
> 
> On Wed, Jun 6, 2012 at 9:14 PM, Young, Milan <Milan.Young@nuance.com>
> wrote:
> > Since there are no objections, I suggest the following be added to the
> > spec:
> >
> >
> >
> > Section 5.1:
> >
> >   readonly attribute Document emma;
> >
> >
> >
> > Section 5.1.6 needs
> >
> >   emma – EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation of
> > this result.  The contents of this result could vary across UAs and
> > recognition engines, but all implementations MUST at least expose the
> > following:
> >
> > ·       Valid XML document complete with EMMA namespace
> >
> > ·       <emma:interpretation> tag(s) populated with the interpretation
> > (e.g. emma:literal or slot values) and the following attributes: id,
> > emma:process, emma:tokens, emma:medium, emma:mode.
> >
> >
> >
> > Thanks
> >
> >
> >
> >
> >
> > From: Young, Milan
> > Sent: Wednesday, May 30, 2012 10:44 AM
> > To: 'Deborah Dahl'; 'Satish S'
> > Cc: 'Bjorn Bringert'; 'Glen Shires'; 'Hans Wennborg';
> > public-speech-api@w3.org
> >
> >
> > Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> > draft posted)
> >
> >
> >
> > Thanks Deborah, that’s clear.  The upshot is that we don’t need to
> > consider #3 as a use case for this specification.  But #1 and #4 still
> > apply.
> >
> >
> >
> > Any disagreements, or can I start drafting this for the spec?
> >
> >
> >
> >
> >
> > From: Deborah Dahl [mailto:dahl@conversational-technologies.com]
> >
> > Sent: Wednesday, May 30, 2012 10:10 AM
> > To: Young, Milan; 'Satish S'
> > Cc: 'Bjorn Bringert'; 'Glen Shires'; 'Hans Wennborg';
> > public-speech-api@w3.org
> >
> > Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> > draft posted)
> >
> >
> >
> > I agree that use case 3 (comparing grammars) would be most easily
> > achieved if the recognizer returned the emma:grammar information.
> > However, if I were implementing use case 3 without getting emma:grammar
> > from the recognizer, I think I would manually add the “emma:grammar”
> > attribute to the minimal EMMA provided by the UA (because I know the
> > grammar that I set for the recognizer). Then I would send the augmented
> > EMMA off to the logging/tuning server for later analysis. Even though
> > there’s a manual step involved, it would be convenient to be able to add
> > to existing EMMA rather than to construct the whole EMMA manually.
> >
> >
> >
> > From: Young, Milan [mailto:Milan.Young@nuance.com]
> > Sent: Wednesday, May 30, 2012 11:37 AM
> > To: Satish S
> > Cc: Bjorn Bringert; Deborah Dahl; Glen Shires; Hans Wennborg;
> > public-speech-api@w3.org
> > Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> > draft posted)
> >
> >
> >
> > I’m suggesting that if the UA doesn’t integrate with a speech engine
> > that supports EMMA, that it must provide a wrapper so that basic
> > interoperability can be achieved.  In use case #1 (comparing speech
> > engines), that means injecting an <emma:process> tag that contains the
> > name of the underlying speech engine.
> >
> >
> >
> > I agree that use case #3 could not be achieved without a tight coupling
> > with the engine.  If Deborah is OK with dropping this, so am I.
> >
> >
> >
> > I don’t understand your point about use case #4.  Earlier you were
> > arguing for a null/undefined value if the speech engine didn’t natively
> > support EMMA.  Obviously this would prevent the suggested use case.
> >
> >
> >
> >
> >
> >
> >
> > From: Satish S [mailto:satish@google.com]
> > Sent: Wednesday, May 30, 2012 8:19 AM
> > To: Young, Milan
> > Cc: Bjorn Bringert; Deborah Dahl; Glen Shires; Hans Wennborg;
> > public-speech-api@w3.org
> > Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
> > draft posted)
> >
> >
> >
> > Satish, please take a look at the use cases below.  Items #1 and #3
> > cannot be achieved unless EMMA is always present.
> >
> >
> >
> > To clarify, are you suggesting that speech recognizers must always
> > return EMMA to the UA, or are you suggesting if they don't the UA should
> > create a wrapper EMMA object with just the utterance(s) and give that to
> > the web page? If it is the latter then #1 and #3 can't be achieved anyway
> > because the UA doesn't have enough information to create an EMMA wrapper
> > with all possible data that the web app may want (specifically it
> > wouldn't know about what to put in the emma:process and emma:fields given
> > in those use cases). And if it is the former that seems out of scope of
> > this CG.
> >
> >
> >
> > I'd like to add another use case #4.  Application needs to post the
> > recognition result to server before proceeding in the dialog.  The
> > server might be a traditional application server or it could be the
> > controller in an MMI architecture.  EMMA is a standard serialized
> > representation.
> >
> >
> >
> > If the server supports EMMA then my proposal should work because the
> > web app would be receiving the EMMA Document as is.
> >
> >
> >
> > --
> >
> > Cheers
> >
> > Satish
Received on Thursday, 7 June 2012 14:58:34 GMT
