RE: EMMA in Speech API (was RE: Speech API: first editor's draft posted) from Deborah Dahl on 2012-06-12 (public-speech-api@w3.org from June 2012)

From: Deborah Dahl <dahl@conversational-technologies.com>
Date: Tue, 12 Jun 2012 10:51:04 -0400
To: "'Satish S'" <satish@google.com>, "'Young, Milan'" <Milan.Young@nuance.com>
Cc: "'Hans Wennborg'" <hwennborg@google.com>, <olli@pettay.fi>, "'Bjorn Bringert'" <bringert@google.com>, "'Glen Shires'" <gshires@google.com>, <public-speech-api@w3.org>
Message-ID: <00aa01cd48aa$cb5ccff0$62166fd0$@conversational-technologies.com>
I'm not sure why a web developer would care whether the EMMA they get from
the UA is exactly what the speech recognizer supplied. On the other hand, I
can think of useful things that the UA could add to the EMMA, for example,
something in the <info> tag about the UA that the request originated from,
that the recognizer wouldn't necessarily know about. In that case you might
actually want modified EMMA.

I agree with Satish's point that we might think of other use cases that
require specific EMMA attributes, so I don't really see the need to call out
those specific attributes.

 

From: Satish S [mailto:satish@google.com] 
Sent: Tuesday, June 12, 2012 5:22 AM
To: Young, Milan
Cc: Deborah Dahl; Hans Wennborg; olli@pettay.fi; Bjorn Bringert; Glen
Shires; public-speech-api@w3.org
Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

 

I believe it is more useful for web developers if the UA is required to
pass through the EMMA structure from the recognizer as is, so they can rest
assured the UA doesn't modify what the recognizer sends. To that effect,
here is a modified proposal (version 4) based on Milan's version 3:

---------------
Section 5.1:
  readonly attribute Document emma;

Section 5.1.6 needs
  emma - EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation of
this result.  The contents of this result could vary across UAs and
recognition engines, but all implementations MUST expose a valid XML
document complete with EMMA namespace.
- UA implementations for recognizers that supply EMMA MUST pass that EMMA
structure directly.
- UA implementations for recognizers that do not supply EMMA SHOULD expose
the following:
 * <emma:interpretation> tag(s) populated with the interpretation (e.g.
emma:literal or slot values)
 * The following attributes on the <emma:interpretation> tag: id,
emma:process, emma:tokens, emma:medium, emma:mode.
---------------
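
As a rough illustration of the fallback case in the proposal above, here is a
minimal sketch in Python (standard library only) of the kind of document a UA
might synthesize when the recognizer supplies no EMMA, and of how an
application could read the listed attributes. The sample utterance, the
emma:process URI, and the helper name read_interpretations are assumptions made
for illustration; only the namespace URI and the element and attribute names
come from EMMA 1.0.

```python
import xml.etree.ElementTree as ET

# EMMA 1.0 namespace URI.
EMMA_NS = "http://www.w3.org/2003/04/emma"

# Hypothetical document a UA might build for a recognizer with no EMMA output.
# The utterance and the emma:process URI are made-up sample values.
EMMA_XML = f"""\
<emma:emma version="1.0" xmlns:emma="{EMMA_NS}">
  <emma:interpretation id="int1"
      emma:process="http://example.com/recognizer"
      emma:tokens="flights to boston"
      emma:medium="acoustic"
      emma:mode="voice">
    <emma:literal>flights to boston</emma:literal>
  </emma:interpretation>
</emma:emma>
"""


def read_interpretations(xml_text):
    """Return one dict per <emma:interpretation> with the proposed attributes."""
    root = ET.fromstring(xml_text)
    results = []
    for interp in root.iter(f"{{{EMMA_NS}}}interpretation"):
        literal = interp.find(f"{{{EMMA_NS}}}literal")
        results.append({
            # "id" is un-namespaced; the emma:* attributes are namespaced.
            "id": interp.get("id"),
            "process": interp.get(f"{{{EMMA_NS}}}process"),
            "tokens": interp.get(f"{{{EMMA_NS}}}tokens"),
            "medium": interp.get(f"{{{EMMA_NS}}}medium"),
            "mode": interp.get(f"{{{EMMA_NS}}}mode"),
            "literal": literal.text if literal is not None else None,
        })
    return results


if __name__ == "__main__":
    for item in read_interpretations(EMMA_XML):
        print(item)
```

Attributes that a given UA or recognizer omits simply come back as None here,
which matches the SHOULD (rather than MUST) wording for the fallback list.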

Milan, the list of attributes mentioned in the last bullet has been gathered
from the use cases mentioned in this thread. This list can change if we
think of more use cases going forward. So should we even list them at all, or
is the MUST clause in the first point sufficient?

Cheers

Satish



On Mon, Jun 11, 2012 at 7:38 PM, Young, Milan <Milan.Young@nuance.com>
wrote:

Is there consensus on the following (version 3) proposal:

 

Section 5.1:

  readonly attribute Document emma;

 

Section 5.1.6 needs

  emma - EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation of
this result.  The contents of this result could vary across UAs and
recognition engines, but all implementations MUST expose a valid XML
document complete with EMMA namespace.  Implementations SHOULD expose the
following:

  * <emma:interpretation> tag(s) populated with the interpretation (e.g.
emma:literal or slot values)

  * The following attributes on the <emma:interpretation> tag: id,
emma:process, emma:tokens, emma:medium, emma:mode.

 

Thanks

 

From: Deborah Dahl [mailto:dahl@conversational-technologies.com] 
Sent: Monday, June 11, 2012 11:29 AM
To: 'Satish S'; Young, Milan
Cc: 'Hans Wennborg'; olli@pettay.fi; 'Bjorn Bringert'; 'Glen Shires';
public-speech-api@w3.org
Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

 

Glen pointed out to me offline that Satish was asking about whether the
attributes that are required for the use cases we've been discussing are
required in EMMA 1.0.  I have to admit that I've lost track of what use
cases we're talking about, but I think at least 3 of them are listed in
http://lists.w3.org/Archives/Public/public-speech-api/2012May/0037.html .
Those use cases require "emma:process", the timestamps, and "emma:grammar",
which are not required in EMMA 1.0. The other use case we might be talking
about is described in
http://lists.w3.org/Archives/Public/public-speech-api/2012Apr/0014.html,
where an existing dialog manager or logger expects to receive speech
recognition results as an EMMA document, in which case no specific
attributes are required.

 

From: Deborah Dahl [mailto:dahl@conversational-technologies.com] 
Sent: Monday, June 11, 2012 1:40 PM
To: 'Satish S'; 'Young, Milan'
Cc: 'Hans Wennborg'; olli@pettay.fi; 'Bjorn Bringert'; 'Glen Shires';
public-speech-api@w3.org
Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

 

Hi Satish,

All of the EMMA attributes that have been proposed for the use cases we've
discussed are already part of the EMMA 1.0 standard. That said, the
Multimodal Interaction Working Group is always interested in receiving
comments and suggestions that relate to possible new EMMA capabilities,
which can be posted to www-multimodal@w3.org. 

Regards,

Debbie

 

From: Satish S [mailto:satish@google.com] 
Sent: Monday, June 11, 2012 12:18 PM
To: Young, Milan
Cc: Hans Wennborg; Deborah Dahl; olli@pettay.fi; Bjorn Bringert; Glen
Shires; public-speech-api@w3.org
Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

 

If there are EMMA attributes that are mandatory for specific use cases, we
should post to the MMI WG and get those changes into the EMMA recommendation
published at http://www.w3.org/TR/emma/. I'm sure they will be interested in
incorporating them and Deborah Dahl can help as well since she is one of the
authors.

 

Cheers
Satish

On Mon, Jun 11, 2012 at 4:16 PM, Young, Milan <Milan.Young@nuance.com>
wrote:

Hello Hans,

I did respond to this thread, but it got forked.  The upshot is that we
should go with my second (most recent) proposal, not my first proposal (that
Satish supported).  The reason is that the first proposal did not allow us
to achieve the interoperability use cases that Deborah put forward.

To address Satish's most recent argument, the likelihood of an
application failing because the EMMA result contains a couple of extra
attributes is small.  This is because 1) most EMMA implementations support
these attributes already, 2) we're dealing with XML, which abstracts
low-level parsing, and 3) if an application did fail, the fix would be trivial.
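
The XML point can be sketched as follows, assuming an application that only
looks up emma:tokens by namespace and name. This is an illustrative Python
example using only the standard library; the helper name first_tokens and both
sample documents are hypothetical, and the extra attributes shown
(emma:confidence, emma:start, emma:end) are ones defined in EMMA 1.0.

```python
import xml.etree.ElementTree as ET

# EMMA 1.0 namespace URI.
EMMA_NS = "http://www.w3.org/2003/04/emma"

# Sample result carrying only the attribute the application needs.
MINIMAL = f"""\
<emma:emma version="1.0" xmlns:emma="{EMMA_NS}">
  <emma:interpretation id="int1" emma:tokens="call home"/>
</emma:emma>
"""

# Same result with additional annotations the application never asked for.
EXTENDED = f"""\
<emma:emma version="1.0" xmlns:emma="{EMMA_NS}">
  <emma:interpretation id="int1" emma:tokens="call home"
      emma:confidence="0.82" emma:start="1339512664000"
      emma:end="1339512665500" emma:medium="acoustic" emma:mode="voice"/>
</emma:emma>
"""


def first_tokens(xml_text):
    """Return emma:tokens of the first interpretation, or None if absent."""
    root = ET.fromstring(xml_text)
    interp = root.find(f".//{{{EMMA_NS}}}interpretation")
    if interp is None:
        return None
    # Lookup is by qualified name, so unrelated attributes are never touched.
    return interp.get(f"{{{EMMA_NS}}}tokens")


if __name__ == "__main__":
    print(first_tokens(MINIMAL))
    print(first_tokens(EXTENDED))
```

Because the lookup is by qualified name, both documents yield the same value;
an application would only be affected if it validated against a closed list of
attributes.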

Thanks



-----Original Message-----
From: Hans Wennborg [mailto:hwennborg@google.com]
Sent: Monday, June 11, 2012 2:56 AM
To: Deborah Dahl
Cc: Satish S; olli@pettay.fi; Young, Milan; Bjorn Bringert; Glen Shires;
public-speech-api@w3.org
Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft
posted)

Do we have agreement on this? If there are no objections, I'll update the
spec with the text Satish posted on the 8th (with DOMString substituted with
Document):

----
Addition to SpeechRecognitionResult (section 5.1)

 readonly attribute Document emma;

And the corresponding addition to 5.1.6:
 emma - A string representation of the XML-based <link>EMMA 1.0</link>
result. (link points to http://www.w3.org/TR/emma/)
----

Thanks,
Hans

On Fri, Jun 8, 2012 at 2:32 PM, Deborah Dahl
<dahl@conversational-technologies.com> wrote:
> I agree that Document would be more useful.
>
>
>
> From: Satish S [mailto:satish@google.com]
> Sent: Friday, June 08, 2012 5:18 AM
> To: Hans Wennborg
> Cc: olli@pettay.fi; Young, Milan; Deborah Dahl; Bjorn Bringert; Glen
> Shires; public-speech-api@w3.org
>
>
> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
>
>
> Yes that is correct, it should be
>
>   readonly attribute Document emma;
>
>
> Cheers
> Satish
>
> On Fri, Jun 8, 2012 at 10:04 AM, Hans Wennborg <hwennborg@google.com> wrote:
>
> On Fri, Jun 8, 2012 at 12:31 AM, Satish S <satish@google.com> wrote:
>> In any case, looks like there is enough interest both from speech &
>> browser vendors to have this attribute always non-null. So I'm fine
>> making it so. I like the first proposal from Milan:
>> ----
>> Addition to SpeechRecognitionResult (section 5.1)
>>
>>  readonly attribute DOMString emma;
>>
>> And the corresponding addition to 5.1.6:
>>  emma - A string representation of the XML-based <link>EMMA
>> 1.0</link> result. (link points to http://www.w3.org/TR/emma/)
>> ----
>>
>> This spec proposal shouldn't mandate specific fields any more than
>> what EMMA does already so that web apps can point to existing
>> recognizers and get EMMA data in the same format as they would get
>> otherwise.
>
> Earlier in the thread, I thought we decided that it was better to make
> the emma attribute be of type Document rather than DOMString?
Received on Tuesday, 12 June 2012 14:51:44 UTC