- From: Young, Milan <Milan.Young@nuance.com>
- Date: Tue, 12 Jun 2012 16:35:31 +0000
- To: Hans Wennborg <hwennborg@google.com>
- CC: Deborah Dahl <dahl@conversational-technologies.com>, Satish S <satish@google.com>, "olli@pettay.fi" <olli@pettay.fi>, Bjorn Bringert <bringert@google.com>, Glen Shires <gshires@google.com>, "public-speech-api@w3.org" <public-speech-api@w3.org>
I had suggested that we add a link to the use cases. Do we want to capture those in the spec or another document?

Thanks

-----Original Message-----
From: Hans Wennborg [mailto:hwennborg@google.com]
Sent: Tuesday, June 12, 2012 8:56 AM
To: Young, Milan
Cc: Deborah Dahl; Satish S; olli@pettay.fi; Bjorn Bringert; Glen Shires; public-speech-api@w3.org
Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's draft posted)

Thanks all! I've updated the spec to add the emma attribute:
http://dvcs.w3.org/hg/speech-api/rev/ae432e2c84f7

 - Hans

On Tue, Jun 12, 2012 at 4:41 PM, Young, Milan <Milan.Young@nuance.com> wrote:
> I'm also happy with the new text. I suggest we add a link to an
> appendix or something for the use cases. At present we have:
>
> Use case 1: I'm testing different speech recognition services. I would
> like to know which service processed the speech associated with a
> particular result, so that I can compare the services for accuracy. I
> can use the emma:process parameter for that.
>
> Use case 2: I want the system to dynamically slow down its TTS for
> users who speak more slowly. The EMMA timestamps, duration, and token
> parameters can be used to determine the speech rate for a particular
> utterance.
>
> Use case 3: I'm testing several different grammars to compare their
> accuracy. I use the emma:grammar parameter to record which grammar was
> used for each result.
>
> Use case 4: My application server is based on an MMI architecture
> (link: http://www.w3.org/TR/mmi-arch/), and uses EMMA documents to
> communicate results. I POST the EMMA results to the server in order
> to derive the next state in the dialog.
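The four use cases above each hinge on reading specific EMMA annotations out of the result document. As a rough sketch of how a client could do that: the engine URI, grammar URI, and attribute values below are invented for illustration, and the attribute names (emma:process, emma:grammar, emma:tokens, the timestamps) follow this thread's usage rather than any particular engine's output.

```python
# Sketch: mining an EMMA result for use cases 1-3 above.
# The sample document is hand-written; real engines will differ.
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

sample = """
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1"
      emma:process="http://example.com/engines/v2"
      emma:grammar="http://example.com/grammars/flights.grxml"
      emma:tokens="flights from boston to denver"
      emma:start="1339517000000" emma:end="1339517002500"
      emma:medium="acoustic" emma:mode="voice">
    <emma:literal>flights from boston to denver</emma:literal>
  </emma:interpretation>
</emma:emma>
"""

root = ET.fromstring(sample)
interp = root.find(f"{{{EMMA_NS}}}interpretation")

# Use cases 1 and 3: which engine and grammar produced this result?
engine = interp.get(f"{{{EMMA_NS}}}process")
grammar = interp.get(f"{{{EMMA_NS}}}grammar")

# Use case 2: speech rate (words per second) from timestamps + tokens.
tokens = interp.get(f"{{{EMMA_NS}}}tokens").split()
duration_s = (int(interp.get(f"{{{EMMA_NS}}}end")) -
              int(interp.get(f"{{{EMMA_NS}}}start"))) / 1000.0
rate = len(tokens) / duration_s

print(engine, grammar, rate)
```

Use case 4 would simply POST the serialized document to the MMI dialog server unchanged.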
>
> Thanks
>
> From: Deborah Dahl [mailto:dahl@conversational-technologies.com]
> Sent: Tuesday, June 12, 2012 8:32 AM
> To: 'Satish S'; Young, Milan
> Cc: 'Hans Wennborg'; olli@pettay.fi; 'Bjorn Bringert'; 'Glen Shires';
> public-speech-api@w3.org
> Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> That seems good to me. On the point of the UA modifying the EMMA, I
> think it's ok if we prohibit the UA from modifying the EMMA, because
> the application can certainly make its own modifications if it wants
> extra information added, for example, for logging purposes, or to
> attach the name of the application, or whatever.
>
> If we want to propose to the MMIWG that some EMMA attributes should be
> obligatory to support specific use cases, we can do that. I think we
> might want to wait before we put together a proposal, though, in case
> we think of other use cases.
>
> From: Satish S [mailto:satish@google.com]
> Sent: Tuesday, June 12, 2012 11:09 AM
> To: Young, Milan
> Cc: Deborah Dahl; Hans Wennborg; olli@pettay.fi; Bjorn Bringert; Glen
> Shires; public-speech-api@w3.org
> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> Thanks Milan and Deborah. Looks like we agree on the following language.
> Could you confirm?
>
> Section 5.1:
> readonly attribute Document emma;
>
> Section 5.1.6 needs
> emma - EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation
> of this result. The contents of this result could vary across UAs and
> recognition engines, but all implementations MUST expose a valid XML
> document complete with EMMA namespace. UA implementations for
> recognizers that supply EMMA MUST pass that EMMA structure directly.
>
> Cheers
> Satish
>
> On Tue, Jun 12, 2012 at 3:59 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>
> I'm also fine with dropping the specific attributes and instead
> attaching a set of EMMA use cases.
>
> But I'm wary of the UA modifying the EMMA document: 1) this is
> starting to get into non-trivial domains with encodings and such;
> 2) the application could easily attach UA information to the log.
>
> From: Deborah Dahl [mailto:dahl@conversational-technologies.com]
> Sent: Tuesday, June 12, 2012 7:51 AM
> To: 'Satish S'; Young, Milan
> Cc: 'Hans Wennborg'; olli@pettay.fi; 'Bjorn Bringert'; 'Glen Shires';
> public-speech-api@w3.org
> Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> I'm not sure why a web developer would care whether the EMMA they get
> from the UA is exactly what the speech recognizer supplied. On the
> other hand, I can think of useful things that the UA could add to the
> EMMA, for example, something in the <info> tag about the UA that the
> request originated from, which the recognizer wouldn't necessarily know
> about. In that case you might actually want modified EMMA.
>
> I agree with Satish's point that we might think of other use cases
> that require specific EMMA attributes, so I don't really see the need
> to call out those specific attributes.
>
> From: Satish S [mailto:satish@google.com]
> Sent: Tuesday, June 12, 2012 5:22 AM
> To: Young, Milan
> Cc: Deborah Dahl; Hans Wennborg; olli@pettay.fi; Bjorn Bringert; Glen
> Shires; public-speech-api@w3.org
> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> I believe it is more useful for web developers if the UA is required
> to pass through the EMMA structure from the recognizer as is, so they
> can rest assured the UA doesn't modify what the recognizer sends. To
> that effect, here is a modified proposal (version 4) based on Milan's
> version 3:
>
> ---------------
> Section 5.1:
> readonly attribute Document emma;
>
> Section 5.1.6 needs
> emma - EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation
> of this result.
> The contents of this result could vary across UAs and
> recognition engines, but all implementations MUST expose a valid XML
> document complete with EMMA namespace.
> - UA implementations for recognizers that supply EMMA MUST pass that
>   EMMA structure directly.
> - UA implementations for recognizers that do not supply EMMA SHOULD
>   expose the following:
>   * <emma:interpretation> tag(s) populated with the interpretation
>     (e.g. emma:literal or slot values)
>   * The following attributes on the <emma:interpretation> tag: id,
>     emma:process, emma:tokens, emma:medium, emma:mode.
> ---------------
>
> Milan, the list of attributes mentioned in the last bullet has been
> gathered from the use cases mentioned in this thread. This list can
> change if we think of more use cases going forward. So should we even
> list them at all, or is the MUST clause in the first point sufficient?
>
> Cheers
> Satish
>
> On Mon, Jun 11, 2012 at 7:38 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>
> Is there consensus on the following (version 3) proposal:
>
> Section 5.1:
> readonly attribute Document emma;
>
> Section 5.1.6 needs
> emma - EMMA 1.0 (link to http://www.w3.org/TR/emma/) representation
> of this result. The contents of this result could vary across UAs and
> recognition engines, but all implementations MUST expose a valid XML
> document complete with EMMA namespace. Implementations SHOULD expose
> the following:
> * <emma:interpretation> tag(s) populated with the interpretation
>   (e.g. emma:literal or slot values)
> * The following attributes on the <emma:interpretation> tag: id,
>   emma:process, emma:tokens, emma:medium, emma:mode.
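To make the SHOULD list above concrete, here is a sketch of the check an application or test suite might run against a UA-synthesized result. The sample document and its URIs are invented for illustration; the attribute set simply mirrors the bullets in the proposal.

```python
# Sketch: verify an EMMA document carries the interpretation tag and
# the attributes enumerated in the proposal above.
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

def check_emma(doc_text):
    """Return True if every <emma:interpretation> has an id plus the
    emma:process, emma:tokens, emma:medium, and emma:mode attributes."""
    root = ET.fromstring(doc_text)
    interps = root.findall(f".//{{{EMMA_NS}}}interpretation")
    if not interps:
        return False
    required = ["process", "tokens", "medium", "mode"]
    for interp in interps:
        if interp.get("id") is None:
            return False
        if any(interp.get(f"{{{EMMA_NS}}}{a}") is None for a in required):
            return False
    return True

synthesized = """
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1"
      emma:process="http://example.com/ua-recognizer"
      emma:tokens="hello world"
      emma:medium="acoustic" emma:mode="voice">
    <emma:literal>hello world</emma:literal>
  </emma:interpretation>
</emma:emma>
"""

print(check_emma(synthesized))
```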
>
> Thanks
>
> From: Deborah Dahl [mailto:dahl@conversational-technologies.com]
> Sent: Monday, June 11, 2012 11:29 AM
> To: 'Satish S'; Young, Milan
> Cc: 'Hans Wennborg'; olli@pettay.fi; 'Bjorn Bringert'; 'Glen Shires';
> public-speech-api@w3.org
> Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> Glen pointed out to me offline that Satish was asking about whether
> the attributes that are required for the use cases we've been
> discussing are required in EMMA 1.0. I have to admit that I've lost
> track of what use cases we're talking about, but I think at least 3 of
> them are listed in
> http://lists.w3.org/Archives/Public/public-speech-api/2012May/0037.html.
> Those use cases require "emma:process", the timestamps, and
> "emma:grammar", which are not required in EMMA 1.0. The other use case
> we might be talking about is described in
> http://lists.w3.org/Archives/Public/public-speech-api/2012Apr/0014.html,
> where an existing dialog manager or logger expects to receive
> speech recognition results as an EMMA document, in which case no
> specific attributes are required.
>
> From: Deborah Dahl [mailto:dahl@conversational-technologies.com]
> Sent: Monday, June 11, 2012 1:40 PM
> To: 'Satish S'; 'Young, Milan'
> Cc: 'Hans Wennborg'; olli@pettay.fi; 'Bjorn Bringert'; 'Glen Shires';
> public-speech-api@w3.org
> Subject: RE: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> Hi Satish,
>
> All of the EMMA attributes that have been proposed for the use cases
> we've discussed are already part of the EMMA 1.0 standard. That said,
> the Multimodal Interaction Working Group is always interested in
> receiving comments and suggestions that relate to possible new EMMA
> capabilities, which can be posted to www-multimodal@w3.org.
>
> Regards,
> Debbie
>
> From: Satish S [mailto:satish@google.com]
> Sent: Monday, June 11, 2012 12:18 PM
> To: Young, Milan
> Cc: Hans Wennborg; Deborah Dahl; olli@pettay.fi; Bjorn Bringert; Glen
> Shires; public-speech-api@w3.org
> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> If there are EMMA attributes that are mandatory for specific use
> cases, we should post to the MMI WG and get those changes into the
> EMMA recommendation published at http://www.w3.org/TR/emma/. I'm sure
> they will be interested in incorporating them, and Deborah Dahl can
> help as well since she is one of the authors.
>
> Cheers
> Satish
>
> On Mon, Jun 11, 2012 at 4:16 PM, Young, Milan <Milan.Young@nuance.com> wrote:
>
> Hello Hans,
>
> I did respond to this thread, but it got forked. The upshot is that
> we should go with my second (most recent) proposal, not my first
> proposal (that Satish supported). The reason is that the first
> proposal did not allow us to achieve the interoperability use cases
> that Deborah put forward.
>
> To address Satish's most recent argument: the likelihood of an
> application failing because the EMMA result contains an extra couple
> of attributes is small. This is because 1) most EMMA implementations
> support these attributes already; 2) we're dealing with XML, which
> abstracts low-level parsing; 3) if an application did fail, the fix
> would be trivial.
>
> Thanks
>
> -----Original Message-----
> From: Hans Wennborg [mailto:hwennborg@google.com]
> Sent: Monday, June 11, 2012 2:56 AM
> To: Deborah Dahl
> Cc: Satish S; olli@pettay.fi; Young, Milan; Bjorn Bringert; Glen
> Shires; public-speech-api@w3.org
> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
> draft posted)
>
> Do we have agreement on this?
> If there are no objections, I'll update
> the spec with the text Satish posted on the 8th (with DOMString
> substituted with Document):
>
> ----
> Addition to SpeechRecognitionResult (section 5.1)
>
> readonly attribute Document emma;
>
> And the corresponding addition to 5.1.6:
> emma - A string representation of the XML-based <link>EMMA 1.0</link>
> result. (link points to http://www.w3.org/TR/emma/)
> ----
>
> Thanks,
> Hans
>
> On Fri, Jun 8, 2012 at 2:32 PM, Deborah Dahl
> <dahl@conversational-technologies.com> wrote:
>> I agree that Document would be more useful.
>>
>> From: Satish S [mailto:satish@google.com]
>> Sent: Friday, June 08, 2012 5:18 AM
>> To: Hans Wennborg
>> Cc: olli@pettay.fi; Young, Milan; Deborah Dahl; Bjorn Bringert; Glen
>> Shires; public-speech-api@w3.org
>> Subject: Re: EMMA in Speech API (was RE: Speech API: first editor's
>> draft posted)
>>
>> Yes that is correct, it should be
>>
>> readonly attribute Document emma;
>>
>> Cheers
>> Satish
>>
>> On Fri, Jun 8, 2012 at 10:04 AM, Hans Wennborg <hwennborg@google.com> wrote:
>> On Fri, Jun 8, 2012 at 12:31 AM, Satish S <satish@google.com> wrote:
>>> In any case, looks like there is enough interest both from speech &
>>> browser vendors to have this attribute always non-null. So I'm fine
>>> making it so. I like the first proposal from Milan:
>>> ----
>>> Addition to SpeechRecognitionResult (section 5.1)
>>>
>>> readonly attribute DOMString emma;
>>>
>>> And the corresponding addition to 5.1.6:
>>> emma - A string representation of the XML-based <link>EMMA 1.0</link>
>>> result. (link points to http://www.w3.org/TR/emma/)
>>> ----
>>>
>>> This spec proposal shouldn't mandate specific fields any more than
>>> what EMMA does already, so that web apps can point to existing
>>> recognizers and get EMMA data in the same format as they would get
>>> otherwise.
>>
>> Earlier in the thread, I thought we decided that it was better to
>> make the emma attribute be of type Document rather than DOMString?
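The Document-versus-DOMString question the thread settles on can be illustrated outside the browser. Using Python's stdlib XML parser as a stand-in for the DOM (the EMMA snippet below is invented), a string-typed attribute forces every consumer to repeat the parse step that a Document-typed attribute would have done once in the UA:

```python
# Sketch: why a pre-parsed document is more convenient than a string.
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

# DOMString-style result: just text; the app must parse it first.
emma_string = (
    '<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">'
    '<emma:interpretation id="int1">'
    '<emma:literal>hello</emma:literal>'
    '</emma:interpretation></emma:emma>'
)

# Document-style result: the parse happens once (in the UA), and
# consumers can query the tree directly.
doc = ET.fromstring(emma_string)
literal = doc.find(f".//{{{EMMA_NS}}}literal").text

print(literal)
```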
Received on Tuesday, 12 June 2012 16:36:08 UTC