Re: R27. Grammars, TTS, media composition, and recognition results should all use standard formats

I haven't used EMMA, but it looks like it could be a bit complex for a
script to simply get the top utterance or interpretation out. Are
there any shorthands or DOM methods for this? Any Hello World examples
to show the basic usage?
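
To make the question concrete, here is a rough sketch of what I'd guess a
script has to do to get the top result out. It assumes the recognizer returns
an emma:one-of with emma:confidence and emma:tokens on each
emma:interpretation. The sample document, the origin/destination slots, and
the top_interpretation helper are all made up for illustration, and it is in
Python only to show the shape of the traversal, not a proposed API.

```python
import xml.etree.ElementTree as ET

# EMMA 1.0 namespace URI.
EMMA_NS = "http://www.w3.org/2003/04/emma"

# Hypothetical recognition result; the slot elements are invented.
SAMPLE = """\
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:one-of id="r1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1" emma:confidence="0.75"
        emma:tokens="flights from boston to denver">
      <origin>Boston</origin>
      <destination>Denver</destination>
    </emma:interpretation>
    <emma:interpretation id="int2" emma:confidence="0.68"
        emma:tokens="flights from austin to denver">
      <origin>Austin</origin>
      <destination>Denver</destination>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>
"""


def top_interpretation(emma_xml):
    """Return (confidence, tokens, slots) for the highest-confidence
    interpretation, or None if the document contains none."""
    root = ET.fromstring(emma_xml)
    best = None
    for interp in root.iter("{%s}interpretation" % EMMA_NS):
        # A missing emma:confidence is treated as 0 (an assumption).
        conf = float(interp.get("{%s}confidence" % EMMA_NS, "0"))
        if best is None or conf > best[0]:
            tokens = interp.get("{%s}tokens" % EMMA_NS, "")
            # Application-specific child elements become a flat dict.
            slots = {child.tag: (child.text or "").strip() for child in interp}
            best = (conf, tokens, slots)
    return best


if __name__ == "__main__":
    print(top_interpretation(SAMPLE))
```

Even in this simple case the script has to deal with namespaces and walk the
n-best list itself, which is why I'm wondering about a shorthand.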

/Bjorn

On Mon, Oct 25, 2010 at 1:38 PM, Dan Burnett <dburnett@voxeo.com> wrote:
> +1
> On Oct 22, 2010, at 2:57 PM, Michael Bodell wrote:
>
>> I agree that SRGS, SISR, EMMA, and SSML seem like the obvious W3C
>> standard formats that we should use.
>>
>> -----Original Message-----
>> From: public-xg-htmlspeech-request@w3.org
>> [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Deborah Dahl
>> Sent: Friday, October 22, 2010 6:39 AM
>> To: 'Bjorn Bringert'; 'Dan Burnett'
>> Cc: public-xg-htmlspeech@w3.org
>> Subject: RE: R27. Grammars, TTS, media composition, and recognition
>> results should all use standard formats
>>
>> For recognition results, EMMA http://www.w3.org/TR/2009/REC-emma-20090210/
>> is a much more recent and more complete standard than NLSML. EMMA has a very
>> rich set of capabilities, but most of them are optional, so that using it
>> doesn't have to be complex. Quite a few recognizers support it. I think one
>> of the most valuable aspects of EMMA is that as applications eventually
>> start finding that they need more and more information about the recognition
>> result, much of that more advanced information has already been worked out
>> and standardized in EMMA.
>>
>>> -----Original Message-----
>>> From: public-xg-htmlspeech-request@w3.org
>>> [mailto:public-xg-htmlspeech-request@w3.org] On Behalf Of Bjorn
>>> Bringert
>>> Sent: Friday, October 22, 2010 7:01 AM
>>> To: Dan Burnett
>>> Cc: public-xg-htmlspeech@w3.org
>>> Subject: Re: R27. Grammars, TTS, media composition, and recognition
>>> results should all use standard formats
>>>
>>> For grammars, SRGS + SISR seems like the obvious choice.
>>>
>>> For TTS, SSML seems like the obvious choice.
>>>
>>> I'm not exactly sure what is meant by media composition here. Is it using
>>> TTS output together with other media? Is there a use case for this?
>>> And is there anything we need to specify here at all?
>>>
>>> For recognition results, there is NLSML, but as far as I can tell,
>>> that hasn't been widely adopted. Also, it seems like it could be a bit
>>> complex for web applications to process.
>>>
>>> /Bjorn
>>>
>>> On Fri, Oct 22, 2010 at 1:06 AM, Dan Burnett <dburnett@voxeo.com> wrote:
>>>>
>>>> Group,
>>>>
>>>> This is the second of the requirements to discuss and prioritize
>>>> based on our ranking approach [1].
>>>>
>>>> This email is the beginning of a thread for questions, discussion,
>>>> and opinions regarding our first draft of Requirement 27 [2].
>>>>
>>>> After our discussion and any modifications to the requirement, our
>>>> goal is to prioritize this requirement as either "Should Address" or
>>>> "For Future Consideration".
>>>>
>>>> -- dan
>>>>
>>>> [1]
>>>> http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Oct/0024.html
>>>>
>>>> [2]
>>>> http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Oct/att-0001/speech.html#r27
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Bjorn Bringert
>>> Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
>>> Palace Road, London, SW1W 9TQ Registered in England Number: 3977902
>>
>>
>>
>>
>
>



-- 
Bjorn Bringert
Google UK Limited, Registered Office: Belgrave House, 76 Buckingham
Palace Road, London, SW1W 9TQ
Registered in England Number: 3977902

Received on Monday, 25 October 2010 12:43:03 UTC