Re: Lexical Context, Speech Recognition and Synthesis from Glen Shires on 2012-09-07 (public-speech-api@w3.org from September 2012)

From: Glen Shires <gshires@google.com>
Date: Fri, 7 Sep 2012 14:33:29 -0700
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: "public-speech-api@w3.org" <public-speech-api@w3.org>
Message-ID: <CAEE5bcijfvzADjBjB9uRCvcxngo-0udJ-P4qv+_xN0mFO457Cw@mail.gmail.com>

If I'm understanding correctly, this is possible with the current spec.
JavaScript code (or, for that matter, server-side code) can extract
metadata and resources and insert them into the SRGS and SSML.
Such code would provide a highly flexible, powerful and customizable
solution.
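
As a rough sketch of what such code might look like (not anything defined by
the spec): the rel value "pronunciation-lexicon" and the function names
collectLexiconUris and buildSsml below are assumptions made up for
illustration. The only standard pieces are the DOM querySelectorAll and
getAttribute calls and the SSML 1.0 <speak> and <lexicon uri="..."> elements.

```javascript
// Hypothetical convention: pages declare lexicons with
// <link rel="pronunciation-lexicon" href="...">. This rel value is an
// assumption for this sketch; it is not defined by HTML5 or the Speech API.
const LEXICON_REL = "pronunciation-lexicon";

// Escape characters that are significant in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Collect lexicon URIs from anything that offers querySelectorAll,
// typically the page's `document`. Links without an href are skipped.
function collectLexiconUris(root) {
  const links = root.querySelectorAll('link[rel="' + LEXICON_REL + '"]');
  return Array.from(links, (link) => link.getAttribute("href")).filter(Boolean);
}

// Build an SSML 1.0 document whose <lexicon> elements precede the text.
function buildSsml(text, lexiconUris, lang = "en-US") {
  const lexicons = lexiconUris
    .map((uri) => '  <lexicon uri="' + escapeXml(uri) + '"/>')
    .join("\n");
  return [
    '<?xml version="1.0"?>',
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="' +
      escapeXml(lang) + '">',
    lexicons,
    "  " + escapeXml(text),
    "</speak>",
  ]
    .filter((line) => line !== "") // drop the lexicon slot when there are none
    .join("\n");
}
```

In a page, buildSsml("Hello", collectLexiconUris(document)) would yield a
string that could then be handed to whatever synthesis call accepts SSML. The
same approach would apply to inserting <lexicon> elements into an SRGS grammar.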

/Glen Shires

On Fri, Sep 7, 2012 at 2:19 PM, Adam Sobieski <adamsobieski@hotmail.com> wrote:

> Speech API Community Group,
>
> Greetings. I have some ideas for the JavaScript Speech API pertaining to
> lexical context, speech recognition and synthesis.
>
> One idea pertains to the use of <meta> and <link> elements in HTML5
> documents to indicate metadata and external resources of use to speech
> synthesis and recognition components, for example pronunciation lexicons.
> Presently, <lexicon> elements can be indicated in SRGS and SSML.
>
> An example usage scenario is a multimedia forum where users can upload
> video content and transcripts or have recipient computers generate such
> transcripts. IPA pronunciations as well as pronunciations from other
> alphabets can be processed from audio. For interrelated documents, such as
> documents in discussion threads, for example scientific discussions with
> technical terminology, lexical context data can enhance speech recognition
> and synthesis.
>
> In addition to the aforementioned use of <meta> and <link> elements in
> HTML5, such data can also be indicated in document XML. The EPUB3 format,
> for example, includes XML attributes for pronunciation. One API topic is
> a means of passing a DOM Element to an interface function that obtains
> such lexical data from XML.
>
> Another API topic is some sort of Lexicon API so that lexicon data can be
> indicated programmatically. While <lexicon> elements can be indicated in
> SRGS and SSML, the use of <meta> and <link> and a Lexicon API could enhance
> contextual speech synthesis, recognition and dictation.
>
>
>
> Kind regards,
>
> Adam Sobieski
>
>

Received on Friday, 7 September 2012 21:34:40 UTC