- From: Glen Shires <gshires@google.com>
- Date: Fri, 7 Sep 2012 14:33:29 -0700
- To: Adam Sobieski <adamsobieski@hotmail.com>
- Cc: "public-speech-api@w3.org" <public-speech-api@w3.org>
- Message-ID: <CAEE5bcijfvzADjBjB9uRCvcxngo-0udJ-P4qv+_xN0mFO457Cw@mail.gmail.com>
If I'm understanding correctly, this is possible with the current spec.
JavaScript code (or for that matter, webserver-side code), can extract
metadata and resources and insert them into the SRGS and SSML. Such code
would provide a highly flexible, powerful and customizable solution.

/Glen Shires

On Fri, Sep 7, 2012 at 2:19 PM, Adam Sobieski <adamsobieski@hotmail.com> wrote:

> Speech API Community Group,
>
> Greetings. I have some ideas for the JavaScript Speech API pertaining to
> lexical context, speech recognition and synthesis.
>
> One idea pertains to the use of <meta> and <link> elements in HTML5
> documents to indicate metadata and external resources of use to speech
> synthesis and recognition components, for example pronunciation lexicons.
> Presently, <lexicon> elements can be indicated in SRGS and SSML.
>
> An example usage scenario is a multimedia forum where users can upload
> video content and transcripts or have recipient computers generate such
> transcripts. IPA pronunciations as well as pronunciations from other
> alphabets can be processed from audio. For interrelated documents, such as
> documents in discussion threads, for example scientific discussions with
> technical terminology, lexical context data can enhance speech recognition
> and synthesis.
>
> In addition to the aforementioned use of <meta> and <link> elements in
> HTML5, such data can also be indicated in document XML. The EPUB3 format,
> for example, includes XML attributes for pronunciation. An API topic
> includes a means of passing a DOMElement to an interface function for
> obtaining such lexical data from XML.
>
> Another API topic is some sort of Lexicon API so that lexicon data can be
> indicated programmatically. While <lexicon> elements can be indicated in
> SRGS and SSML, the use of <meta> and <link> and a Lexicon API could enhance
> contextual speech synthesis, recognition and dictation.
>
> Kind regards,
>
> Adam Sobieski
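
The approach described in the reply (page script extracts lexicon references and inserts them into SSML) might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the choice of rel="pronunciation" as the link relation is an assumption (neither message names one), and nothing here guarantees that a given synthesis engine will honor <lexicon>. In a browser, the `links` array could be gathered from the page's <link> elements; here it is passed in as plain objects so the sketch runs anywhere.

```javascript
// Hypothetical helper: escape a string for use inside an XML attribute value.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: escape a string for use as XML text content.
function escapeXmlText(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical function: build an SSML 1.0 document whose <lexicon> elements
// are taken from link descriptors of the form { rel, href, type }.
// Assumption: lexicons are marked with rel="pronunciation".
function buildSsmlWithLexicons(text, links, lang) {
  const lexicons = links
    .filter((link) =>
      (link.rel || "").split(/\s+/).includes("pronunciation") && link.href)
    .map((link) => {
      const type = link.type ? ` type="${escapeXmlAttr(link.type)}"` : "";
      return `  <lexicon uri="${escapeXmlAttr(link.href)}"${type}/>`;
    });

  return [
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ` +
      `xml:lang="${escapeXmlAttr(lang)}">`,
    ...lexicons,
    `  ${escapeXmlText(text)}`,
    "</speak>",
  ].join("\n");
}
```

A caller would pass the text to be spoken, the collected link descriptors and a language tag, then hand the resulting SSML string to whatever synthesis interface is in use. The same pattern would apply to inserting <lexicon> elements into an SRGS grammar.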
Received on Friday, 7 September 2012 21:34:40 UTC