Synthesis and Recognition of Mathematical and Scientific Notation

Voice Browser Working Group,

Greetings. I would like to describe some topics for discussion regarding the synthesis and recognition of mathematical and scientific notation.

Vocalizations of mathematical and scientific notation often include semantically meaningful prosody. Prosodic variations can be expressed in pronunciation lexicons (PLS) by means of IPA suprasegmentals (http://en.wikipedia.org/wiki/International_Phonetic_Alphabet#Suprasegmentals). With such an approach, different variations of each Greek letter and each mathematical object can be indicated with distinct suprasegmental structure. The ability of speech recognizers to detect the suprasegmental structure of utterances is relevant to this approach as well as to others.

It occurs to me that SRGS could become more expressive with regard to prosody, with at least the expressiveness of SSML. Prosody-related topics could be salient for future editions of SSML and SRGS. Other contemporary topics include the speech synthesis of document object model nodes and the generation of XML, RDFa, and MathML during speech recognition.

For example, SSML could become more capable of articulating the difference between:

    e^x + 6
    e^(x+6)

and SRGS could become more capable of describing prosody in grammatical structure. For example:

    f(cos(x^2)) + 2ab + 5(y - 2)

Also, there are online corpora containing audio of spoken mathematics (e.g. http://ocw.mit.edu/courses/mathematics/, http://www.cosmolearning.com/mathematics/, http://freevideolectures.com/Subject/Mathematics). From such corpora, we can select audio clips of spoken mathematics, create MathML3 transcripts for those clips, and build resources for research and development on the synthesis and recognition of mathematical and scientific notation.

Kind regards,

Adam Sobieski
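P.S. As a rough illustration of the e^x + 6 versus e^(x+6) distinction mentioned above, here is a minimal Python sketch that emits two SSML documents differing only in prosody. It uses the existing SSML break and prosody elements; the helper names (speak, grouped), the spoken wording, and the specific pause lengths and rate are assumptions made for illustration, not an established convention for spoken mathematics.

```python
from xml.sax.saxutils import escape

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def speak(body: str) -> str:
    """Wrap an SSML fragment in a <speak> root element."""
    return (
        f'<speak version="1.1" xmlns="{SSML_NS}" xml:lang="en-US">'
        f"{body}</speak>"
    )


def grouped(words: str) -> str:
    """Hypothetical convention: a parenthesized group is spoken faster
    and is set off by short pauses on both sides."""
    return (
        '<break time="250ms"/>'
        f'<prosody rate="fast">{escape(words)}</prosody>'
        '<break time="250ms"/>'
    )


# e^x + 6: the exponent is only x, so a longer pause separates "plus six".
exp_then_add = speak('e to the x<break time="400ms"/> plus six')

# e^(x+6): the whole sum is the exponent, so it is spoken as one group.
exp_of_sum = speak("e to the" + grouped("x plus six"))

if __name__ == "__main__":
    print(exp_then_add)
    print(exp_of_sum)
```

The words are nearly identical in both outputs; only the placement of pauses and the rate change carry the grouping, which is the kind of information a prosody-aware SRGS grammar would need to recover on the recognition side.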

Received on Friday, 20 April 2012 14:04:58 UTC