Re: Audio Rendering of MathML from Al Gilman on 2003-04-09 (www-voice@w3.org from April to June 2003)

From: Al Gilman <asgilman@iamdigex.net>
Date: Tue, 08 Apr 2003 22:22:28 -0400
To: "Helder Ferreira" <hfilipe@fe.up.pt>, <www-math@w3.org>, <www-voice@w3.org>
Cc: <dfreitas@fe.up.pt>
Message-Id: <5.1.0.14.2.20030408111957.01e076c0@pop.iamdigex.net>

At 11:03 AM 2003-04-08, Helder Ferreira wrote:
>Greetings everyone!
>
>I'm working and studying at the Faculty of Engineering of the University of
>Porto, Portugal, and my final project consists of parsing scientific texts
>for audio rendering.
>
>Basically I'm developing a parser in Perl to convert MathML to plain text,
>which is then sent to a TTS engine.
>What I would like to know is whether something has already been done in this
>area, whether or not it uses the same technology as I do (Perl in my case).
>Does anyone have any good suggestions?
>One of the aims of my project is also to study F0 contours and other 
>specific speech parameters to tag the rendered text with additional 
>information that can be used by a TTS engine.
>Is there anyone out there working with the semantics/pronunciation of
>mathematical expressions?
>
>Will MathML in the future provide any markup tags or attributes for speech 
>tagging? Aural CSS doesn't seem enough.
>
>The MathML 2.0 Spec is not very specific about audio rendering.
>
>At the moment I'm also developing something (still at an early stage) that
>we have baptized Text Processing Markup Language - TPML, which, I hope,
>will allow the MathML parser to add additional speech synthesis
>information that can be sent to a TTS engine.
>
>I'm open to suggestions and new ideas.

Let's take these in reverse order.

TPML:  Have you reviewed SSML?

  Speech Synthesis Markup Language Version 1.0
  http://www.w3.org/TR/speech-synthesis

Audio presentation of Math:

The classic work on this topic is the AsTeR system done by T.V. Raman as his
doctoral research.

Google for "audio mathematics raman"

You will find that the other commenter's idea of targeting VoiceXML is
closer to what Raman's work shows to be desirable than transcribing to
plain text or SSML.

Equations are not very usable when read linearly, even when spoken by a live
mathematician.  They must be regarded as complex expressions rather than
linear text.  Letting the user walk the expression tree is the way for them
to be able to get their ears around the structure that is at hand.
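
To make "walking the expression tree" concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions, not how AsTeR or
any VoiceXML application actually does it: the names LABELS, describe and
Cursor are hypothetical, the input is a small Presentation MathML fragment
for (a+b)/2, and the returned strings stand in for whatever would be handed
to the TTS engine.

```python
import xml.etree.ElementTree as ET

# Sample input (an assumption for this sketch): Presentation MathML for (a+b)/2.
SAMPLE = (
    "<mfrac>"
    "<mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>"
    "<mn>2</mn>"
    "</mfrac>"
)

# Hypothetical spoken labels for a few Presentation MathML elements.
LABELS = {
    "mfrac": "fraction",
    "msup": "superscript expression",
    "msqrt": "square root",
    "mrow": "group",
}


def describe(node):
    """Return a short summary of one node, not a reading of its whole subtree."""
    tag = node.tag.split("}")[-1]  # drop an XML namespace prefix, if present
    if len(node) == 0:
        # Leaf (mi, mn, mo, ...): speak its text content.
        return (node.text or "").strip()
    return f"{LABELS.get(tag, tag)} with {len(node)} parts"


class Cursor:
    """Tracks the listener's position in the expression tree."""

    def __init__(self, root):
        self.path = [root]  # ancestors from the root down to the current node

    @property
    def node(self):
        return self.path[-1]

    def down(self):
        """Move to the first child, if there is one."""
        if len(self.node):
            self.path.append(self.node[0])
        return describe(self.node)

    def up(self):
        """Move back to the parent, unless already at the root."""
        if len(self.path) > 1:
            self.path.pop()
        return describe(self.node)

    def next(self):
        """Move to the following sibling, if there is one."""
        if len(self.path) > 1:
            siblings = list(self.path[-2])
            i = siblings.index(self.node)
            if i + 1 < len(siblings):
                self.path[-1] = siblings[i + 1]
        return describe(self.node)


if __name__ == "__main__":
    cursor = Cursor(ET.fromstring(SAMPLE))
    print(describe(cursor.node))
    for move in (cursor.down, cursor.down, cursor.next, cursor.next,
                 cursor.up, cursor.next):
        print(move())
```

The point of the design is that each move yields only a summary of the
current node, so the listener hears the overall shape first ("fraction with
2 parts") and descends into the numerator or denominator only on request.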

You should also familiarize yourself a little with the commercial solution
known as MathTalk <http://www.mathtalk.com/>.

Another good source is the EASI site.  Google for 'EASI'.

Al

>Thanks everyone.
>
>
>     Helder Ferreira
>
>
>     Laboratory for Speech Synthesis, Electroacustics, Signals and 
> Instrumentation
>     Faculty of Engineering University of Porto

Received on Tuesday, 8 April 2003 22:22:36 UTC