Summary of the Second Workshop on Internationalizing SSML

On 30-31 May the Voice Browser Working Group held the second workshop
on internationalizing SSML in Crete, Greece.

The minutes of the workshop are available on the W3C Web server:
http://www.w3.org/2006/02/SSML/minutes.html

There were more than 20 attendees from a range of countries, including
Finland, France, Greece, Hungary, India, Italy, Japan, Poland, Russia,
Slovenia, Syria, the UK, and the US.

Motivation for internationalizing SSML includes:
* It is estimated that within 3 years the World Wide Web will contain
  significantly more content from currently under-represented languages.
* There is great need for SSML to work for languages beyond those
  supported by the current version (SSML 1.0).
* Some languages, such as Chinese or the Indian languages, are
  difficult to input via a telephone keypad.
* Many other languages would also benefit from a new
  "international" version of SSML, and it would help spread the
  Web to places where it is not so readily accessible.

We discussed many issues in the Workshop and gained a range of insights
into internationalizing SSML. Topics in the Workshop included:
* Prosody (Tone, Stress, Duration, ...)
* Multiple languages (Dialects, Loan words, ...)
* Speaking styles (Deletion of schwa & diacritics, Inflection, ...)
* Tokens (Syllables, Words, ...)
* Phonemic/Phonetic alphabet (IPA, ...)
* Disambiguation of homographs (POS, Date, Digits, ...)
* Preprocessing (Text analysis, Prosody prediction, ...)
* Other extensions (interpret-as, Emotion, Cooperation with PLS, ...)

The major "takeaways" are:
* Prosodic control in Middle Eastern and Russian languages must also
  be represented in some phonetic alphabets.
* There is a need to use the <token> element as the basic "word" unit.
  It was also suggested that several levels of <token> might be
  required.
* There is still strong interest in using POS in SSML among researchers
  and experts in Middle Eastern and East European languages.
* There are well-organized mechanisms for Middle Eastern and East
  European TTS, for languages such as Arabic and Hungarian, that
  should be considered for incorporation into SSML 1.1.
* The many Indian languages and dialects are written using different
  scripts. There are some attempts to capture the Indian oral
  tradition through a phonemic root base, such as the 'InPho' scheme.
* The meaning of 'xml:lang': it should not be used to select the voice,
  but only to indicate the language of the text.
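
To illustrate the last point, here is a minimal sketch (not a Working
Group decision) of a document in which 'xml:lang' only labels the
language of the text, while the voice is requested separately through
the 'name' attribute of <voice>. It assumes SSML 1.0 element names and
namespace; the function name build_ssml and the voice name
"example-voice" are hypothetical. Note that an SSML 1.0 processor may
still change voice when the language changes; the sketch shows only the
authoring intent.

```python
import xml.etree.ElementTree as ET

# Namespaces: SSML 1.0 and the reserved XML namespace (for xml:lang).
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_LANG = f"{{{XML_NS}}}lang"


def build_ssml(text_en, text_fr, voice_name):
    """Build an SSML string where xml:lang marks language only.

    The voice is requested by name (a hypothetical value supplied by
    the caller), never through xml:lang on the <voice> element.
    """
    ET.register_namespace("", SSML_NS)  # serialize SSML as the default namespace
    speak = ET.Element(
        f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: "en-US"}
    )
    # Voice selection: by name only, no xml:lang here.
    voice = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice_name})
    # First sentence inherits the document language (en-US).
    s_en = ET.SubElement(voice, f"{{{SSML_NS}}}s")
    s_en.text = text_en
    # Second sentence is labelled as French text; the voice is unchanged.
    s_fr = ET.SubElement(voice, f"{{{SSML_NS}}}s", {XML_LANG: "fr-FR"})
    s_fr.text = text_fr
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("The French word for hello is:", "Bonjour.", "example-voice"))
```

The language labels and the voice request live on different elements
here, so changing one does not require touching the other.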

Some of the discussion in the Workshop also suggested a need for
further consideration of:
* The trade-off between markup and adding vowels or other orthographic
  changes to the text in some languages to resolve ambiguities.
* Potential best practices where finely tuned templates have
  information inserted from databases at run time, e.g. times, dates,
  numbers, etc.

We plan to continue work on the next version of SSML at the second f2f
meeting in Hong Kong in July. The charter for the new SSML and a
concrete timeline for completing SSML 1.1 should be decided at that
meeting.

  Jim Larson and Kazuyuki Ashimura, Workshop Co-chairs

Received on Friday, 7 July 2006 21:23:46 UTC