Re: [pls] Example multiple lexemes with same grapheme content

From: Peter Moffatt <peter.moffatt@nortel.com>
Date: Thu, 24 Mar 2005 16:22:54 +0100
Message-ID: <8F20221FB47FD51190AD00508BCF36BA054718F1@znsgy0k3.europe.nortel.com>
To: "'www-voice@w3.org'" <www-voice@w3.org>

Apologies if I've missed something, but this WG discussion is what I was
getting at in my recent post.

PLS needs to account for the ability of TTS engines to disambiguate
heterosyntactic homographs using part-of-speech information, either derived
from the text during the TTS process or specified in mark-up. The least
disruptive way to achieve this would be to permit part-of-speech
attributes on the lexeme element.

Simple example (using SAPI-style POS, SAMPA phonemes):

<lexeme pos="noun">
<lexeme pos="verb">

Complex example (using Penn Treebank-style POS):

<lexeme poslist="vb nn nnp vbp">
<lexeme poslist="vbn vbd">

Some kind of priority mechanism is still required for homosyntactic
homographs; to follow the VXML precedent, document order would be the
obvious way to do that.
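
To make the selection rule concrete, here is a minimal sketch of how an engine
might resolve a lookup under this proposal. Everything in it is an assumption
made for illustration: the sample words, the SAMPA strings, the child element
names (grapheme, phoneme), the helper names pos_tags and lookup, and the
fallback to the first entry when no part-of-speech tag matches. Only the pos
and poslist attributes and the document-order priority come from the text
above.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon. The pos/poslist attributes follow the proposal;
# the words, child elements and SAMPA strings are illustrative only.
LEXICON = """
<lexicon>
  <lexeme pos="noun"><grapheme>record</grapheme><phoneme>"rEkO:d</phoneme></lexeme>
  <lexeme pos="verb"><grapheme>record</grapheme><phoneme>rI"kO:d</phoneme></lexeme>
  <lexeme poslist="vb vbp"><grapheme>read</grapheme><phoneme>ri:d</phoneme></lexeme>
  <lexeme poslist="vbn vbd"><grapheme>read</grapheme><phoneme>rEd</phoneme></lexeme>
  <lexeme pos="noun"><grapheme>bass</grapheme><phoneme>b{s</phoneme></lexeme>
  <lexeme pos="noun"><grapheme>bass</grapheme><phoneme>beIs</phoneme></lexeme>
</lexicon>
"""


def pos_tags(lexeme):
    """Collect the tags from pos and poslist; an empty set means 'any POS'."""
    tags = lexeme.get("poslist", "").split()
    if lexeme.get("pos"):
        tags.append(lexeme.get("pos"))
    return {tag.lower() for tag in tags}


def lookup(root, word, pos=None):
    """Return the first pronunciation, in document order, matching word and pos.

    Assumption: if no entry carries the requested tag, fall back to the
    first entry for the word rather than failing.
    """
    fallback = None
    for lexeme in root.iter("lexeme"):  # iter() walks in document order
        if lexeme.findtext("grapheme") != word:
            continue
        phoneme = lexeme.findtext("phoneme")
        if fallback is None:
            fallback = phoneme
        tags = pos_tags(lexeme)
        if pos is None or not tags or pos.lower() in tags:
            return phoneme
    return fallback


root = ET.fromstring(LEXICON)
print(lookup(root, "record", "verb"))  # heterosyntactic: chosen by POS
print(lookup(root, "bass", "noun"))    # homosyntactic: first entry wins
```

The two "bass" entries share a part of speech, so the tag cannot separate them
and only document order decides; that is the case the priority mechanism is
needed for.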

Peter Moffatt
Received on Thursday, 24 March 2005 15:24:06 UTC
