- From: Peter Moffatt <peter.moffatt@nortel.com>
- Date: Thu, 24 Mar 2005 16:22:54 +0100
- To: "'www-voice@w3.org'" <www-voice@w3.org>
- Message-ID: <8F20221FB47FD51190AD00508BCF36BA054718F1@znsgy0k3.europe.nortel.com>
Hi,

Apologies if I've missed something, but this WG discussion is what I was getting at in my recent post.

PLS needs to account for the ability of TTS engines to disambiguate heterosyntactic homographs using part-of-speech information, either derived from the text during the TTS process or specified in mark-up. The least disruptive way to achieve this would be to permit part-of-speech designations.

Simple example (using SAPI-style POS, SAMPA phonemes):

    <lexeme pos="noun">
      <grapheme>record</grapheme>
      <phoneme>rekO:d</phoneme>
    </lexeme>
    <lexeme pos="verb">
      <grapheme>record</grapheme>
      <phoneme>r@kO:d</phoneme>
    </lexeme>

Complex example (using Penn Treebank-style POS):

    <lexeme poslist="vb nn nnp vbp">
      <grapheme>read</grapheme>
      <phoneme>ri:d</phoneme>
    </lexeme>
    <lexeme poslist="vbn vbd">
      <grapheme>read</grapheme>
      <phoneme>red</phoneme>
    </lexeme>

Some kind of priority mechanism is still required for homosyntactic homographs; to follow the VXML precedent, document order would be the obvious way to do that.

Regards,
Peter Moffatt
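
To illustrate the selection behaviour proposed above, here is a minimal Python sketch of how a processor might resolve a pronunciation. It rests on several assumptions: the `pos` and `poslist` attributes are the ones suggested in this message and are not part of PLS, the wrapping `<lexicon>` element and the helper names `load_lexemes` and `lookup` are hypothetical, and the fallback to the first entry in document order when no part of speech matches is only one possible reading of the priority rule.

```python
import xml.etree.ElementTree as ET

# Sample lexicon built from the two examples in this message.
# The pos/poslist attributes are the proposal under discussion, not PLS.
LEXICON = """
<lexicon>
  <lexeme pos="noun"><grapheme>record</grapheme><phoneme>rekO:d</phoneme></lexeme>
  <lexeme pos="verb"><grapheme>record</grapheme><phoneme>r@kO:d</phoneme></lexeme>
  <lexeme poslist="vb nn nnp vbp"><grapheme>read</grapheme><phoneme>ri:d</phoneme></lexeme>
  <lexeme poslist="vbn vbd"><grapheme>read</grapheme><phoneme>red</phoneme></lexeme>
</lexicon>
"""


def load_lexemes(xml_text):
    """Return (grapheme, pos_tags, phoneme) tuples in document order."""
    root = ET.fromstring(xml_text)
    entries = []
    for lexeme in root.iter("lexeme"):
        # Treat pos and poslist alike: both become a set of tags.
        raw = lexeme.get("pos", "") + " " + lexeme.get("poslist", "")
        tags = set(raw.split())
        entries.append(
            (lexeme.findtext("grapheme"), tags, lexeme.findtext("phoneme"))
        )
    return entries


def lookup(entries, grapheme, pos=None):
    """Pick the first entry whose tags contain pos.

    Assumed fallback: if pos is missing or matches nothing, the first
    entry for the grapheme in document order wins.
    """
    candidates = [e for e in entries if e[0] == grapheme]
    if pos is not None:
        for _, tags, phoneme in candidates:
            if pos in tags:
                return phoneme
    return candidates[0][2] if candidates else None


entries = load_lexemes(LEXICON)
print(lookup(entries, "record", "verb"))
print(lookup(entries, "read", "vbd"))
print(lookup(entries, "read"))
```

Because the entries are kept in document order, two lexemes with the same grapheme and the same part of speech would be resolved in favour of whichever appears first, which is the document-order priority suggested above.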
Received on Thursday, 24 March 2005 15:24:06 UTC