- From: Baggia Paolo <paolo.baggia@loquendo.com>
- Date: Fri, 28 Jul 2006 14:03:20 +0200
- To: <www-voice@w3.org>, "Al Gilman" <Alfred.S.Gilman@IEEE.org>
- Cc: "Baggia Paolo" <paolo.baggia@loquendo.com>
Issue R102

Proposed Classification: Feature Request
Resolution: Accepted

---

Dear Al Gilman,

Thank you for your insightful comments. The subject of pronunciation selection has been mentioned by several reviewers. The PLS team has given considerable attention to the topic, and we agree that adding a selection mechanism greatly improves the specification.

Returning to the homograph issue, the English language provides different pronunciations for 'read' depending on whether the word is used in the present or past tense. This might argue for part of speech as a determinant. On the other hand, a term such as 'Lima' (bean or city in Peru) or 'bass' (fish or musical instrument) may require a different determinant. Instead of enumerating the options, which we believe to be extremely difficult if not impossible, we have adopted an alternative that permits standard selection mechanisms and also allows for easy extensions.

Your comment offered two different mechanisms: a QName-based selection process or an XPath-based one. We examined each and have been persuaded by the power and simplicity of the QName approach.

The PLS specification will be updated to allow a 'role' attribute on the <lexeme> element. This will take a list of QNames. Users of PLS such as SRGS and SSML will then be able to select which role is desired.

An example may be helpful. Looking at the 'read' vs. 'read' case in SSML:

    <?xml version="1.0"?>
    <speak version="1.1"
           xmlns="http://www.w3.org/2001/10/synthesis"
           xmlns:claws="http://www.example.com/claws7tags"
           xml:lang="en">
      <voice gender="male" age="3">
        Can you <token role="claws:vvi">read</token> this book to me?
      </voice>
      <voice gender="male" age="35">
        I've already <token role="claws:vvn">read</token> it three times!
      </voice>
    </speak>

Here the part of speech from the CLAWS tagger is used [1].
The corresponding lexicon might look like:

    <?xml version="1.0"?>
    <lexicon version="1.0"
             xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
             xmlns:claws="http://www.example.com/claws7tags"
             alphabet="ipa" xml:lang="en-US">
      <lexeme role="claws:vvi">
        <grapheme>read</grapheme>
        <phoneme>rid</phoneme>
      </lexeme>
      <lexeme role="claws:vvn">
        <grapheme>read</grapheme>
        <phoneme>rɛd</phoneme>
      </lexeme>
    </lexicon>

Allowing a list of QNames in PLS lets lexicon entries be marked with multiple tags (e.g. CLAWS7 vs CLAWS5 vs SEC vs ...) when required.

This addresses the issue of pronunciation tagging. Solving the other half of the problem will require changes to SSML and to SRGS. The requirements for the SSML 1.1 specification are already being collected and are expected to include a new <token> element on which a similar 'role' attribute might be specified.

[1] http://www.comp.lancs.ac.uk/ucrel/claws7tags.html

Please indicate whether you are satisfied with the VBWG's resolution, whether you think there has been a misunderstanding, or whether you wish to register an objection.

I apologize for the long review time,

Paolo Baggia, editor PLS spec.
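To make the proposed selection step concrete, the following is a minimal Python sketch of how a PLS consumer might pick a pronunciation by role. It is an illustration under stated assumptions, not part of the resolution: the function names (`expand_qname`, `load_lexicon`, `select_pronunciation`) are hypothetical, roles are compared as expanded (namespace URI, local name) pairs, prefix declarations are treated as document-wide rather than properly scoped, and only the first <phoneme> of each <lexeme> is kept.

```python
import io
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
CLAWS = "http://www.example.com/claws7tags"

# The lexicon from the message above, embedded so the sketch is self-contained.
LEXICON = """<?xml version="1.0"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:claws="http://www.example.com/claws7tags"
    alphabet="ipa" xml:lang="en-US">
  <lexeme role="claws:vvi">
    <grapheme>read</grapheme>
    <phoneme>rid</phoneme>
  </lexeme>
  <lexeme role="claws:vvn">
    <grapheme>read</grapheme>
    <phoneme>rɛd</phoneme>
  </lexeme>
</lexicon>"""


def expand_qname(qname, prefixes):
    """Turn 'prefix:local' into (namespace URI, local name).

    Assumption: an unprefixed name resolves to the default namespace.
    An undeclared prefix raises KeyError.
    """
    prefix, _, local = qname.rpartition(":")
    return (prefixes[prefix], local)


def load_lexicon(xml_text):
    """Return a list of (graphemes, roles, phoneme) tuples, one per <lexeme>."""
    prefixes = {}  # simplification: declarations are not scoped per element
    entries = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "end")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif payload.tag == f"{{{PLS_NS}}}lexeme":
            # 'role' holds a whitespace-separated list of QNames.
            roles = {expand_qname(q, prefixes)
                     for q in payload.get("role", "").split()}
            graphemes = {g.text for g in payload.findall(f"{{{PLS_NS}}}grapheme")}
            phoneme = payload.findtext(f"{{{PLS_NS}}}phoneme")
            entries.append((graphemes, roles, phoneme))
    return entries


def select_pronunciation(entries, grapheme, role):
    """Return the phoneme of the first entry matching grapheme and role."""
    for graphemes, roles, phoneme in entries:
        if grapheme in graphemes and role in roles:
            return phoneme
    return None


if __name__ == "__main__":
    entries = load_lexicon(LEXICON)
    print(select_pronunciation(entries, "read", (CLAWS, "vvi")))
    print(select_pronunciation(entries, "read", (CLAWS, "vvn")))
```

Comparing expanded names rather than raw 'claws:vvi' strings means the SSML document and the lexicon could bind different prefixes to the same tag set and still match; whether the final specification requires this is not stated in this message.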
Received on Friday, 28 July 2006 12:03:41 UTC