- From: Baggia Paolo <paolo.baggia@loquendo.com>
- Date: Fri, 28 Jul 2006 14:03:20 +0200
- To: <www-voice@w3.org>, "Al Gilman" <Alfred.S.Gilman@IEEE.org>
- Cc: "Baggia Paolo" <paolo.baggia@loquendo.com>
Issue R102
Proposed Classification: Feature Request
Resolution: Accepted
---
Dear Al Gilman,
Thank you for your insightful comments. The subject of pronunciation
selection has been raised by several reviewers. The PLS team has
given the topic considerable attention, and we agree that adding
a selection mechanism would greatly improve the specification.
Returning to the homograph issue, the English language provides
different pronunciations for 'read' depending on whether the word is
used in the present or past tense. This might argue for part of
speech as a determinant. On the other hand, a term such as 'Lima'
(bean or city in Peru) or 'bass' (fish or musical instrument) may
require a different determinant. Rather than enumerating the possible
determinants, which we believe to be extremely difficult if not
impossible, we have adopted an approach that supports standard
selection mechanisms and also allows for easy extension.
Your comment offered two different mechanisms: a QName-based selection
process or an XPath-based one. We examined both and were
persuaded by the power and simplicity of the QName approach.
The PLS specification will be updated to allow a 'role' attribute on
the <lexeme> element. The attribute will take a list of QNames.
Languages that use PLS, such as SRGS and SSML, will then be able to
select which role is desired.
An example may be helpful. Looking at the 'read' vs. 'read' case
in SSML:
<?xml version="1.0"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:claws="http://www.example.com/claws7tags" xml:lang="en">
  <voice gender="male" age="3">
    Can you <token role="claws:vvi">read</token> this book to me?
  </voice>
  <voice gender="male" age="35">
    I've already <token role="claws:vvn">read</token> it three times!
  </voice>
</speak>
Here the role values are part-of-speech tags from the CLAWS7 tagset [1]:
'vvi' marks the infinitive and 'vvn' the past participle of a lexical verb.
The corresponding lexicon might look like this:
<?xml version="1.0"?>
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xmlns:claws="http://www.example.com/claws7tags" alphabet="ipa"
         xml:lang="en-US">
  <lexeme role="claws:vvi">
    <grapheme>read</grapheme>
    <phoneme>rid</phoneme>
  </lexeme>
  <lexeme role="claws:vvn">
    <grapheme>read</grapheme>
    <phoneme>rɛd</phoneme>
  </lexeme>
</lexicon>
Accepting a list of QNames lets a lexicon entry carry tags from several
tagsets at once (e.g. CLAWS7, CLAWS5, SEC, ...) when required.
This addresses the pronunciation tagging half of the problem. The other
half, selecting a tagged pronunciation, will require changes to SSML and
to SRGS. Requirements for the SSML 1.1 specification are already being
collected and are expected to include a new <token> element on which a
similar 'role' attribute might be specified.
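To make the intended matching concrete, here is a minimal sketch in Python of how a processor might pick a pronunciation by role. It is an illustration only, not part of the PLS specification: the function names load_lexicon and pronounce are hypothetical, and the sketch assumes that QName prefixes are declared once on the root element (it does not track namespace scoping) and that an unprefixed role has no namespace. Each role is expanded to a (namespace URI, local name) pair so that matching does not depend on the prefix an author happens to choose.

```python
import io
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
CLAWS_NS = "http://www.example.com/claws7tags"

# The example lexicon from this message.
LEXICON = """<?xml version="1.0"?>
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xmlns:claws="http://www.example.com/claws7tags" alphabet="ipa"
         xml:lang="en-US">
  <lexeme role="claws:vvi">
    <grapheme>read</grapheme>
    <phoneme>rid</phoneme>
  </lexeme>
  <lexeme role="claws:vvn">
    <grapheme>read</grapheme>
    <phoneme>rɛd</phoneme>
  </lexeme>
</lexicon>"""


def load_lexicon(xml_text):
    """Return a list of (grapheme, roles, phoneme) tuples.

    roles is a set of (namespace URI, local name) pairs expanded from the
    QNames in the 'role' attribute. Simplification (an assumption of this
    sketch): prefix declarations are collected globally, not per scope.
    """
    prefixes = {}
    entries = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "end")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif payload.tag == f"{{{PLS_NS}}}lexeme":
            roles = set()
            for qname in payload.get("role", "").split():
                prefix, sep, local = qname.partition(":")
                if sep:
                    roles.add((prefixes.get(prefix), local))
                else:
                    # Unprefixed role: treated as having no namespace here.
                    roles.add((None, qname))
            grapheme = payload.findtext(f"{{{PLS_NS}}}grapheme")
            phoneme = payload.findtext(f"{{{PLS_NS}}}phoneme")
            entries.append((grapheme, roles, phoneme))
    return entries


def pronounce(entries, grapheme, role):
    """Return the first phoneme whose lexeme matches grapheme and role, else None."""
    for entry_grapheme, roles, phoneme in entries:
        if entry_grapheme == grapheme and role in roles:
            return phoneme
    return None


entries = load_lexicon(LEXICON)
print(pronounce(entries, "read", (CLAWS_NS, "vvi")))
print(pronounce(entries, "read", (CLAWS_NS, "vvn")))
```

Because 'role' holds a list, a single <lexeme> written as role="claws:vvn other:tag" would match a request for either tag under this scheme; 'other' stands for any second tagset prefix and is purely illustrative.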
[1] http://www.comp.lancs.ac.uk/ucrel/claws7tags.html
Please indicate whether you are satisfied with the VBWG's resolution,
whether you think there has been a misunderstanding, or whether you
wish to register an objection.
I apologize for the long review time,
Paolo Baggia, editor PLS spec.
Received on Friday, 28 July 2006 12:03:41 UTC