Re: comment for PLS Last Call (R102)

Issue R102

Proposed Classification: Feature Request

Resolution: Accepted

---
Dear Al Gilman,

Thank you for your insightful comments.  The subject of pronunciation
selection has been raised by several reviewers. The PLS team has
given the topic considerable attention, and we agree that adding
a selection mechanism greatly improves the specification.

Returning to the homograph issue, the English language provides 
different pronunciations for 'read' depending on whether the word is
used in the present or past tense.  This might argue for part of 
speech as a determinant.  On the other hand, a term such as 'Lima' 
(bean or city in Peru) or 'bass' (fish or musical instrument) may 
require a different determinant.  Instead of enumerating the options,
which we believe to be extremely difficult if not impossible, we have
adopted an alternative that supports standard selection mechanisms
and allows for easy extension.

Your comment offered two different mechanisms: a QName-based selection
process or an XPath-based one.  We examined each and have been
persuaded by the power and simplicity of the QName approach.  
The PLS specification will be updated to allow a 'role' attribute on 
the <lexeme> element. This will take a list of QNames. Users of PLS such
as SRGS and SSML will then be able to select which role is desired.

An example may be helpful.  Looking at the 'read' vs. 'read' case 
in SSML:

<?xml version="1.0"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:claws="http://www.example.com/claws7tags" xml:lang="en">
   <voice gender="male" age="3">
      Can you <token role="claws:vvi">read</token> this book to me?</voice>
   <voice gender="male" age="35">
      I've already <token role="claws:vvn">read</token> it three times!</voice>
</speak>

Here the part-of-speech tags come from the CLAWS7 tag set [1]: 'vvi'
marks an infinitive and 'vvn' a past participle.
The corresponding lexicon might look like:

<?xml version="1.0"?>
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:claws="http://www.example.com/claws7tags" alphabet="ipa"
      xml:lang="en-US">
   <lexeme role="claws:vvi">
      <grapheme>read</grapheme>
      <phoneme>rid</phoneme>
   </lexeme>
   <lexeme role="claws:vvn">
      <grapheme>read</grapheme>
      <phoneme>rɛd</phoneme>
   </lexeme>
</lexicon>

Accepting a list of QNames in PLS lets a lexicon entry be marked with
tags from several tag sets (e.g. CLAWS7, CLAWS5, SEC, ...) when required.
This addresses the issue of pronunciation tagging.  Solving the other half
of the problem, selecting the desired role, will require changes to SSML
and to SRGS.  Requirements for the SSML 1.1 specification are already being
collected and are expected to include a new <token> element on which a
similar 'role' attribute might be specified.
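
As a rough sketch of such multiple tagging, each entry for 'read' could
carry a tag from two tag sets at once.  The 'c5' prefix and its namespace
URI are invented for this illustration, the CLAWS5 tags are assumed to use
the same names as their CLAWS7 counterparts, and the whitespace-separated
list syntax is an assumption rather than settled specification text:

<?xml version="1.0"?>
<!-- Illustrative sketch only: the c5 namespace URI is hypothetical and the
     whitespace-separated role list is an assumed syntax. -->
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:claws="http://www.example.com/claws7tags"
      xmlns:c5="http://www.example.com/claws5tags" alphabet="ipa"
      xml:lang="en-US">
   <!-- Infinitive: selected by either tag set's infinitive role -->
   <lexeme role="claws:vvi c5:vvi">
      <grapheme>read</grapheme>
      <phoneme>rid</phoneme>
   </lexeme>
   <!-- Past participle: selected by either tag set's participle role -->
   <lexeme role="claws:vvn c5:vvn">
      <grapheme>read</grapheme>
      <phoneme>rɛd</phoneme>
   </lexeme>
</lexicon>

A document that tags its text with either tag set could then select the
same entry without the lexicon having to be duplicated for each tag set.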

[1] http://www.comp.lancs.ac.uk/ucrel/claws7tags.html

Please indicate whether you are satisfied with the VBWG's resolution, 
whether you think there has been a misunderstanding, or whether you 
wish to register an objection.

I apologize for the long review time,

Paolo Baggia, editor PLS spec.




Received on Friday, 28 July 2006 12:03:41 UTC