RE: [PLS1.0] i18n comment: TTS vs. ASR in 4.5

Hi Paolo,
 
Please see the comments at http://lists.w3.org/Archives/Public/www-voice/2006AprJun/0118.html.
 
Thanks,
RI


============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/People/Ishida/
http://www.w3.org/International/
http://people.w3.org/rishida/blog/
http://www.flickr.com/photos/ishida/


 


  _____  

From: Baggia Paolo [mailto:paolo.baggia@loquendo.com] 
Sent: 26 May 2006 15:20
To: www-voice@w3.org
Cc: Baggia Paolo; Richard Ishida
Subject: Re: [PLS1.0] i18n comment: TTS vs. ASR in 4.5



Issue R103-26

Proposed Classification: Clarification / Typo / Editorial 

Resolution: Reject 

The following text appears in Section 4.5 [1]: 

"In order to remove the need for duplication of pronunciation information to cope with the above variations, the <lexeme> element may contain more than one <grapheme> element to define the base orthography and any variants which should share the pronunciations." 

We believe that there is general utility, beyond text-to-speech, for supporting multiple graphemes. To illustrate one such case, the following lexicon might be used for US English: 

 <?xml version="1.0" encoding="UTF-8"?>
 <lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
       alphabet="ipa" xml:lang="en-US">

     <lexeme>
       <grapheme>judgment</grapheme>
       <grapheme>judgement</grapheme>
       <phoneme>ˈʤʌʤ.mənt</phoneme>
     </lexeme>

     <lexeme>
       <grapheme>fiancé</grapheme>
       <grapheme>fiance</grapheme>
       <phoneme>fiˈɑ̃ːn.seɪ</phoneme>
       <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
     </lexeme>

 </lexicon>

In text-to-speech documents, as has been noted, 

 <?xml version="1.0" encoding="UTF-8"?>
 <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
   xml:lang="en-US">

     <lexicon uri="http://www.example.com/lexicon_defined_above.xml"/>

     <p> In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
       I replied that I preferred Venice and didn't think the Venetian casino was an
       acceptable compromise.</p>

 </speak>

but also in speech recognition grammars, 

 <?xml version="1.0" encoding="UTF-8"?>
 <grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
   xml:lang="en-US" root="movies">

     <lexicon uri="http://www.example.com/lexicon_defined_above.xml"/>

     <rule id="movies" scope="public">
       <one-of>
         <item>Terminator 2: Judgment Day</item>
         <item>My Big Fat Obnoxious Fiance</item>
         <item>Pluto's Judgement Day</item>
       </one-of>
     </rule>

 </grammar>
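
To make the shared lookup concrete, here is a minimal sketch of how a processor might index the lexicon above so that every spelling variant resolves to the same pronunciations. It is an illustration only: the function name build_index and the dictionary layout are assumptions, not part of PLS, and neither TTS nor ASR engines are required to work this way. The XML declaration is omitted from the embedded string purely for convenience.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the lexicon example in this message.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

LEXICON = """\
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈʤʌʤ.mənt</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɑ̃ːn.seɪ</phoneme>
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
  </lexeme>
</lexicon>
"""


def build_index(lexicon_xml):
    """Map each grapheme to the pronunciations of its lexeme (hypothetical helper)."""
    root = ET.fromstring(lexicon_xml)
    ns = {"pls": PLS_NS}
    index = {}
    for lexeme in root.findall("pls:lexeme", ns):
        # All phonemes of a lexeme are shared by all of its graphemes.
        phonemes = [p.text.strip() for p in lexeme.findall("pls:phoneme", ns)]
        for grapheme in lexeme.findall("pls:grapheme", ns):
            index.setdefault(grapheme.text.strip(), []).extend(phonemes)
    return index


index = build_index(LEXICON)
```

The same index serves both directions: a synthesizer can look up "judgement" when it meets that spelling in a document, and a recognizer can look up "Judgment" or "Fiance" (after whatever case folding it applies) when compiling grammar items.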

Because multiple graphemes are useful for both TTS and ASR, we reject your proposal to qualify the sentence with "text-to-speech" only.

Please indicate whether you are satisfied with the VBWG's resolution, whether you think there has been a misunderstanding, or whether you wish to register an objection. 

[1] http://www.w3.org/TR/2006/WD-pronunciation-lexicon-20060131/#S4.5

Paolo Baggia, editor PLS spec.

P.S. If you have trouble viewing the IPA characters, please let me know. I'll upload an HTML document and send you the URI.

From: ishida@w3.org
Date: Tue, 21 Mar 2006 17:50:11 +0000
To: www-voice@w3.org, public-i18n-core@w3.org
Message-Id: <20060321175010.01B574F400@homer.w3.org> 

 

 Comment from the i18n review of:

  http://www.w3.org/TR/2006/WD-pronunciation-lexicon-20060131/

 Comment 26

 At http://www.w3.org/International/reviews/0603-pls10/

 Editorial/substantive: E

 Owner: RI

 Location in reviewed document:

 4.5, 3rd para

 Comment:

 "In order to remove the need for duplication of pronunciation information to cope with the above variations, the <lexeme> element may"


 Here is an example of where it might be good to distinguish between TTS and ASR. You could say: "In order to remove the need for duplication of pronunciation information to cope with the above variations during text-to-speech, the <lexeme> element may contain"




Received on Wednesday, 14 June 2006 15:16:23 UTC