
Re: [PLS1.0] i18n comment: TTS vs. ASR in 4.5

From: Baggia Paolo <paolo.baggia@loquendo.com>
Date: Fri, 26 May 2006 16:19:47 +0200
Message-ID: <F534D6940BB4C447874590AC0B2955715429BD@PTPEVS106BA020.idc.cww.telecomitalia.it>
To: <www-voice@w3.org>
Cc: "Baggia Paolo" <paolo.baggia@loquendo.com>, "Richard Ishida" <ishida@w3.org>
Issue R103-26
Proposed Classification: Clarification / Typo / Editorial 
Resolution: Reject 
The following text appears in Section 4.5 [1]: 
"In order to remove the need for duplication of pronunciation information to cope with the above variations, the <lexeme> element may contain more than one <grapheme> element to define the base orthography and any variants which should share the pronunciations." 
We believe that there is general utility, beyond text-to-speech, for supporting multiple graphemes. To illustrate one such case, the following lexicon might be used for US English: 
	<?xml version="1.0" encoding="UTF-8"?>

	<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
	      alphabet="ipa" xml:lang="en-US">
	    
	    <lexeme>
	      <grapheme>judgment</grapheme>
	      <grapheme>judgement</grapheme>
	      <phoneme>ˈʤʌʤ.mənt</phoneme>
	    </lexeme>
	    <lexeme>
	      <grapheme>fiancé</grapheme>
	      <grapheme>fiance</grapheme>
	      <phoneme>fiˈɑ̃ːn.seɪ</phoneme>
	      <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
	    </lexeme>
	</lexicon>
It could be referenced in text-to-speech documents, as has been noted,

	<?xml version="1.0" encoding="UTF-8"?>

	<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
	  xml:lang="en-US">
	    
	    <lexicon uri="http://www.example.com/lexicon_defined_above.xml"/> 
	    
	    <p> In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
	      I replied that I preferred Venice and didn't think the Venetian casino was an
	      acceptable compromise.</p>
	</speak>
but also in speech recognition grammars, 

	<?xml version="1.0" encoding="UTF-8"?>

	<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar" 
	  xml:lang="en-US" root="movies">
	    
	    <lexicon uri="http://www.example.com/lexicon_defined_above.xml"/> 
	    
	    <rule id="movies" scope="public"> 
	      <one-of> 
	        <item>Terminator 2: Judgment Day</item>
	        <item>My Big Fat Obnoxious Fiance</item>
	        <item>Pluto's Judgement Day</item>
	      </one-of>
	    </rule>
	</grammar>
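For comparison, here is a minimal sketch of what the same lexicon entry would look like if each <lexeme> could carry only one <grapheme>. It is illustrative only and is not taken from the specification; it simply shows the duplication of pronunciation information that the quoted text from Section 4.5 refers to:

```xml
<?xml version="1.0" encoding="UTF-8"?>

<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">

    <!-- Hypothetical duplicated form: one lexeme per spelling -->
    <lexeme>
      <grapheme>judgment</grapheme>
      <phoneme>ˈʤʌʤ.mənt</phoneme>
    </lexeme>
    <lexeme>
      <grapheme>judgement</grapheme>
      <!-- Same pronunciation repeated for the variant spelling -->
      <phoneme>ˈʤʌʤ.mənt</phoneme>
    </lexeme>
</lexicon>
```

Both the SSML document and the grammar shown earlier would have to rely on the two entries being kept in sync, which is what listing several <grapheme> elements in one <lexeme> avoids.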
We feel that multiple <grapheme> elements are useful for both TTS and ASR; therefore we reject your proposal to qualify the sentence with "text-to-speech" only.

Please indicate whether you are satisfied with the VBWG's resolution, whether you think there has been a misunderstanding, or whether you wish to register an objection. 
[1] http://www.w3.org/TR/2006/WD-pronunciation-lexicon-20060131/#S4.5

Paolo Baggia, editor PLS spec.

P.S. If you have trouble seeing the IPA characters, please ask me. I’ll upload an HTML
document and send you the URI.

From: <ishida@w3.org>
Date: Tue, 21 Mar 2006 17:50:11 +0000
To: www-voice@w3.org, public-i18n-core@w3.org
Message-Id: <20060321175010.01B574F400@homer.w3.org> 
	
	Comment from the i18n review of:
	http://www.w3.org/TR/2006/WD-pronunciation-lexicon-20060131/

	Comment 26
	At http://www.w3.org/International/reviews/0603-pls10/
	Editorial/substantive: E
	Owner: RI

	Location in reviewed document:
	4.5, 3rd para

	Comment: 
	"In order to remove the need for duplication of pronunciation information to cope with the above variations, the <lexeme> element may"


	Here is an example of where it might be good to distinguish between TTS and ASR. You could say: "In order to remove the need for duplication of pronunciation information to cope with the above variations during text-to-speech, the <lexeme> element may contain"




Received on Friday, 26 May 2006 14:20:16 GMT
