Fwd: [SSML11] i18n comment 4: zh-CN-HK

Forwarding to VBWG to include remaining SSML participants  
(particularly in China) on this thread.


Here are my comments:
1) Forgot to point out in my reply to Richard's summary that we had  
already agreed to switch de-SU to de-CH (a goof on my part, using  
"Suisse" rather than "Confoederatio Helvetica").
2) Actually we do specifically intend to indicate that the Hong Kong  
accent is Cantonese.  So how can we do this?

-- dan


Begin forwarded message:

> From: "Phillips, Addison" <addison@amazon.com>
> Date: May 7, 2008 3:38:55 PM EDT
> To: Dan Burnett <dburnett@voxeo.com>, Richard Ishida <ishida@w3.org>
> Cc: "jim@larson-tech.com" <jim@larson-tech.com>, "ashimura@w3.org"  
> <ashimura@w3.org>, "scott.mcglashan@hp.com"  
> <scott.mcglashan@hp.com>, "public-i18n-core@w3.org"  
> <public-i18n-core@w3.org>
> Subject: RE: [SSML11] i18n comment 4: zh-CN-HK
>
> Hi Dan,
>
> I have the action item. There is some dispute right now about  
> handling tagging of Chinese languages--especially in an audio  
> context--so I'm hesitant to give a Chinese example. However, once  
> the dust settles it would be good to provide one in your document  
> in particular.
>
> That said, I've just looked at the text in question, which reads:
>
> --
> For example, a languages value of "en:zh fr:de" can legally be  
> matched by any voice that can both read English (speaking it with a  
> Chinese accent) and read French (speaking it with a German accent).  
> Thus, a voice that only supports "en-US" with a "zh-CN-HK" accent  
> and "fr-CA" with a "de-SU" accent would match. As another example,  
> if we have <voice languages="fr:zh"> and there is no voice that  
> supports French with a Chinese accent, then a voice selection  
> failure will occur. Note that if no accent indication is given for  
> a language, then any voice that speaks the language is acceptable,  
> regardless of accent. Also, note that author control over language  
> support during voice selection is independent of any value of  
> xml:lang in the text.
> --
>
> To make it both correct and non-controversial requires only a minor  
> change. I would suggest changing it thus:
>
> --
> For example, a languages value of "en:zh fr:de" can legally be  
> matched by any voice that can both read English (speaking it with a  
> Chinese accent) and read French (speaking it with a German accent).  
> Thus, a voice that only supports "en-US" with a "zh-HK" accent and  
> "fr-CA" with a "de-AT" accent would match. As another example, if  
> we have <voice languages="fr:zh"> and there is no voice that  
> supports French with a Chinese accent, then a voice selection  
> failure will occur. Note that if no accent indication is given for  
> a language, then any voice that speaks the language is acceptable,  
> regardless of accent. Also, note that author control over language  
> support during voice selection is independent of any value of  
> xml:lang in the text.
> --
>
> That is: s/zh-CN-HK/zh-HK/ and s/de-SU/de-AT/
>
> The tag "zh-CN-HK" is, as noted, illegal. The tag "zh-HK" means  
> "Chinese as used in Hong Kong SAR" (note that this suggests but  
> does not specify a "Cantonese" accent). The tag "de-SU" would be  
> "German as used in the former Soviet Union", which is possible  
> (see: Kaliningrad), but also extremely unlikely. The 'AT' subtag  
> represents Austria.
>
> It should be noted that the current debate about tagging Chinese  
> partially revolves around the fact that spoken Chinese  
> languages/dialects, while all being "Chinese", are not all mutually  
> intelligible. The debate is whether language tags should take the  
> form of "zh-(something)" (indicating the relationship to Chinese)  
> or just use their specific language subtags directly (such as 'yue'  
> for Cantonese, 'cmn' for Mandarin, 'nan' for Min Nan, etc.). If the  
> SSML WG has an opinion about this, it would be extremely valuable  
> to the I18N WG and those of us engaged in work on language  
> identification. I'd be happy to provide (in a separate thread)  
> suitable background, etc.
>
> Best Regards,
>
> Addison
>
> Addison Phillips
> Globalization Architect -- Lab126
>
> Internationalization is not a feature.
> It is an architecture.
>
>
>> -----Original Message-----
>> From: public-i18n-core-request@w3.org
>> [mailto:public-i18n-core-request@w3.org] On Behalf Of Dan Burnett
>> Sent: Wednesday, May 07, 2008 12:15 PM
>> To: Richard Ishida
>> Cc: jim@larson-tech.com; ashimura@w3.org; scott.mcglashan@hp.com;
>> public-i18n-core@w3.org
>> Subject: Re: [SSML11] i18n comment 4: zh-CN-HK
>>
>>
>> s/an accent that is different/an accent that is different from the
>> expected/common accent for the voice's language/
>>
>> Also, we would love to receive a new/better example from Addison that
>> meets this criterion.
>>
>> -- dan
>>
>> On May 7, 2008, at 3:13 PM, Richard Ishida wrote:
>>
>>> My notes from the FTF in Beijing:
>>>
>>> Happy to change the tag, but want to keep the idea of an accent
>>> that is different.
>>> Question about whether to use yue or zh-yue.
>>>
>>> RI
>>>
>>> ============
>>> Richard Ishida
>>> Internationalization Lead
>>> W3C (World Wide Web Consortium)
>>>
>>> http://www.w3.org/International/
>>> http://rishida.net/blog/
>>> http://rishida.net/
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: public-i18n-core-request@w3.org
>>>> [mailto:public-i18n-core-request@w3.org] On Behalf Of ishida@w3.org
>>>> Sent: 07 April 2008 16:22
>>>> To: dburnett@voxeo.com; jim@larson-tech.com; ashimura@w3.org;
>>>> scott.mcglashan@hp.com; public-i18n-core@w3.org
>>>> Subject: [SSML11] i18n comment 4: zh-CN-HK
>>>>
>>>>
>>>> Comment from the i18n review of:
>>>> http://www.w3.org/TR/2008/WD-speech-synthesis11-20080317/
>>>>
>>>> Comment 4
>>>> At http://www.w3.org/International/reviews/0804-ssml11/Overview.html
>>>> Editorial/substantive: E
>>>> Tracked by: AP
>>>>
>>>> Location in reviewed document:
>>>> 3.2.1 [http://www.w3.org/TR/2008/WD-speech-synthesis11-20080317/#S3.2.1]
>>>>
>>>> Comment:
>>>> zh-CN-HK is an illegal language tag (in one of the examples). It
>>>> might be better to avoid a Chinese example, at least initially ...
>>>> if you want control over which *language* is used, you should use
>>>> cmn or yue tags rather than zh-CN etc.
>>>>
>>>>
>>>> Addison Phillips has taken an action to propose an alternative
>>>> paragraph or two for the example.
>>>>
>>>>
>>>
>>>
>>
>
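
A minimal Python sketch of the voice-selection rule quoted in the
thread above, for readers who want to try the examples. It assumes
simple subtag-prefix matching (in the style of RFC 4647 basic
filtering), which the thread itself does not spell out. The function
names and the (language, accent) pair representation of a voice are
hypothetical, and the well-formedness pattern is a deliberately
simplified subset of the language-tag grammar, not a full validator.

```python
import re

# Simplified well-formedness pattern (an assumption, not the full grammar):
# language, optional extended language subtags, script, region, variants.
_LANGTAG = re.compile(
    r"^[a-z]{2,3}"                                 # primary language, e.g. zh
    r"(?:-[a-z]{3}){0,3}"                          # extended language, e.g. yue
    r"(?:-[a-z]{4})?"                              # script, e.g. Hant
    r"(?:-(?:[a-z]{2}|[0-9]{3}))?"                 # at most one region, e.g. HK
    r"(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*$",  # variants
    re.IGNORECASE,
)


def is_well_formed(tag):
    """Return True if the tag fits the simplified pattern above."""
    return _LANGTAG.match(tag) is not None


def tag_matches(requested, supported):
    """Subtag-prefix match: 'zh' matches 'zh-HK' but not 'zho'."""
    req = requested.lower().split("-")
    sup = supported.lower().split("-")
    return sup[:len(req)] == req


def parse_languages(value):
    """Split 'en:zh fr:de' into [('en', 'zh'), ('fr', 'de')].

    The accent is None when no ':accent' part is given.
    """
    pairs = []
    for item in value.split():
        lang, _, accent = item.partition(":")
        pairs.append((lang, accent or None))
    return pairs


def voice_matches(languages, voice):
    """Check a languages value against a voice.

    `voice` is a hypothetical list of (language tag, accent tag) pairs
    that the voice supports. Every requested language must be covered;
    a missing accent accepts any accent.
    """
    for lang, accent in parse_languages(languages):
        covered = any(
            tag_matches(lang, v_lang)
            and (accent is None or tag_matches(accent, v_accent))
            for v_lang, v_accent in voice
        )
        if not covered:
            return False
    return True


# The voice from the corrected example text.
example_voice = [("en-US", "zh-HK"), ("fr-CA", "de-AT")]
```

Under these assumptions, voice_matches("en:zh fr:de", example_voice)
succeeds, voice_matches("fr:zh", example_voice) fails (a voice
selection failure), and is_well_formed("zh-CN-HK") is False because
the tag carries two region-like subtags.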

Received on Wednesday, 7 May 2008 20:06:31 UTC