Re: [SSML11] i18n comment 4: zh-CN-HK

On Wed, May 7, 2008 at 1:05 PM, Dan Burnett <dburnett@voxeo.com> wrote:

> Forwarding to VBWG to include remaining SSML participants (particularly in
> China) on this thread.
>
> Here are my comments:
> 1) Forgot to point out in my reply to Richard's summary that we had
> already agreed to switch de-SU to de-CH (a goof on my part, using "Suisse"
> rather than "Confoederatio Helvetica").
> 2) Actually we do specifically intend to indicate that the Hong Kong
> accent is Cantonese.  So how can we do this?
>

For now, to avoid the use of codes that are likely to be deprecated in the
future, you are better off using a different example to make your point.
Just change Chinese to Japanese in your examples.

A second thing:

> "As another example, if we have <voice languages="fr:zh"> and there is no
> voice that supports French with a Chinese accent, then a voice selection
> failure will occur."

When I read language in a spec like "a failure will occur" or "a failure
must occur" I immediately ask whether or not that is well-defined. What
degree of Chinese accent would suffice to prevent a failure? This appears to
be sufficiently imprecise as to make conformance untestable. It probably
should be "should occur" or "may occur".
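
To make the testability question concrete, here is a minimal sketch in
Python of what a deterministic check could look like. Everything in it is
my assumption rather than anything the draft defines: the function names,
the idea that each voice declares its own (language, accent) tag pairs, and
the use of case-insensitive prefix matching on subtag boundaries. Under
those assumptions, "failure" depends only on what a voice declares, never
on how strong the accent sounds.

```python
def parse_languages(value):
    """Split a languages value such as "en:zh fr:de" into (language, accent) pairs."""
    pairs = []
    for item in value.split():
        language, _, accent = item.partition(":")
        # An item without ":accent" means any accent is acceptable.
        pairs.append((language, accent or None))
    return pairs


def tag_matches(requested, supported):
    """Assumed rule: prefix match on subtag boundaries, so "zh" matches "zh-HK"."""
    requested, supported = requested.lower(), supported.lower()
    return supported == requested or supported.startswith(requested + "-")


def voice_matches(languages_value, voice):
    """voice is a hypothetical list of (language_tag, accent_tag) pairs it declares."""
    for language, accent in parse_languages(languages_value):
        satisfied = any(
            tag_matches(language, v_language)
            and (
                accent is None
                or (v_accent is not None and tag_matches(accent, v_accent))
            )
            for v_language, v_accent in voice
        )
        if not satisfied:
            return False  # this is where a voice selection failure would be raised
    return True
```

With a voice declared as [("en-US", "zh-HK"), ("fr-CA", "de-AT")], the
value "en:zh fr:de" matches and "fr:zh" does not. What qualifies a voice to
declare "zh-HK" as an accent in the first place is the part that still
needs a definition.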



> -- dan
>
>
> Begin forwarded message:
>
> *From: *"Phillips, Addison" <addison@amazon.com>
> *Date: *May 7, 2008 3:38:55 PM EDT
> *To: *Dan Burnett <dburnett@voxeo.com>, Richard Ishida <ishida@w3.org>
> *Cc: *"jim@larson-tech.com" <jim@larson-tech.com>, "ashimura@w3.org"
> <ashimura@w3.org>, "scott.mcglashan@hp.com" <scott.mcglashan@hp.com>,
> "public-i18n-core@w3.org" <public-i18n-core@w3.org>
> *Subject: **RE: [SSML11] i18n comment 4: zh-CN-HK*
>
> Hi Dan,
>
> I have the action item. There is some dispute right now about handling
> tagging of Chinese languages--especially in an audio context--so I'm
> hesitant to give a Chinese example. However, once the dust settles it would
> be good to provide one in your document in particular.
>
> That said, I've just looked at the text in question, which reads:
>
> --
> For example, a languages value of "en:zh fr:de" can legally be matched by
> any voice that can both read English (speaking it with a Chinese accent) and
> read French (speaking it with a German accent). Thus, a voice that only
> supports "en-US" with a "zh-CN-HK" accent and "fr-CA" with a "de-SU" accent
> would match. As another example, if we have <voice languages="fr:zh"> and
> there is no voice that supports French with a Chinese accent, then a voice
> selection failure will occur. Note that if no accent indication is given for
> a language, then any voice that speaks the language is acceptable,
> regardless of accent. Also, note that author control over language support
> during voice selection is independent of any value of xml:lang in the text.
> --
>
> To make it both correct and non-controversial requires only a minor
> change. I would suggest changing it thus:
>
> --
> For example, a languages value of "en:zh fr:de" can legally be matched by
> any voice that can both read English (speaking it with a Chinese accent) and
> read French (speaking it with a German accent). Thus, a voice that only
> supports "en-US" with a "zh-HK" accent and "fr-CA" with a "de-AT" accent
> would match. As another example, if we have <voice languages="fr:zh"> and
> there is no voice that supports French with a Chinese accent, then a voice
> selection failure will occur. Note that if no accent indication is given for
> a language, then any voice that speaks the language is acceptable,
> regardless of accent. Also, note that author control over language support
> during voice selection is independent of any value of xml:lang in the text.
> --
>
> That is: s/zh-CN-HK/zh-HK/ and s/de-SU/de-AT/
>
> The tag "zh-CN-HK" is, as noted, illegal. The tag "zh-HK" means "Chinese
> as used in Hong Kong SAR" (note that this suggests but does not specify a
> "Cantonese" accent). The tag "de-SU" would be "German as used in the former
> Soviet Union", which is possible (see: Kaliningrad), but also extremely
> unlikely. The 'AT' subtag represents Austria.
>
> It should be noted that the current debate about tagging Chinese partially
> revolves around the fact that spoken Chinese languages/dialects, while all
> being "Chinese", are not all mutually intelligible. The debate is whether
> language tags should take the form of "zh-(something)" (indicating the
> relationship to Chinese) or just use their specific language subtags
> directly (such as 'yue' for Cantonese, 'cmn' for Mandarin, 'nan' for Min
> Nan, etc.). If the SSML WG has an opinion about this, it would be extremely
> valuable to the I18N WG and those of us engaged in work on language
> identification. I'd be happy to provide (in a separate thread) suitable
> background, etc.
>
> Best Regards,
>
> Addison
>
> Addison Phillips
> Globalization Architect -- Lab126
>
> Internationalization is not a feature.
> It is an architecture.
>
>
> -----Original Message-----
> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org]
> On Behalf Of Dan Burnett
> Sent: Wednesday, May 07, 2008 12:15 PM
> To: Richard Ishida
> Cc: jim@larson-tech.com; ashimura@w3.org; scott.mcglashan@hp.com;
> public-i18n-core@w3.org
> Subject: Re: [SSML11] i18n comment 4: zh-CN-HK
>
>
> s/an accent that is different/an accent that is different from the
> expected/common accent for the voice's language/
>
> Also, we would love to receive a new/better example from Addison that
> meets this criterion.
>
> -- dan
>
> On May 7, 2008, at 3:13 PM, Richard Ishida wrote:
>
> My notes from the FTF in Beijing:
>
> Happy to change the tag, but want to keep the idea of an accent that is
> different.
> Question about whether to use yue or zh-yue.
>
> RI
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/International/
> http://rishida.net/blog/
> http://rishida.net/
>
>
>
> -----Original Message-----
> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org]
> On Behalf Of ishida@w3.org
> Sent: 07 April 2008 16:22
> To: dburnett@voxeo.com; jim@larson-tech.com; ashimura@w3.org;
> scott.mcglashan@hp.com; public-i18n-core@w3.org
> Subject: [SSML11] i18n comment 4: zh-CN-HK
>
>
> Comment from the i18n review of:
> http://www.w3.org/TR/2008/WD-speech-synthesis11-20080317/
>
> Comment 4
> At http://www.w3.org/International/reviews/0804-ssml11/Overview.html
> Editorial/substantive: E
> Tracked by: AP
>
> Location in reviewed document:
> 3.2.1 [http://www.w3.org/TR/2008/WD-speech-synthesis11-20080317/#S3.2.1]
>
> Comment:
> zh-CN-HK is an illegal language tag (in one of the examples). It might be
> better to avoid a Chinese example, at least initially ... if you want
> control over which *language* is used, you should use cmn or yue tags
> rather than zh-CN etc.
>
> Addison Phillips has taken an action to propose an alternative paragraph
> or two for the example.
>
>
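
For anyone wondering why zh-CN-HK is rejected while zh-HK and de-SU are
not, the following is a rough structural check in Python, assuming a
simplified language-script-region-variant shape for tags. The pattern and
the name is_well_formed are mine, and extlang, extension, private-use and
grandfathered forms are deliberately left out, so treat it as an
illustration and not a full parser.

```python
import re

# Simplified tag shape (an assumption for illustration): a 2-3 letter
# language subtag, then an optional 4-letter script, an optional region
# (2 letters or 3 digits), and any number of variants.
LANGTAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$",
    re.IGNORECASE,
)


def is_well_formed(tag):
    """Return True if tag fits the simplified shape above (shape only, no registry lookup)."""
    return LANGTAG.match(tag) is not None
```

It rejects zh-CN-HK, because a second two-letter subtag after the region
fits no remaining slot, and accepts zh-HK, de-AT and de-SU. Passing this
check only means the tag has a legal shape; whether a given subtag is
registered or sensible is a separate question.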


-- 
Mark

Received on Wednesday, 7 May 2008 20:54:30 UTC