Re: Upcoming changes to BCP47 (language tag) syntax

Addison wrote:
> 
> - Use 'cmn' when you need to indicate "Mandarin Chinese" as distinct 
> from Chinese in general. This case is rare and should be avoided 
> whenever possible. A better way to write that would be: "Do NOT use 
> 'cmn' unless you have a Very Good Reason."

I'm a little worried by this text. These tags are intended to identify 
audio and text content. It is common for those classifying audiovisual 
content to maintain separate lists of "dubbing" (or audio) languages and 
"subtitle" (or text) languages. This strategy is built into many of the 
standards we use today.

I need to distinguish dubbed Taiwanese Mandarin from Mainland Mandarin 
from Cantonese. With RFC 4646bis as it is currently written, I planned to 
consider "zh" to be a text or subtitle language, with extensions for 
script variants, and use "cmn" and "yue" for the audio languages with 
their own variants. (In subtitle form, I need to distinguish Mandarin, 
Taiwanese, and Cantonese.) Previous ISO language standards mostly ignored 
audio forms of Chinese, creating a lot of dirty data in my industry because 
we need these classifications. I hope we're not making this mistake again.

The problem with the strategy above is that I will then have two unique 
tags for the same language audience (dubbed or subtitled mainland Mandarin 
speakers, for example). I thought this was a good argument for extlang, 
which has been removed from RFC 4646bis.
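
The overlap described above can be sketched roughly as follows. This is 
only an illustration under stated assumptions: the audience names, the 
table and function names, and the specific region and script subtags 
(cmn-CN, zh-Hans-CN, and so on) are hypothetical choices, not tags taken 
from this message or from any studio's catalogue.

```python
# Hypothetical audio ("dubbing") tags: spoken forms get their own
# primary language subtags (cmn, yue). Region subtags are assumptions.
AUDIO_TAGS = {
    "Mainland Mandarin": "cmn-CN",
    "Taiwanese Mandarin": "cmn-TW",
    "Cantonese": "yue-HK",
}

# Hypothetical subtitle (text) tags: written forms stay under "zh",
# distinguished by script and region subtags. Also assumptions.
SUBTITLE_TAGS = {
    "Mainland Mandarin": "zh-Hans-CN",
    "Taiwanese Mandarin": "zh-Hant-TW",
    "Cantonese": "zh-Hant-HK",
}


def primary_subtag(tag: str) -> str:
    """Return the primary language subtag (the part before the first hyphen)."""
    return tag.split("-")[0].lower()


def tags_for_audience(audience: str) -> dict:
    """Return the audio and subtitle tags assigned to one audience."""
    return {"audio": AUDIO_TAGS[audience], "subtitle": SUBTITLE_TAGS[audience]}


def has_split_primary(audience: str) -> bool:
    """True when one audience ends up under two different primary subtags."""
    tags = tags_for_audience(audience)
    return primary_subtag(tags["audio"]) != primary_subtag(tags["subtitle"])


if __name__ == "__main__":
    for name in AUDIO_TAGS:
        print(name, tags_for_audience(name), has_split_primary(name))
```

Under these assumed tables every audience is split across two primary 
subtags ("cmn" or "yue" for audio, "zh" for text), so a plain match on 
the primary subtag will not group the dubbed and subtitled versions for 
the same viewers.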

I don't think that the use of "cmn" will be "very rare", so I would not 
like to see Addison's advice to avoid "cmn" written into any guidelines.

This sounds more and more like a narrowing of the semantics of the 
original "zh" tag. We've got a lot of Cantonese and Taiwanese content 
categorized against the "zh" tag today, and I know this is true of the 
massive content stores at other studios.

Regards,

Karen Broome
Sony Pictures Entertainment

Received on Tuesday, 22 January 2008 20:01:01 UTC