
Describing other cultural aspects of the content

From: Masaki Itagaki <masaki_itagaki@aliquantuminc.com>
Date: Fri, 15 Apr 2005 02:38:49 -0600
To: <public-i18n-its@w3.org>
Message-ID: <129.16778.1120552821@automsgid.listhub.w3.org>
Basically this requirement is the same as the one in the original draft, but I added the issue of writing styles, as I discussed on the mailing lists and in conference calls. As for the core portion of this requirement, it's highly likely that I missed something or got something wrong. Please post any comments on this.





It must be possible to declare more information about content than a language/locale, for better text parsing and content reusability. Aspects that require a finer-grained content specification may include script usage, geographical area, dialect, or content context. Such a declaration should be made at the beginning of a document. Any content within the document that varies from the primary declaration should be labeled appropriately.
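
As a rough illustration of the "declare at the beginning, label the exceptions" model, here is a minimal Python sketch. The Node class and the attribute names (language, region, date_format) are hypothetical and only stand in for whatever mechanism is eventually chosen; the point is that a piece of content inherits the primary declaration unless it carries its own label.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A piece of content with optional declarations (hypothetical model)."""
    name: str
    declared: dict = field(default_factory=dict)
    parent: Optional["Node"] = None

    def effective(self) -> dict:
        """Merge declarations from the document root down to this node.

        Values declared closer to the content override inherited ones.
        """
        inherited = self.parent.effective() if self.parent else {}
        return {**inherited, **self.declared}


# Primary declaration at the beginning of the document.
doc = Node("document", {"language": "ja", "region": "JP", "date_format": "gregorian"})
# Content that does not vary declares nothing and inherits everything.
body = Node("body", parent=doc)
# Content that varies from the primary declaration is labeled locally.
note = Node("note", {"date_format": "wareki"}, parent=body)

print(body.effective())
print(note.effective())
```

Only the note has to say anything about itself; everything else falls back to the declaration at the top of the document.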


Background:

In order to successfully and efficiently parse document content, there should be more information than a language or a locale. Examples of issues are: 

           A language/locale cannot perfectly represent orthography: e.g. “zh-CN” does not stipulate whether the text is in simplified or traditional Chinese. A locale for Yugoslavia does not provide guidance as to whether the language should be written in Latin or Cyrillic script.

           Multiple cultural preferences within one locale: e.g. In Japanese (“ja-JP”), there are two official date formats – Japanese emperor date (Wareki) and a standard numeric date format (Yoreki).

           Finer language variations: e.g. how does one indicate that a voice track is in the language spoken in German-speaking Switzerland rather than the language written there, since one is Schwytzertuutsch (Swiss German) and the other is very close to but not the same as 'High German'? How does one indicate that a piece of content is in 'International Spanish'? How does one indicate that this is English as spoken in the time of Chaucer?

           Different writing styles and tones in one language: e.g. Japanese uses a polite style (“Desu/masu tone”) for user guides and a formal style (“Da/dearu tone”) for academic and legal content. Italian uses an informal style for software help content and a formal style for user guides.   
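
The orthography issue in the first item can be made concrete with a small Python sketch. It assumes tags of the language-script-region shape used by later revisions of the language tag standard (BCP 47), where an optional four-letter script subtag sits between language and region. The pattern is deliberately simplified and the names parse_tag and states_script are only illustrative; it is not a full tag parser.

```python
import re
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


# Simplified pattern: language, optional 4-letter script, optional 2-letter region.
# It deliberately ignores variants, extensions, and private-use subtags.
_TAG = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{4}))?(?:-([A-Za-z]{2}))?$")


def parse_tag(tag: str) -> Tag:
    """Split a simple tag into language, script, and region subtags."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported tag: {tag!r}")
    language, script, region = match.groups()
    return Tag(
        language.lower(),
        script.title() if script else None,
        region.upper() if region else None,
    )


def states_script(tag: str) -> bool:
    """Return True if the tag itself says which script is used."""
    return parse_tag(tag).script is not None


for tag in ("zh-CN", "zh-Hans-CN", "sr-Cyrl", "sr-Latn"):
    print(tag, parse_tag(tag), states_script(tag))
```

A plain language/locale pair such as "zh-CN" carries no script information, so a parser has to guess; a tag with an explicit script subtag does not leave that open. The other items in the list (date conventions, spoken versus written variants, writing style) are not covered by this sketch.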


Identifying these variations is especially important for content reusability. For example, the same source-language content could be translated into two different target-language content units, depending on a context that calls for different writing styles (e.g. formal and informal in Italian). When the content is reused in both source and target languages, context information (such as whether the content is for a user guide or for online help) must be provided in order to reuse content with an appropriate writing style.
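
To sketch what this could mean for a reuse mechanism, the Python fragment below keys stored translations on both the source text and a context label. The store and reuse functions, the context labels, and the Italian strings are all hypothetical and for illustration only; no existing tool or format is implied.

```python
from typing import Dict, Optional, Tuple

# Hypothetical store: (source text, context) -> target-language text.
memory: Dict[Tuple[str, str], str] = {}


def store(source: str, context: str, target: str) -> None:
    """Record a translation together with the context it was written for."""
    memory[(source, context)] = target


def reuse(source: str, context: str) -> Optional[str]:
    """Return a stored translation only if it was made for the same context."""
    return memory.get((source, context))


# Same source sentence, two Italian renderings in different styles.
store("Click Save.", "help", "Fai clic su Salva.")         # informal style
store("Click Save.", "user_guide", "Fare clic su Salva.")  # formal style

print(reuse("Click Save.", "help"))
print(reuse("Click Save.", "user_guide"))
print(reuse("Click Save.", "legal"))  # no translation stored for this context
```

Without the context in the key, the second store call would overwrite the first, and the wrong style could be reused.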


Received on Friday, 15 April 2005 08:40:03 UTC
