Re: (i18n.390) RE: Transliteration standards: possible impact on internationalization

From: <Harald.T.Alvestrand@uninett.no>
Date: Wed, 19 Nov 1997 05:17:46 -0500
To: mgm@sybase.com (Michael G. McKenna)
cc: rosenne@NetVision.net.il, manuel.carrasco@emea.eudra.org, Converse@sesame.demon.co.uk, i18n@dkuug.dk, xojig@xopen.co.uk, sc22wg14@dkuug.dk, www-international@w3.org, wgi18n@terena.nl, keld@dkuug.dk
Message-ID: <22818.879934666@dale.uninett.no>

mgm@sybase.com said:
> Unfortunately, the 639 language code does not cover regional 
> differences, for instance between US English and International 
> English.
As long as regions = countries, ISO 639 specifies exactly how to do it.

For instance, Russian Russian is "ru RU", Byelorussian Russian is
"ru BY".
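
As a rough illustration of the language-plus-country idea, here is a minimal
Python sketch that splits such a tag into its parts. It assumes the hyphenated
RFC 1766 spelling ("ru-RU", "ru-BY") and the RFC 1766 convention that a
two-letter first subtag is an ISO 3166 country code. The function name
split_language_tag is made up for this example, and the sketch does not check
the codes against the ISO 639 or ISO 3166 lists.

```python
import re

# RFC 1766 grammar: a primary tag of 1 to 8 letters, followed by
# zero or more "-"-separated subtags of 1 to 8 letters each.
_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def split_language_tag(tag):
    """Split a tag such as 'ru-BY' into (language, country, other_subtags).

    Hypothetical helper for illustration only; it checks the shape of
    the tag, not whether the codes are actually assigned.
    """
    if not _TAG.fullmatch(tag):
        raise ValueError(f"not an RFC 1766 style tag: {tag!r}")
    primary, *subtags = tag.split("-")
    country = None
    # Assumption taken from RFC 1766: a two-letter first subtag is
    # read as an ISO 3166 country code.
    if subtags and len(subtags[0]) == 2:
        country = subtags[0].upper()
        subtags = subtags[1:]
    return primary.lower(), country, subtags


print(split_language_tag("ru-RU"))  # ('ru', 'RU', [])
print(split_language_tag("ru-BY"))  # ('ru', 'BY', [])
```

Anything after the country subtag is simply passed through, since RFC 1766
leaves further subtags open to registration.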

The still-not-standardized ISO 639 part 2 suggests adding codes to cover
some aspects of history, for instance "ger" is German, "gmh" is
Middle High German (1050-1500), and "goh" is Old High German (750-1050).

My feeling is that we won't know what we want to differentiate on
until we get some real experience from real life; that is the reason
RFC 1766 gives an essentially open field for registering the
specific differences that people need.

My feeling is also that schemes like t-<sl>-<sd>-<ss>-<tl>-<td>-<ts>
are essentially useless; they give us tags that are monstrous overkill
for the simple case, and still can't express what we want to express
in the complex case.

If you want to really go overboard, why not add a Text-History tag:

<event>Translated from French to Hebrew with Latin script
       (because of lack of a proper typewriter)
<event>Transliterated from Latin script to Hebrew script
       (but tone marks left out because of 10646 Level 1 compliance)
<event>Tone marks added
<event>41 errors introduced by the above process corrected by
       a French-speaking Hasidic Jew
<event>A serious religious misunderstanding based on Hasidic tradition
       corrected by a right-thinking Sephardic Jew
(here goes the mess that resulted from all these operations)

The Language tag should IMHO be *only* the *CURRENT* language of the
script, to a useful approximation.

My $0.03 (when I type this much, I want more than 2 cents :-)

                         Harald A
Received on Wednesday, 19 November 1997 05:18:03 UTC
