Re: Transliteration

From: Stephen M. Gardner <gardsm@aud.alcatel.com>
Date: Wed, 21 Oct 1998 16:11:52 -0500
Message-Id: <362E4E18.842D78D4@aud.alcatel.com>
To: Albert Lunde <albert-lunde@nwu.edu>
Cc: www-international@w3.org

Albert Lunde wrote:

> But I don't think there's big differences in the way Greek
> (or Japanese or Korean or Hebrew) are romanized for French-speaking
> or English-speaking readers.

   It is a big enough difference to cause a problem. It is not easy to do an
automated text search with the "wrong" transliteration. In fact, leaving the
issue of regular expression searches aside, I bet many Americans with no
exposure to French would have difficulty identifying the Russian name from the
French transliteration. Some examples:

   * Tchétchènes - Chechens
   * Tchernomyrdine - Chernomyrdin
   * Tchoubais - Chubais
   * Loujkov - Luzhkov
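
To make the search problem concrete, here is a minimal sketch in Python. The
rule table and the function name search_key are hypothetical, invented for
illustration; the rules cover only the four names above and are not a general
treatment of French or English transliteration of Russian.

```python
import re
import unicodedata

# Hypothetical folding rules: French-style spellings -> English-style ones.
# Illustrative only; chosen to cover the four example names, nothing more.
FRENCH_TO_ENGLISH = [
    ("tch", "ch"),
    ("ou", "u"),
    ("j", "zh"),
]

def search_key(name: str) -> str:
    """Fold a romanized name into a rough, lossy key for matching."""
    # Lowercase, then strip accents by removing combining marks.
    decomposed = unicodedata.normalize("NFD", name.lower())
    key = "".join(c for c in decomposed if not unicodedata.combining(c))
    for french, english in FRENCH_TO_ENGLISH:
        key = key.replace(french, english)
    # French spellings often add a final "e" (Tchernomyrdine);
    # drop it, keeping a plural "s" if one follows.
    return re.sub(r"e(s?)$", r"\1", key)

pairs = [
    ("Tchétchènes", "Chechens"),
    ("Tchernomyrdine", "Chernomyrdin"),
    ("Tchoubais", "Chubais"),
    ("Loujkov", "Luzhkov"),
]
for french, english in pairs:
    # First column: plain string match. Second column: match on folded keys.
    print(french == english, search_key(french) == search_key(english))
```

A plain comparison fails for all four pairs, while the folded keys agree. The
folding is lossy (it would also turn an English "Jones" into "zhones"), so it
is only usable when the searcher already knows which convention the text
follows, which is exactly the difficulty.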

>  Romanization is _not_ a change of language; it's use of
> a different script.  There's clearly more than one way
> to romanize many languages. (And you can write English
> in other scripts, i.e. Japanese Katakana (though
> it may get a bit distorted))

    This is a poor example, since katakana has only one set of accepted
sound values. Even though it seriously mangles English, there is usually a
pretty well-defined way to katakanize English. The reverse situation is not
true. There are two quite different accepted transliterations of Japanese into
Roman script, Hepburn and Kunrei-shiki (both are forms of romaji), as well as
informal transliterations. Hepburn in particular has an English bias due to the
peculiar circumstances of the history of Japanese contact with the West.
Languages other than Japanese that use non-Roman scripts are even more of a
problem. Roman script has such a wide variety of sound values across the
languages that use it that it almost doesn't make sense to talk about it as a
single script. There is a multitude of ways to transliterate into Roman script,
depending on the linguistic habits of the target audience. Think of how
differently the following language systems use Roman letters:

   * Czech
   * English
   * French
   * German
   * Magyar
   * Polish
   * Pinyin (official romanization of Mandarin)
   * Kunrei-shiki romaji (official romanization of Japanese)
   * Vietnamese

Since the Roman alphabet is so widely used across so many linguistic groups,
from Spanish to Vietnamese, there is nothing even vaguely resembling a standard
mapping of character to sound, and that is what a transliteration depends on. So
even if romanization is not a change of language, it is a change of
language-dependent mapping.
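
As a rough sketch of what "language-dependent mapping" means in practice, the
same Cyrillic string can be pushed through two different letter tables. The
tables below are assumptions made up for this illustration: they contain only
the letters needed for two of the names mentioned earlier and do not implement
any published romanization standard. The function name romanize is likewise
hypothetical.

```python
# Illustrative letter tables, NOT a published standard. They hold only the
# letters needed for the two sample names below.
ENGLISH_STYLE = {
    "л": "l", "у": "u", "ж": "zh", "к": "k", "о": "o", "в": "v",
    "ч": "ch", "б": "b", "а": "a", "й": "i", "с": "s",
}
# The French-style table differs only where French spelling habits differ.
FRENCH_STYLE = {**ENGLISH_STYLE, "у": "ou", "ж": "j", "ч": "tch"}

def romanize(text: str, table: dict) -> str:
    """Map each letter through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text.lower()).capitalize()

for name in ["Лужков", "Чубайс"]:
    print(romanize(name, ENGLISH_STYLE), romanize(name, FRENCH_STYLE))
```

The source text and the script are identical in both runs; only the table,
which encodes the reader's language, changes. This prints "Luzhkov Loujkov"
and "Chubais Tchoubais".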

> I'd love to be able to spell-check romanized Japanese
> when I write it (which illustrates that these issues
> appear in original as well as transformed texts).

But which romanization would you use? Hepburn or Kunrei-shiki? You will often
find both used in English texts that incorporate Japanese words and place names.

Steve Gardner Technical  Staff Member Q3 Agent Development
1225 N. Alma Road   Tel: 972-996-5888
Richardson Tx. 75081-2206 http://ctnwww.aud.alcatel.com/~gardsm/

Still a lot of lands to see but I wouldn't want to stay here,
it's too old and cold and settled in its ways here.
     --Joni Mitchell
     "California" (Blue Album)
Received on Thursday, 22 October 1998 08:39:12 UTC
