
Re: (i18n.390) RE: Transliteration standards: possible impact on internationalization

From: Michael G. McKenna <mgm@sybase.com>
Date: Tue, 18 Nov 1997 16:27:47 -0800
Message-Id: <9711190027.AA23221@constantine.sybase.com>
To: rosenne@NetVision.net.il, Harald.T.Alvestrand@uninett.no, manuel.carrasco@emea.eudra.org
Cc: Converse@sesame.demon.co.uk, i18n@dkuug.dk, xojig@xopen.co.uk, sc22wg14@dkuug.dk, www-international@w3.org, wgi18n@terena.nl, keld@dkuug.dk
[Mike]

Unfortunately, the ISO 639 language code does not cover regional
differences, for instance between US English and International
English.  This may not be much of a problem for the target
language, but it may make a difference when choosing the source
language.

In Russian, for instance, is the source language White Russian
or contemporary Russian?  And what script is it in?
Serbo-Croatian is commonly written in a Latin script in Croatian
areas, but in a Cyrillic script in Serbian areas.

It might look a little ugly, but the X-windows font specifier strings
may be of some use as a starting template.  Perhaps something like:


t-<sl>-<sd>-<ss>-<tl>-<td>-<ts>

Where:
	sl - source language, using ISO 639
	sd - source dialect, perhaps using ISO 3166 (I know, even this
		has deficiencies)
	ss - source script (we'll need script identifiers)

	tl - target language
	td - target dialect
	ts - target script

Any value can be a default or wild card.  So,
	French transliterated into Hebrew  = t-fr-*-*-iw-*-*
	French transliterated into Russian = t-fr-*-*-ru-*-*
	Russian transliterated into Serbo-Croatian in a Latin script
					= t-ru-*-cy-sh-*-la

		where 	cy = Cyrillic
			la = Latin

This may be overkill, but we do need some sort of modifier part for
regional differences.
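
To make the proposed seven-field format concrete, here is a minimal
Python sketch of parsing and wildcard-matching such tags.  The names
TranslitTag, parse_tag, format_tag and matches are hypothetical and not
part of RFC 1766 or any other standard.  The matching rule (a "*" in
the pattern accepts any value in that field) is an assumption about how
the wildcards would behave, not something the proposal specifies.

```python
from typing import NamedTuple

WILDCARD = "*"  # assumed meaning: "default or any value" for a field


class TranslitTag(NamedTuple):
    """One t-<sl>-<sd>-<ss>-<tl>-<td>-<ts> tag, split into its fields."""
    sl: str  # source language (ISO 639 code)
    sd: str  # source dialect (perhaps an ISO 3166 code)
    ss: str  # source script (e.g. "cy", "la")
    tl: str  # target language
    td: str  # target dialect
    ts: str  # target script


def parse_tag(tag: str) -> TranslitTag:
    """Split a tag string into fields, rejecting malformed input."""
    parts = tag.split("-")
    if len(parts) != 7 or parts[0] != "t":
        raise ValueError(f"expected 't' plus six fields, got {tag!r}")
    if any(part == "" for part in parts[1:]):
        raise ValueError(f"empty field in {tag!r}")
    return TranslitTag(*parts[1:])


def format_tag(tag: TranslitTag) -> str:
    """Rebuild the string form from the six fields."""
    return "-".join(("t",) + tuple(tag))


def matches(pattern: TranslitTag, tag: TranslitTag) -> bool:
    """True if every pattern field is a wildcard or equals the tag field."""
    return all(p == WILDCARD or p == v for p, v in zip(pattern, tag))
```

With these definitions, matches(parse_tag("t-ru-*-*-sh-*-*"),
parse_tag("t-ru-*-cy-sh-*-la")) is true, while a pattern that insists
on a Cyrillic target script ("t-ru-*-*-sh-*-cy") does not match the
same tag.  A two-field tag such as "t-fr-iw" is rejected by parse_tag,
since this sketch only handles the full seven-field form.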

My $0.02,

	Mike____

> [Carrasco 1]
> > >Transliteration should be coded in RFC 1766 (Mr. Alvestrand ?).
> > >
> > >For example:
> > >
> > >  t-xx
> > >
> > >where
> > >  t   : transliteration
> > >  xx : a 639 language code
> > 
> > [Rosenne]
> > A second argument is needed: the language into which the text is
> > transliterated. Obviously, French transliterated to Hebrew is
> > different from French transliterated into Russian.
> > 
> > [Carrasco 2]
> > 
> > So one needs to code:
> > 
> >  -  t    :  transliteration indicator
> >  - ss  :  a 639 language code ; source language (language transliterated from)
> >  - tt    :  a 639 language code ; target language  (language transliterated into)
> > 
> > Examples:
> >  French transliterated into Hebrew  = t-fr-iw
> >  French transliterated into Russian = t-fr-ru
> > 
> > Questions:
> >   - Do any other parameters need to be coded?
> >   - Does this break RFC 1766?
> > 
> > Regards
> > Tomas
> > 
> 
Received on Tuesday, 18 November 1997 19:34:36 GMT
