Re: Internationali?ation

Bernard Chester wrote:

> That's because the focus is wrong.  It's not codepages, nor languages
> that you focus on when doing Localization, it's locales.
> A locale captures all of these cultural expectations.  A locale has a
> specific language dialect, and conventions like number and currency
> display.
> FR is a language (ISO 639); ca is a country (ISO 3166); FR-ca is a
> locale designator (this is a different French than used in Paris!)

Indeed, that is a particular model, though not a very useful one in the 
context of the *World Wide* Web, as it mixes levels.

Data should be encoded using universal schemes: Text should be encoded 
using the Universal Character Set (Unicode), tagged with language 
information to enable operations such as conversion to speech, hyphenation, 
line breaking, spell-checking, culturally-sensitive glyph-selection and so 
on.  Numerical data, such as dates, amounts of money, etc., should be 
encoded using the appropriate canonical form, for instance the ISO 8601 
standard for dates, the ISO 4217 standard for currency codes and so on.
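
As a rough illustration of the storage side, here is a minimal Python sketch of a record held in locale-neutral form. The class name `StoredItem`, its field names and the sample values are hypothetical, chosen only for illustration; nothing here prescribes a particular tagging scheme for the language information.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class StoredItem:
    """Hypothetical record kept in canonical, locale-neutral form."""
    text: str        # Unicode text
    language: str    # language tag for the text, e.g. "fr"
    when: date       # serialised as ISO 8601 (YYYY-MM-DD)
    amount: Decimal  # exact decimal value, no display formatting
    currency: str    # ISO 4217 currency code, e.g. "GBP"

    def serialise(self) -> dict:
        """Return the canonical form; no locale conventions are applied."""
        return {
            "text": self.text,
            "language": self.language,
            "date": self.when.isoformat(),
            "amount": str(self.amount),
            "currency": self.currency,
        }


# Sample values are made up for the example.
item = StoredItem("Bonjour", "fr", date(1997, 10, 17), Decimal("1234.50"), "GBP")
record = item.serialise()
```

Nothing in the record says how the date or the amount should look on screen; that decision is left to whoever displays it.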

The *locale* comes into its own in the realm of user preferences, for both 
input and output.  Users may want to see dates displayed as:


The stored date is none of these, but is rather: YYYY-MM-DD.
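
A minimal sketch of that separation in Python: the stored value stays in ISO 8601 form, and a display pattern is picked from the user's preference at output time. The `DISPLAY_PATTERNS` table, its keys and its patterns are made up for illustration and are not taken from any locale database.

```python
from datetime import date

# Hypothetical user-preference table; the patterns are illustrative only.
DISPLAY_PATTERNS = {
    "day-first": "{d:02d}/{m:02d}/{y:04d}",
    "month-first": "{m:02d}/{d:02d}/{y:04d}",
    "dotted": "{d}.{m}.{y:04d}",
}


def display_date(stored: str, preference: str) -> str:
    """Parse a stored YYYY-MM-DD date and render it per a user preference."""
    parsed = date.fromisoformat(stored)
    pattern = DISPLAY_PATTERNS[preference]
    return pattern.format(d=parsed.day, m=parsed.month, y=parsed.year)


stored = "1997-10-17"  # canonical form, independent of any locale
shown = [display_date(stored, p) for p in DISPLAY_PATTERNS]
```

The same stored string yields a different rendering for each preference, and changing a user's preference never touches the stored data.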

  Misha Wolf            Email:     85 Fleet Street
  Standards Manager     Voice: +44 171 542 6722           London EC4P 4AJ
  Reuters Limited       Fax  : +44 171 542 8314           UK
12th International Unicode Conference, 8-9 Apr 1998, Tokyo
   7th World Wide Web Conference, 14-18 Apr 1998, Brisbane

Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
Reuters Ltd.

Received on Friday, 17 October 1997 14:55:36 UTC