Re: locales

Because my products don't generally work that way.  Data is locale-formatted
strictly for presentation to the user, and not for sharing with other program
processes.  I wouldn't expect cross-platform understanding of locale formats
any more than I would expect it of other proprietary formats, like Microsoft
Word, for example.  If I want to make sure a Word file is readable
cross-platform, I save it into HTML, or plain text. (Please don't point out the
quirks with Word's format, I'm well aware, but that's not the point here.)

Think of it as analogous to using Unicode internally vs. native charsets at the
point of presentation.  What we need is a common internal representation for
data destined for locale-based formatting at the presentation layer, with
"converters" or rather interpreters available from the presentation software.
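
A minimal sketch of that separation, in Python with only the standard
library. The interchange form used here (ISO 8601 in UTC) is assumed purely
for illustration; nothing in this thread settles on one. The per-platform
patterns are hypothetical, borrowed from the illustrative ko_KR dates quoted
below, and the names PLATFORM_PATTERNS, to_interchange, from_interchange and
present are invented for this example.

```python
from datetime import datetime, timezone

# Hypothetical ko_KR date patterns per platform. These mirror the illustrative
# dates quoted in this thread and are not verified platform data.
PLATFORM_PATTERNS = {
    "windows2000": "%y.%m.%d",   # 01.11.08
    "solaris8": "%d.%m.%y",      # 08.11.01
    "mac": "%Y-%m-%d",           # 2001-11-08
}


def to_interchange(moment: datetime) -> str:
    """Serialize to a locale-neutral form (ISO 8601, UTC) for sharing."""
    return moment.astimezone(timezone.utc).isoformat()


def from_interchange(text: str) -> datetime:
    """Parse the locale-neutral form back into a datetime."""
    return datetime.fromisoformat(text)


def present(moment: datetime, platform: str) -> str:
    """Apply locale formatting only at the presentation layer."""
    return moment.strftime(PLATFORM_PATTERNS[platform])


stamp = datetime(2001, 11, 8, 15, 55, 31, tzinfo=timezone.utc)
wire = to_interchange(stamp)          # what gets stored or sent
restored = from_interchange(wire)     # parsed the same way on every platform
for name in PLATFORM_PATTERNS:
    print(name, present(restored, name))
```

Only the string produced by to_interchange crosses process or platform
boundaries; the differing display strings never have to be parsed back.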

Andrea

Tex Texin wrote:
> 
> Andrea,
> I guess I don't understand why you don't care, given the examples you
> gave.
> 
> If you send me a file and tell me you created it with locale ko_KR (or
> it is saved with that information) and then I use the same application
> on another platform to read the file, and I parse the date incorrectly,
> that's a problem.
> Users don't switch platforms but the data does...
> 
> You could argue that the data should be sent in a locale-independent
> format, or that locale shouldn't be used to describe date formats in
> documents, just user preferences, but that is why I think we should
> discuss the scope of locale...
> 
> tex
> 
> "A. Vine" wrote:
> >
> > I'm with Thierry on this one, and I'd like to add the following:
> >
> > One of my coworkers has been involved in the attempt to standardize locale
> > formats via the national standards bodies (see
> > http://anubis.dkuug.dk/cultreg/registrations/chreg.htm for some results).  This
> > didn't get very far very fast, because *within each individual national
> > standards committee the members could not agree on one definition!*
> >
> > This is why the platforms essentially had to come up with the information via
> > their own research, and why they don't agree with each other.
> >
> > Personally, I would be happy if the locale ids were standardized across
> > platforms, and if they covered the same categories.  I don't care whether the
> > actual formats for a particular locale change from platform to platform, so long
> > as when I provide the id "ko_KR", I know that I will get date formats, time
> > formats, numeric formats, a default currency if necessary, etc. etc.  I don't
> > care if Windows 2000 formats the ko_KR date as 01.11.08 while the Solaris 8
> > format is 08.11.01 and the Mac date is 2001-11-08.  Users don't switch
> > platforms, and they become accustomed to the defaults on their platform.  If
> > they're unhappy, they complain to the platform folks, and those poor
> > unfortunates have to manage this problem.  As an application developer, I just
> > want to pass ko_KR and get date formats, and only have to provide a date
> > conforming to a particular format (using milliseconds from a certain point or
> > UTC with a particular order and syntax or whatever).
> >
> > The problems arise when:
> >
> > 1.  The locale id is not understood.
> > 2.  The particular formatting or info is not available as part of the standard
> > locale definition on some platforms, but is there on others.
> > 3.  The formatting behaves differently, e.g. currency is automatically tacked
> > on, rather than providing an option (but then this is probably incorrect
> > behavior, and so is really a bug.)  There may be some better examples.
> > 4.  Applying the locale-specific formatting/data requires different input
> > formats from platform to platform, so that for dates, Solaris requires
> > milliseconds from a given point in time whereas Windows requires a UTC date/time
> > stamp in the format YYYYMMDDhhmmss.sss or some such. (I'm not saying they do,
> > this is just a blue-sky example, please don't send me corrections.)
> >
> > Those are the problems I'd like to solve.  I wouldn't mind seeing phone numbers
> > and address formats added to the platform locale info, either.  But it's not
> > dire.  We simply add a user preference field with a few choices for those
> > elements not in the locale definition.
> >
> > However, I'm with Mark Davis in that I'd like to see a standard for passing user
> > preferences, probably in XML format.
> >
> > Andrea
> >
> > Thierry Sourbier wrote:
> > >
> > > While I fully understand the limitation of locales as they are currently
> > > defined, I'm very doubtful that the situation can be improved in a near
> > > future, given that:
> > >
> > > 1. It is hardly possible to define *scientifically* what a locale is. Even
> > > the candidates for the *base* have shaky definitions (e.g. language,
> > > region -why country?-, time zone, ...).
> > >
> > > if we pass this hurdle:
> > >
> > > 2. It is hardly possible to decide what is a *valid* locale (This is where
> > > David started). Shall we base it on the number of people it targets? In that
> > > case for example a locale such as es_US (22 million people) should be *more
> > > valid* than fr_CA (7 million people). How can we prevent the lurking
> > > combinatorial explosion? Some quick maths show that technically there are more
> > > locale candidates than character candidates for Unicode (dooh!).
> > >
> > > if we pass this hurdle:
> > >
> > > 3. It will be impossible for each application to support ALL valid locales.
> > > How, then, should the fallback mechanisms work? Say that the es_US locale is
> > > not present in my system, shall I default to Spain Spanish or US English? I
> > > guess you will say a bit of both... (side question then, how to prevent Mr
> > > QA guy from going postal?)
> > >
> > > if we pass this hurdle:
> > >
> > > 4. As Tex pointed out it is not even obvious what locales are to be used
> > > for. Some candidates include selecting the content to display, formatting
> > > rules, collation rules, time zone, calendar, address format, units of
> > > measure, currency (shall we limit to one?), but I'm sure we can find many
> > > more (e.g. basic privacy rules, sales tax information, ...).
> > >
> > > and last but not least:
> > >
> > > 5. It won't be an easy thing to make it simple to use, so that people will
> > > at least be tempted to look at it. How do we make it a standard so our locales
> > > will be portable to all platforms? Shall a "Unilocale consortium" be created :).
> > >
> > > The point of these questions is certainly not to get answers, but to show
> > > that without a given application framework it is impossible to get closure
> > > on this topic. Sorry if this is bad news for some but I don't really see how
> > > custom coding could be avoided in the foreseeable future for applications for
> > > which the current locales are not enough (this is what I believe triggered
> > > this entire discussion).
> > >
> > > Don't get me wrong, I'm all for a better world but, to echo Martin Duerst's
> > > comment, rather than criticizing current models why not present ideas on how
> > > they could be improved? For those who have implemented their own solutions,
> > > why not make them into an open source project (Universal Locale Components?)
> > > to try to get it to become a de-facto standard like tz? - I'll be the first
> > > to advertise it-.
> > >
> > > My 2 Euro cents,
> > >
> > > Thierry.
> > > (who moved back to France to see the Euro mess first hand :).
> > >
> > > <><><><><><><><><><><><><><><><><><><><><><>
> > > www.i18ngurus.com - Open Internationalization Resources Directory
> > >
> > > ----- Original Message -----
> > > From: "Tex Texin" <texin@progress.com>
> > > To: "Carl W. Brown" <cbrown@xnetinc.com>
> > > Cc: <www-international@w3.org>
> > > Sent: Thursday, November 08, 2001 1:07 AM
> > > Subject: Re: locales
> > >
> > > > Thanks Carl.
> > > >
> > > > I take this to mean that you are proposing that the language, country,
> > > > character set, time zone, and variant, represent 5 orthogonal attributes
> > > > which uniquely describe a "locale" and which are sufficient to describe
> > > > a user.
> > > >
> > > > I think I would like "variant" to go away, or at least not be required
> > > > to meet most needs.
> > > > I know it is used for Euro, I am not sure what other general purpose
> > > > usages it has.
> > > >
> > > > I wonder if we should add currency to your list of orthogonal values.
> > > >
> > > > Also, I note that language, country, and time zone are not sufficient to
> > > > determine which calendar is being used.
> > > > Perhaps timezone should be replaced with something representing
> > > > calendar+date+time formats and timezone?
> > > >
> > > > I am not sure what to say about possibly "invalid" combinations such as
> > > > euro currency and ISO 8859-1 character set (since it doesn't have the
> > > > euro symbol)...
> > > >
> > > > Perhaps this leads us to defining locale as a collection of names for
> > > > formats associated with basic datatypes-
> > > > (text, calendar, currency...)
> > > >
> > > > It then becomes more precise, but less useful as an easy to use
> > > > nomenclature...
> > > >
> > > > tex
> > > >
> > > > "Carl W. Brown" wrote:
> > > > >
> > > > > Tex,
> > > > >
> > > > > In xIUA I use the following format:
> > > > >
> > > > >      Format: (no spaces)
> > > > >      ll[_CC ][.MM ][@VV][#TT]
> > > > >
> > > > >      ll = lang, CC = ctry, MM = charmap, VV = Variant, TT = Time Zone
> > > > >
> > > > > For example:
> > > > >
> > > > > en_US.iso-8859-1#America/Los_Angeles
> > > > >
> > > > > or
> > > > >
> > > > > fr_FR.iso-8859-15@EURO#Europe/Paris
> > > > >
> > > > > It works well with ICU.  The conversion both ways is very simple and
> > > > > straightforward.
> > > > >
> > > > > Carl
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Tex Texin [mailto:texin@progress.com]
> > > > > > Sent: Wednesday, November 07, 2001 11:54 AM
> > > > > > To: David_Possin@i2.com
> > > > > > Cc: cbrown@xnetinc.com; www-international@w3.org;
> > > > > > www-international-request@w3.org
> > > > > > Subject: locales
> > > > > >
> > > > > >
> > > > > > David,
> > > > > >
> > > > > > If you would set up an archived forum, that would be great. It will save
> > > > > > me trying to identify which messages are relevant and saving them all on
> > > > > > my drive.
> > > > > >
> > > > > > Mentioning time zones will, I am sure, ensure a blast from Carl. (;-) I
> > > > > > look forward to it.)
> > > > > > One point is that a locale may include more than one zone (e.g. the US
> > > > > > goes from EST through CST to PST), so it is ambiguous, and we may go down
> > > > > > the trail of daylight saving time changes varying within a locale.
> > > > > >
> > > > > > A key question for me is which of the many variables for
> > > > > > internationalization belong in a locale and which belong in some other
> > > > > > structure?
> > > > > >
> > > > > > Maybe time and calendar should not be a function of locale...
> > > > > > Maybe currency should not be.
> > > > > >
> > > > > > Which variables are best associated with the locale, which with the
> > > > > > data, and which with the application?
> > > > > > For example, since I develop database products, and I cannot have
> > > > > > indexes changing on me, I always include the rules for sorting in the
> > > > > > database, with the data.
> > > > > >
> > > > > > I don't generally worry about hyphenation, I would probably keep rules
> > > > > > for that with the application (the choice being influenced but not
> > > > > > defined by locale).
> > > > > >
> > > > > > tex
> > > > > >
> > > > > >
> > > > > >
> > > > > > David_Possin@i2.com wrote:
> > > > > > >
> > > > > > > I would propose to open a discussion forum for locales in the
> > > > > > > yahoo.groups like many other globalization people have done for other
> > > > > > > issues. It will be tough keeping up to date with all the threads
> > > > > > > starting to pop up, and all are extremely important to me and my job.
> > > > > > > Here are the issues I have been trying to monitor and even reply to,
> > > > > > > adding my 2 cents:
> > > > > > >
> > > > > > >   1. Locale definition - what is a locale?
> > > > > > >   2. Locale identification - how many parameters are needed for a
> > > > > > >      default minimal locale description?
> > > > > > >   3. Language identification - how can we identify languages that are
> > > > > > >      not included in the ISO 639 language group standard? (Current
> > > > > > >      locale identifiers use the 2-letter code, not the 3-letter code)
> > > > > > >   4. Time zones - There is no standard, the tz database is as close as
> > > > > > >      I can get to a standard and it is not officially tied to a
> > > > > > >      locale. This only touches the need for a standard global time &
> > > > > > >      date display.
> > > > > > >   5. Currencies - Locales have only one currency tied to them, and
> > > > > > >      European locales still all have their national currencies
> > > > > > >      implied.
> > > > > > >   6. Euro - The big problem is not the display, but how to use it. The
> > > > > > >      EC has strict requirements on how to do currency triangulation
> > > > > > >      with the euro. We discovered that rounding problems popped up
> > > > > > >      everywhere, especially when we used euro precision for calculation
> > > > > > >      and had to display the value in a currency without decimals. It
> > > > > > >      would be a dream to have this in ICU.
> > > > > > >   7. Even when the euro becomes standard for a country, older
> > > > > > >      transactions will still have to be working with old currencies
> > > > > > >      and/or triangulation. We can't just convert them.
> > > > > > >
> > > > > > >      This only lists what has been mentioned in the last few days,
> > > > > > >      there is much more to be mentioned. I am trying to make PMs,
> > > > > > >      Devs, QA, etc globally aware here, but it is very hard to get
> > > > > > >      official requirements written up when there are no standards I
> > > > > > >      can show as reference.
> > > > > > >
> > > > > > >      And my biggest proposal is to break the tie between language and
> > > > > > >      country when selecting a locale.
> > > > > > >
> > > > > > >      Dave
> > > > > > >
> > > > > > >       "Tex Texin" <texin@progress.com>
> > > > > > >       Sent by: www-international-request@w3.org
> > > > > > >       11/07/01 12:15 PM
> > > > > > >       To: "Carl W. Brown" <cbrown@xnetinc.com>
> > > > > > >       cc: www-international@w3.org
> > > > > > >       Subject: Re: Euro mess (Was: valid locales ---> was bilingual websites
> > > > > > >
> > > > > > >      Carl,
> > > > > > >
> > > > > > >      I hope the locales issue doesn't fan out into thousands of other
> > > > > > >      threads, I won't be able to track them.
> > > > > > >
> > > > > > >      With respect to the Euro, there are several different issues.
> > > > > > >
> > > > > > >      a) Of course the Euro is important and having proper support for
> > > > > > >      the Euro is required.
> > > > > > >
> > > > > > >      b) ISO 8859-15 does not seem to be getting much adoption, which
> > > > > > >      is a good thing. Since 8859-15 and 8859-1 are incompatible, and if you
> > > > > > >      adopt 8859-15 you likely still need to interchange text with users of
> > > > > > >      8859-1 (as they both support the same languages more or less), the
> > > > > > >      world would be a very difficult place if there was a lot of adoption
> > > > > > >      of -15.
> > > > > > >
> > > > > > >      Anyone considering -15, should instead be considering Unicode.
> > > > > > >
> > > > > > >      And there are other alternatives if the only requirement is to
> > > > > > >      support the Euro character and continue with a single byte codepage.
> > > > > > >      Spelling out "Eur" or "Euro" is acceptable if there is space. And
> > > > > > >      inventing mechanisms (e.g. escape sequences, or other specialized
> > > > > > >      encodings) to print the Euro symbol is also possible.
> > > > > > >
> > > > > > >      c) The issue relative to locales is that there is no standard
> > > > > > >      handling for the Euro. So my understanding is some software will
> > > > > > >      change the currency of their European locales from native monetary
> > > > > > >      units to Euro on Jan. 1. This may be useful for some, but will likely
> > > > > > >      break many applications as well.
> > > > > > >
> > > > > > >      Others will create new locales specific to the Euro and/or
> > > > > > >      specific to the old native currency. But which nomenclature you use
> > > > > > >      when you are integrating software with different technologies and
> > > > > > >      different locale naming conventions is a mystery to me.
> > > > > > >
> > > > > > >      So now if I say fr_fr I do not know which currency I get and it
> > > > > > >      may change from Dec 31 2001 to Jan 1 2002.
> > > > > > >      If I use an application that integrates technologies with
> > > > > > >      different rules for locales, it could get very messy.
> > > > > > >
> > > > > > >      I presume monetary data created before 2002 may also be
> > > > > > >      interpreted differently after 2002.
> > > > > > >
> > > > > > >      And minor upgrades of software may in fact invoke these locale
> > > > > > >      changes, so what should be a minor patch may in fact be a large
> > > > > > >      change to monetary handling.
> > > > > > >
> > > > > > >      d) I don't know why there isn't more of an outcry over this.
> > > > > > >      Maybe there is a reason the problems I cite in (c) won't happen that
> > > > > > >      I don't understand. (I am by no means an expert on the subject. Most
> > > > > > >      of my own software has explicit regional settings and doesn't follow
> > > > > > >      the locale model.) It will be interesting to know what people find if
> > > > > > >      they change their system clock to 2002 and do some application testing.
> > > > > > >
> > > > > > >      hth
> > > > > > >      tex
> > > > > > >
> > > > > > >      "Carl W. Brown" wrote:
> > > > > > >      >
> > > > > > >      > Tex,
> > > > > > >      >
> > > > > > >      > I wonder why no one seems to care about the Euro?  Are sites
> > > > > > >      > going to continue to use iso-8859-1?  How many browsers and systems
> > > > > > >      > support iso-8859-15?
> > > > > > >      >
> > > > > > >      > Carl
> > > > > > >      >
> > > > > > >      > > -----Original Message-----
> > > > > > >      > > From: www-international-request@w3.org
> > > > > > >      > > [mailto:www-international-request@w3.org]On Behalf Of Tex Texin
> > > > > > >      > > Sent: Tuesday, November 06, 2001 7:42 PM
> > > > > > >      > > To: Martin Duerst
> > > > > > >      > > Cc: David_Possin@i2.com; Karl Ove Hufthammer; www-international@w3.org
> > > > > > >      > > Subject: Re: valid locales ---> was Re: bilingual websites
> > > > > > >      > >
> > > > > > >      > >
> > > > > > >      > > Martin,
> > > > > > >      > >
> > > > > > >      > > You mean I can't just grouse and take potshots from the sidelines? ;-)
> > > > > > >      > >
> > > > > > >      > > Well, I have not seen an alternative proposed and I don't have one at
> > > > > > >      > > the ready, but I don't mind taking a shot at improving the current
> > > > > > >      > > situation. However, I am crunching now thru the end of the year, so I
> > > > > > >      > > will give it a go in the new year.
> > > > > > >      > > In the meantime, I would be happy to collect both suggestions for
> > > > > > >      > > requirements and suggestions for solutions on this list or privately.
> > > > > > >      > >
> > > > > > >      > > The new year should be interesting, as the switch to the new Euro
> > > > > > >      > > currency will demonstrate some of the chaos with locales.
> > > > > > >      > >
> > > > > > >      > > tex
> > > > > > >      > >
> > > > > > >      > > Martin Duerst wrote:
> > > > > > >      > > >
> > > > > > >      > > > Tex - Could you write up (short), or point to, any proposal
> > > > > > >      > > > for how to do better than currently?
> > > > > > >      > > >
> > > > > > >      > > > Regards,  Martin.
> > > > > > >      > > >
> > > > > > >      > > > At 14:57 01/10/31 -0500, Tex Texin wrote:
> > > > > > >      > > > >David,
> > > > > > >      > > > >
> > > > > > >      > > > >FWIW, I thoroughly agree that locales as we currently define and
> > > > > > >      > > > >implement them, do not work.
> > > > > > >      > > > >As a naming convention it is inadequate, and when you select a name, you
> > > > > > >      > > > >are not sure what behavior you will get.
> > > > > > >      > > > >
> > > > > > >      > > > >I have mentioned this before, and the response is always "Yes, it's
> > > > > > >      > > > >broken, but it is the best we have at the moment.".
> > > > > > >      > > > >
> > > > > > >      > > > >It is rather unfortunate that we have this methodology therefore, and
> > > > > > >      > > > >that it is accepted, since it won't be fixed as long as this response
> > > > > > >      > > > >continues.
> > > > > > >      > > > >
> > > > > > >      > > > >tex
> > > > > > >      > > > >
> > > > > > >      > > > >--
> > > > > > >      > > > >-------------------------------------------------------------
> > > > > > >      > > > >Tex Texin                    Director, International Business
> > > > > > >      > > > >mailto:Texin@Progress.com    Tel: +1-781-280-4271
> > > > > > >      > > > >the Progress Company         Fax: +1-781-280-4655
> > > > > > >      > > > >-------------------------------------------------------------
> > > > > > >      > >
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > -------------------------------------------------------------
> > > > > > Tex Texin                    Director, International Business
> > > > > > mailto:Texin@Progress.com    Tel: +1-781-280-4271
> > > > > > the Progress Company         Fax: +1-781-280-4655
> > > > > > -------------------------------------------------------------
> > > > > > "When choosing between two evils, I always like to try the
> > > > > > one I've never tried before."- -Mae West
> > > >
> > > >
> > > >
> 

Received on Thursday, 8 November 2001 15:55:31 UTC