RE: International standards for various forms of profile data

Carl,

The categorizing is the toughest part, but luckily for now I can focus on
ecommerce, where we have a lot of p13n (personalization) issues to deal
with, all of which can be classified one way or another. Even though you
are correct that German girls (das Mädchen) and unmarried women (das
Fräulein) are grammatically neuter, we can still classify them as female
in ecommerce profiles. I don't even want to get into any historical
debates on language and culture evolution here. But there are actually
reasons to include neuter, and even further values, for gender when you
think about it. Let me use two extreme examples of actual growing
ecommerce markets.

The alternative lifestyle ecommerce market is growing fast worldwide for
many reasons. The internet is discreet and can cater to
countries/states/communities where no retail stores are available, either
because they are illegal, undesired, or the market is too small. Here we
can extend the definition of gender, as I have discovered researching
ecommerce applications: neuter (n), transvestite (tv), transsexual (ts),
homosexual (h or hf and hm), bisexual (bi), hermaphrodite (h), etc. These
"genders" are used to identify product categories or group profiles for
marketing purposes and for personals. I have seen these customer
requirements and have seen sites hosted this way.

Religious and cultural markets are growing fast as well. In some extreme
cases there might only be valid entries for male or for female and no
choice, except maybe neuter. I recently stumbled across this issue and am
still researching it. In those cases only the dominant gender has the
right to express its gender; the other has no gender, or no right to own
one.

These two examples are extreme and might seem far-fetched, at least by our 
Western standards, but we mustn't ignore the fact that each category has 
globalization and/or personalization quirks. While training others here I 
often use examples including nonhuman aspects, using animals and aliens as 
examples for globalization. (What a nightmare developing an ecommerce site 
for Klingons would be! Well, at least we have the Unicode Klingon 
character set ready for that task ....)

The protection of personal information on the net is another area where
the neutral gender is useful, just as many people avoid listing their
first names in telephone books so that their gender cannot be inferred
from the name. Plus there may be groups that would not allow that
information to be entered at all, or that should avoid entering it. Think
about children and the online dangers. And it cures a common software
developer problem: Which gender should be used as default? What is
politically correct and what is not? US sites normally have male as
default, but I noticed that many similar German sites have female as
default. It would be interesting to define which gender should be default
for any given locale, at least to be politically correct. I might include
that information in my results.
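
A minimal sketch of the default-value idea, assuming a neutral
"unspecified" entry is what a form falls back to when nothing has been
defined for a locale. The names Gender, LOCALE_DEFAULTS and default_gender
are hypothetical, and the per-locale table is left empty on purpose
because those defaults have not been researched yet.

```python
from enum import Enum
from typing import Dict


class Gender(Enum):
    """Profile values; the one-letter codes are illustrative assumptions."""
    UNSPECIFIED = "n"  # neutral entry, reveals nothing about the person
    MALE = "m"
    FEMALE = "f"


# Hypothetical per-locale defaults. Empty until real data has been collected.
LOCALE_DEFAULTS: Dict[str, Gender] = {}


def default_gender(locale: str) -> Gender:
    """Return the preselected value for a locale, falling back to neutral."""
    return LOCALE_DEFAULTS.get(locale, Gender.UNSPECIFIED)
```

With the neutral fallback, a site never has to preselect male or female
unless someone has explicitly decided that for the locale in question.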

Our solution for addresses is to implement the option to enter each
address twice: once in international post format using ASCII, and once in
the locale-specific address format and character set. Locale-specific
address formats are defined via regular expressions. The country name is
always in the sender's language or in both languages, the address layout
is in the recipient's format, and mail sent within the same locale can
optionally use the locale-specific character set. Not all details have
been ironed out yet, but we are getting close. We maintain all country
names in all languages ourselves because, as we recently found out, the
country names returned by Java are not complete in all languages for all
locales.
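
The address scheme described above could be sketched roughly as follows.
This is an illustration under assumptions, not the actual implementation:
the regular expressions are deliberately simplified, the country-name
table holds only a few entries, and every name here (Address,
LOCALITY_PATTERNS, COUNTRY_NAMES, locality_is_valid, render) is
hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# Simplified, hypothetical patterns for the postal-code/city line only.
LOCALITY_PATTERNS = {
    "en_US": re.compile(
        r"^(?P<city>[A-Za-z .'-]+), (?P<state>[A-Z]{2}) (?P<zip>\d{5}(?:-\d{4})?)$"
    ),
    "de_DE": re.compile(r"^(?P<zip>\d{5}) (?P<city>.+)$"),
}

# Self-maintained country names: COUNTRY_NAMES[country_code][language].
COUNTRY_NAMES = {
    "US": {"en": "United States", "de": "Vereinigte Staaten", "es": "Estados Unidos"},
    "DE": {"en": "Germany", "de": "Deutschland", "es": "Alemania"},
}


@dataclass
class Address:
    country: str            # ISO 3166-1 alpha-2 code of the destination
    locale: str             # locale whose layout the lines follow
    ascii_lines: List[str]  # international post format, ASCII only
    local_lines: List[str]  # locale-specific format and character set


def locality_is_valid(addr: Address) -> bool:
    """Check the last local line against the locale's pattern, if one exists."""
    pattern = LOCALITY_PATTERNS.get(addr.locale)
    return pattern is not None and pattern.match(addr.local_lines[-1]) is not None


def render(addr: Address, sender_country: str, sender_language: str) -> str:
    """Recipient's layout; country name in the sender's language when abroad."""
    if sender_country == addr.country:
        # Same country: the locale-specific character set may be used.
        return "\n".join(addr.local_lines)
    country = COUNTRY_NAMES[addr.country][sender_language].upper()
    return "\n".join(addr.ascii_lines + [country])
```

For a fictitious German address stored as "Hans Mueller / Hauptstrasse 1 /
10115 Berlin" (ASCII) and "Hans Müller / Hauptstraße 1 / 10115 Berlin"
(local), a sender in the US writing in English would get the ASCII lines
followed by GERMANY, while a sender in Germany would get the local lines
with no country line.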

David 




"Carl W. Brown" <cbrown@xnetinc.com>
08/30/01 02:59 PM

 
        To:     <David_Possin@i2.com>, "Kremena Gotcheva" <infom@bcci.bg>
        cc:     <www-international@w3.org>
        Subject:        RE: International standards for various forms of profile data


David,
 
Part of the problem is categorizing the data you need.  For example, you
categorize gender for English and German.  To be complete you might want
to consider that girls in German are neuter (I don't know of other
languages that do this).
 
Tokenizing, say, US, Russian and Japanese addresses is not a simple
task.  Besides, a letter from, say, Venezuela to me would be addressed:
 
Carl W. Brown
X.Net, Inc.
3452 Shangri-La Rd.
Lafayette, CA 94549
EE.UU.A
 
International mail requires hybrid addresses.  Not an easy task.
 
Carl
 
 
-----Original Message-----
From: www-international-request@w3.org 
[mailto:www-international-request@w3.org]On Behalf Of David_Possin@i2.com
Sent: Thursday, August 30, 2001 12:32 PM
To: Kremena Gotcheva
Cc: www-international@w3.org
Subject: Re: International standards for various forms of profile data


I will be doing what I can for en_US and de_DE, these being my two native
locales. Currently I am just collecting whatever data I can find in the
areas I listed below. I plan to design a definitive requirement for our
March SW release by mid-December; that should give me enough time for
thorough research and compilation of data. I do plan to host the results
on a web site when they are finished and to notify the members of this
list when it is available. I might even design it so that members can
update the tables for their locale online.

The tables will look like this: 

table                 lang_table
data_id  data_type    locale  data_std  data_long  data_short  data_description

0001     gender       en_US   male      male       m           male gender
                      de_DE   männl.    männlich   m           männliches Geschlecht
0002     gender       en_US   female    female     f           female gender
                      de_DE   weibl.    weiblich   w           weibliches Geschlecht

Not every field may be required for each instance, but so far I have been
able to store all locale-specific information in this format.
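
As a rough illustration of how the two tables relate, here is a minimal
in-memory sketch. It assumes lang_table rows are keyed by data_id plus
locale; the Python names (LangEntry, DATA_TYPES, LANG_TABLE, lookup,
choices) are hypothetical, and only the column names and sample rows come
from the layout above.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class LangEntry(NamedTuple):
    """One lang_table row: the locale-specific strings for a data_id."""
    data_std: str
    data_long: str
    data_short: str
    data_description: str


# "table": data_id -> data_type
DATA_TYPES: Dict[str, str] = {"0001": "gender", "0002": "gender"}

# "lang_table": (data_id, locale) -> localized strings
LANG_TABLE: Dict[Tuple[str, str], LangEntry] = {
    ("0001", "en_US"): LangEntry("male", "male", "m", "male gender"),
    ("0001", "de_DE"): LangEntry("männl.", "männlich", "m", "männliches Geschlecht"),
    ("0002", "en_US"): LangEntry("female", "female", "f", "female gender"),
    ("0002", "de_DE"): LangEntry("weibl.", "weiblich", "w", "weibliches Geschlecht"),
}


def lookup(data_id: str, locale: str) -> Optional[LangEntry]:
    """Return the localized row, or None if the locale has no entry yet."""
    return LANG_TABLE.get((data_id, locale))


def choices(data_type: str, locale: str) -> List[Tuple[str, str]]:
    """List (data_id, data_long) pairs of one data_type, e.g. to fill a form."""
    return [
        (data_id, LANG_TABLE[(data_id, locale)].data_long)
        for data_id, kind in sorted(DATA_TYPES.items())
        if kind == data_type and (data_id, locale) in LANG_TABLE
    ]
```

Keeping the locale-independent id separate from the localized strings
means a profile only stores the data_id, and the display text is resolved
per locale at render time.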

Thanks for your cooperation, Kremena 

- Dave 




"Kremena Gotcheva" <infom@bcci.bg> 
08/30/01 01:08 PM 
        
        To:        <David_Possin@i2.com> 
        cc:         
        Subject:        Re: International standards for various forms of profile data



OK, I can do my best as far as Bulgarian is concerned, but please share the
results, e.g. by publishing them on this mailing list or in a personal
mail. I am also working on a project that is likely to involve a lot of
i18n.
  
How soon do you need the data? Can you mail me a 'dummy' list of all the
details you have found so far that you need, to give me something to think
about?
  
Much success, 
Kremena Gotcheva 
Bulgarian Chamber of Commerce and Industry
----- Original Message ----- 
From: David_Possin@i2.com 
To: 
Hello everybody, 

I am currently working on the i18n/l10n requirements for creating and
maintaining company and person profile data for our ecommerce framework.
Due to the large diversity of our customers we want to offer standard base
data for international support in all supported languages. This was
originally done by our customers in their respective local offices, but the
magnitude is getting overwhelming and our support staff cannot keep up
with their requests.

I am looking for resources for specific profile data information in as 
many languages/locales/cultures/regions as possible, preferably backed up 
by national standards organizations. Here are the major areas I have 
started to research: 
- Common Honorifics, Salutations, Titles; their placement rules and
  correct usage (regular expressions)
- Gender and Marriage Status specifications (some countries do not allow
  divorces, but may allow the term separated or legally separated)
- Surnames and Additions (like Henry III or the Third)
- Usage of Maiden Names and other genealogical information (I remember
  Spanish heritage listing as a nightmare: uuuu y vvvv y wwww y xxxxx and
  so on)
- Religions and Sub-Groups (like Protestant, Lutheran or Reformed)
- Legal business identifications (like in Germany: AG, KG, GmbH, GmbH &
  Co. KG, etc.)

There are probably dozens more pieces of information needed to be
complete, and probably additional fields that don't even apply to a
Western culture. I would appreciate getting as much info as I can; I will
compile a list and send it out to the group once it looks like I have
everything. I am especially interested in native standards in the
respective languages.
Thanking all in advance, 

David Possin
International QA Engineer (i18n & l10n)
i2 Technologies - Austin 

Received on Thursday, 30 August 2001 17:36:07 UTC