- From: Arko, Phil <phil.arko@scr.siemens.com>
- Date: Wed, 11 Jun 2003 19:47:21 -0400
- To: "'public-i18n-geo@w3.org'" <public-i18n-geo@w3.org>
Below is the revised Q&A. I have taken out references to codes and markup languages in the main sections of this Q&A. Because this is meant to act somewhat like an introduction to our area, I felt that it was important to include some mention of these in order to provide the reader with suggested next steps (those being to learn a little more about each of the standards mentioned). I discussed them briefly under "Further information."

Thanks,
Phil

----------------------------------------------------------------------------

Questions & Answers: Initial considerations for international web sites

Question

What are some topics to consider when creating web sites for an international audience?

Background

People from around the world can view your content on the web. Because much of what we find on the web is written with a specific audience in mind, people outside that audience often misunderstand what was intended. The formatting and presentation of text have specific regional and cultural requirements that need to be addressed if the content is to be properly understood.

Answer

A typical challenge is to ensure that characters display correctly for the end user. Web pages can easily accommodate English and other Germanic and Romance languages, but what happens when an occasional word or name from another language is used? In the past, a quick solution was to use an inline graphic to display the character. Another method was to copy and paste the desired character from another program into the web page. While the result might look correct for one user, there is no guarantee that every user will see the same text. The outcome depends on variables such as the font, operating system, and browser software. These concerns are becoming increasingly important as users move toward mobile and other non-standard browsing devices.
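To illustrate why a pasted character may not look the same for every user, here is a minimal Python sketch. The sample word and the two encodings are assumptions chosen for illustration; they are not taken from any particular site. It shows the same bytes being read under two different encoding assumptions, and one way to write the character so that it does not depend on the page encoding.

```python
# Illustrative sketch only: the sample word and encodings are assumptions.
word = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# The bytes that are stored when the page is saved as UTF-8.
utf8_bytes = word.encode("utf-8")

# A browser that correctly assumes UTF-8 recovers the original word.
decoded_correctly = utf8_bytes.decode("utf-8")

# A browser that wrongly assumes Latin-1 turns the single accented
# letter into two unrelated characters (U+00C3 followed by U+00A9).
decoded_wrongly = utf8_bytes.decode("latin-1")

# A numeric character reference uses only ASCII, so it is unaffected
# by which ASCII-compatible encoding the browser assumes.
ascii_safe = word.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(len(decoded_correctly), len(decoded_wrongly))  # 4 5
print(ascii_safe)                                    # caf&#233;
```

The point of the sketch is only that the author and the browser must agree on how the characters are encoded; it is not a recommendation of one technique over another.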
Because many languages read from right to left, including such content is an even greater challenge. In addition to identifying the proper characters, there also needs to be a method of handling the direction of the text.

Some cultures use a comma as a thousands separator and a period as a decimal point, while other cultures use the period and comma, respectively. For example, 1,547 in Germany and 1.547 in the United States are actually the same number (one and 547 thousandths), whereas 1,547 in the United States means one thousand five hundred forty-seven. While the only difference in this example is a single character, the difference in meaning is significant.

The presentation of dates and times is a typical source of confusion for the user. When two digits each are used to represent year, month, and day, the actual date might not be obvious. A few examples from different cultures include DD/MM/YY, MM/DD/YY, and YY/MM/DD. A single date in the format "xx/xx/xx" could be interpreted as three different dates.

There are many other concerns that should be addressed when creating an international-friendly site. This is only a sampling.

By the way...

In its simplest definition, "internationalization" refers to creating a site framework that allows content to be presented in a way that is consistent with regional styles and cultural customs. "Localization" refers to the actual implementation of each specific region's content in the international framework. Internationalization is commonly referred to as "i18n" because there are 18 letters between the initial "i" and the final "n." Similarly, localization is commonly referred to as "l10n."

When starting to create an internationalized site, first identify the locales that need to be supported. This will help to define the requirements for the international framework. It is highly recommended to work with native speakers who are very familiar with the regions and cultures that are part of your user demographic.
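The number and date ambiguities described above can be sketched in a few lines of Python. The helper names format_number and possible_dates are hypothetical and exist only for this illustration; a real site would more likely rely on a locale library than on hand-written separators.

```python
from datetime import datetime


def format_number(value, group_sep, decimal_sep, places=3):
    """Format value with the given separators (hypothetical helper)."""
    us_style = f"{value:,.{places}f}"  # comma groups, period decimal
    # Swap both separators in one pass so they cannot overwrite each other.
    swap = str.maketrans({",": group_sep, ".": decimal_sep})
    return us_style.translate(swap)


def possible_dates(text):
    """Return every reading of an xx/xx/xx string that is a valid date."""
    patterns = (
        ("DD/MM/YY", "%d/%m/%y"),
        ("MM/DD/YY", "%m/%d/%y"),
        ("YY/MM/DD", "%y/%m/%d"),
    )
    readings = {}
    for label, pattern in patterns:
        try:
            readings[label] = datetime.strptime(text, pattern).date().isoformat()
        except ValueError:
            pass  # this reading is not a valid calendar date
    return readings


# The same value written with US and with German separators.
print(format_number(1.547, ",", "."))           # 1.547
print(format_number(1.547, ".", ","))           # 1,547
# A different value that looks identical to the German form above.
print(format_number(1547, ",", ".", places=0))  # 1,547

# One string, three valid dates.
print(possible_dates("03/04/05"))
```

For "03/04/05" the three readings are 2005-04-03, 2005-03-04, and 2003-04-05, which is why an unambiguous format or an explicit indication of the local convention helps the reader.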
Most importantly, the end user must understand that a page has been localized. It is good practice to indicate or imply that the content has been formatted according to the user's local conventions. This avoids questions and possible misinterpretations.

Further information

This Q&A provides only a few introductory points on this topic. There are many books devoted to the topics of internationalization and localization. Becoming familiar with the styles and customs of other regions and properly implementing these elements in a web site will help ensure that content is available to, and truly understandable by, a larger audience.

Some of the standards typically used to create internationalized web sites include the following:

- XML [ www.w3.org/XML ] is the preferred markup language for defining content. In addition to identifying the actual content, it can also include attributes that further define aspects of the content (such as language, grammar style, and current format of the content). Other web languages (such as XHTML) use these attributes to deliver the localized page appropriate for the current user.

- XHTML [ www.w3.org/MarkUp ] is the successor to HTML, and is a markup language used to define web pages and to properly format and display XML content within them.

- Unicode [ www.unicode.org ] is a numbered collection of characters that aims to cover all of the written languages of the world. Using this standard identifies the intended character unambiguously, regardless of the browser or system, although displaying it still requires a suitable font.

Properly utilizing these standards in a web site can help ensure that the concerns mentioned above are properly handled.
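As a small illustration of Unicode as a numbered collection of characters, the Python sketch below looks up the number, name, and text direction class of three sample characters using the standard unicodedata module. The describe function and the choice of characters are assumptions made for this example.

```python
import unicodedata


def describe(char):
    """Return (code point, Unicode name, direction class) for one character."""
    code_point = f"U+{ord(char):04X}"
    name = unicodedata.name(char, "<unnamed>")
    # "L" means left-to-right, "R" means right-to-left.
    direction = unicodedata.bidirectional(char)
    return code_point, name, direction


# A Latin letter, an accented Latin letter, and a Hebrew letter.
for ch in ("A", "\u00e9", "\u05d0"):
    print(describe(ch))

# ('U+0041', 'LATIN CAPITAL LETTER A', 'L')
# ('U+00E9', 'LATIN SMALL LETTER E WITH ACUTE', 'L')
# ('U+05D0', 'HEBREW LETTER ALEF', 'R')
```

The direction class is part of what allows software to lay out right-to-left text correctly once the characters themselves have been identified.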
Received on Wednesday, 11 June 2003 19:47:34 UTC