
Re: Q&A: Initial considerations for international web sites

From: Martin Duerst <duerst@w3.org>
Date: Mon, 16 Jun 2003 18:13:02 -0400
Message-Id: <4.2.0.58.J.20030616180922.04407108@localhost>
To: "Arko, Phil" <phil.arko@scr.siemens.com>, "'public-i18n-geo@w3.org'" <public-i18n-geo@w3.org>

I think this is a very good start.

But first I think we have to consider what "international web site"
means. It could be:

- Site in one language, but for an international audience
- Multilingual site (and there are various ways a site can be
   multilingual)
- Site in a language other than English (?)
- A 'secondary' site in that its content is translated/adapted from
   another one
- One of multiple sites with coordinated content in different languages

The issues and considerations, and the answers, are different
for different cases.

Regards,    Martin.

At 19:47 03/06/11 -0400, Arko, Phil wrote:

>Below is the revised Q&A
>
>I have taken out references to codes and markup languages in the main
>sections of this Q&A. Because this is meant to act somewhat like an
>introduction to our area, I felt that it was important to include some
>mention of these in order to provide the reader with suggested next steps
>(those being to learn a little more about each of the standards mentioned).
>I discussed them briefly under "Further information."
>
>Thanks,
>Phil
>
>
>-----------------------------------------------------------------------------
>Questions & Answers:  Initial considerations for international web sites
>
>
>Question
>
>What are some topics to consider when creating websites for an international
>audience?
>
>
>Background
>
>People from around the world can view your content on websites. Because much
>of what we find on the web is written with a specific demographic in mind,
>it is often the case that people outside of that demographic misunderstand
>what has actually been intended. The formatting and presentation of text has
>very specific regional and cultural requirements that need to be addressed
>if the content is to be properly understood.
>
>
>Answer
>
>A typical challenge is to ensure that characters display correctly for the
>end user. Web pages can easily accommodate English and other Germanic and
>Romance languages, but what happens when an occasional foreign word or name
>is used?
>In the past, a quick solution was to use an inline graphic to display the
>character. Another method was to copy and paste the desired character from
>another program into the web page. While the result might look correct for
>one user, there is no guarantee that every user will see the same text.
>There are many variables that might need to be considered, such as the font,
>operating system, browser software, etc. These concerns are becoming
>increasingly important as users move toward mobile and other non-standard
>browsing devices.
>
>Some languages, such as Arabic and Hebrew, are written from right to left,
>which makes including such content an even greater challenge. In addition
>to identifying the proper characters, there needs to be a way to specify
>and handle the direction of this text.
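
As a rough illustration of the right-to-left point above, here is a minimal
Python sketch. It assumes the Unicode bidirectional classes "R" and "AL" are
a good enough signal for right-to-left letters; the function name has_rtl is
made up for this example and is not part of the draft.

```python
import unicodedata

# Bidirectional classes used for right-to-left letters:
# "R" (e.g. Hebrew) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def has_rtl(text: str) -> bool:
    """Return True if any character in text belongs to a right-to-left class."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


if __name__ == "__main__":
    # Escapes are used so the example does not depend on the file encoding.
    hebrew = "\u05e9\u05dc\u05d5\u05dd"
    arabic = "\u0645\u0631\u062d\u0628\u0627"
    for sample in ("hello", hebrew, arabic):
        print(ascii(sample), has_rtl(sample))
```

A check like this only detects that right-to-left characters are present;
marking up and laying out the text correctly is a separate step.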
>
>Some cultures use a comma as a thousands separator and a period as a decimal
>point, while other cultures use the period and comma, respectively. For
>example, 1,547 in Germany and 1.547 in the United States are the same
>number (one and 547 thousandths), whereas 1,547 in the United States means
>one thousand five hundred forty-seven. While the only difference is a single
>character, the difference in meaning is a factor of one thousand.
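
The separator example above can be sketched in a few lines of Python. This
is only an illustration: format_number and its parameters are hypothetical
names for this example, and a real site would more likely rely on a locale
library than on hand-written separators.

```python
def format_number(value: float, decimals: int,
                  thousands_sep: str, decimal_sep: str) -> str:
    """Format value with the given thousands and decimal separators."""
    # Start from the US-style rendering, e.g. "1,547.00".
    us_style = f"{value:,.{decimals}f}"
    # Swap both separators in one pass so they do not overwrite each other.
    table = str.maketrans({",": thousands_sep, ".": decimal_sep})
    return us_style.translate(table)


if __name__ == "__main__":
    # The same value written with two conventions.
    print(format_number(1.547, 3, ",", "."))   # period as decimal point
    print(format_number(1.547, 3, ".", ","))   # comma as decimal point
    # A different value that produces the same string as the line above.
    print(format_number(1547, 0, ",", "."))
```

The last two calls return the same string for values that differ by a
factor of one thousand, which is exactly the ambiguity described above.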
>
>The presentation of dates and times is a typical example of something
>that causes confusion for the user. When using two digits each to represent
>year, month, and day, the actual date might not be obvious. A few examples
>from different cultures include DD/MM/YY, MM/DD/YY, and YY/MM/DD. A single
>date in the format "xx/xx/xx" could be interpreted as three different dates.
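
To make the date ambiguity concrete, here is a small Python sketch that
parses one string under the three orderings named above. The sample value
"03/06/11" and the helper name interpretations are chosen for illustration
only.

```python
from datetime import date, datetime

# The three orderings mentioned in the text, as strptime patterns.
FORMATS = {
    "DD/MM/YY": "%d/%m/%y",
    "MM/DD/YY": "%m/%d/%y",
    "YY/MM/DD": "%y/%m/%d",
}


def interpretations(text: str) -> dict:
    """Return every calendar date that text could mean, keyed by ordering."""
    results = {}
    for label, pattern in FORMATS.items():
        try:
            results[label] = datetime.strptime(text, pattern).date()
        except ValueError:
            # The string is not a valid date under this ordering.
            continue
    return results


if __name__ == "__main__":
    for label, parsed in interpretations("03/06/11").items():
        print(label, parsed.isoformat())
```

For this sample all three orderings are valid and yield three different
dates, so the reader has no way to tell which one was intended.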
>
>There are many other concerns to address when creating an
>international-friendly site. This is only a sampling.
>
>
>By the way...
>
>In its simplest definition, "internationalization" refers to creating a site
>framework that allows for content to be presented in a way that is
>consistent with regional styles and cultural customs. "Localization" refers
>to the actual implementation of each specific region's content into the
>international framework. Internationalization is commonly referred to as
>"i18n" because there are 18 characters between the beginning "i" and
>concluding "n." Similarly, localization is commonly referred to as "l10n."
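
The abbreviation rule described above (first letter, count of the letters
in between, last letter) can be checked with a short Python sketch. The
function name numeronym is just a label for this example.

```python
def numeronym(word: str) -> str:
    """Abbreviate word as first letter + number of inner letters + last letter."""
    if len(word) < 3:
        # Nothing lies between the first and last letter, so leave it alone.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


if __name__ == "__main__":
    for term in ("internationalization", "localization"):
        print(term, "->", numeronym(term))
```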
>
>When starting to create an internationalized site, one must first identify
>the locales that need to be supported. This will help to define the
>requirements for the international framework. It is highly recommended to
>work with native speakers who are very familiar with the regions and
>cultures that are part of your user demographic.
>
>Most importantly, the end user must understand that a page has been
>localized. It is a good practice to indicate or imply that the content has
>been adapted to their local conventions. This avoids questions and possible
>misinterpretations.
>
>
>Further information
>
>This Q&A provides only a few introductory points on this topic. There are
>many books devoted to the topics of internationalization and localization.
>Becoming familiar with the styles and customs of other regions and properly
>implementing these elements into a web site will ensure that content is
>available to -- and truly understandable by -- a larger audience.
>
>Some of the standards typically used to create internationalized web sites
>include the following:
>
>- XML [ www.w3.org/XML ] is the preferred markup language for defining
>content. In addition to identifying the actual content, it can also include
>attributes that further define aspects of the content (such as language,
>grammar style, and current format of the content). Other web languages (such
>as XHTML) use these attributes to deliver the localized page appropriate for
>the current user.
>
>- XHTML [ www.w3.org/MarkUp ] is the successor to HTML, and is a markup
>language used to define web pages and to properly format and display XML
>content within them.
>
>- Unicode [ www.unicode.org ] is a numbered collection of characters that
>aims to cover all of the written languages in the world. Using this standard
>helps ensure that the correct character is identified, regardless of the
>browser or system, although displaying it still depends on a suitable font.
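
The "numbered collection" idea in the Unicode item above can be seen
directly in Python, whose standard unicodedata module exposes the number and
name assigned to each character. This is a minimal sketch; the helper name
describe is invented for this example.

```python
import unicodedata


def describe(text: str) -> list:
    """List each character with its Unicode code point and official name."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    # Escapes are used so the example does not depend on the file encoding.
    for ch, code_point, name in describe("e\u00e9\u05d0"):
        print(ascii(ch), code_point, name)
```

Knowing the code point identifies the character unambiguously; whether it
can be drawn on screen is a separate question for the font and the device.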
>
>Properly utilizing these standards in a web site can ensure that the
>concerns mentioned above are properly handled.
Received on Monday, 16 June 2003 18:26:24 GMT