Major Enhancements to the Unicode Standard: Enabling International Domain Names, Expanding Worldwide Accessibility, and Reducing the Digital Divide

From: Magda Danish (Unicode) <v-magdad@microsoft.com>
Date: Wed, 27 Aug 2003 10:37:50 -0400
To: www-international@w3.org

Mountain View, CA, August 27, 2003 -- The Unicode® Consortium and 
Addison-Wesley announce publication of Version 4.0 of the Unicode Standard. 
Unicode is the fundamental specification for the representation of text, at 
the core of all modern software, programming languages, and standards, 
including Windows, Java, C#, Perl, XML, HTML, DB2, Oracle, and many others.

Unicode is also central to the new internationalized domain names, which 
allow everyone in the world to have URLs in their own languages. This is 
yet another case where Unicode opens the door to more of the world's 
different cultures, helping to break down the digital divide.
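As a rough illustration of how an internationalized domain name maps onto the ASCII-only form carried by the DNS, the sketch below uses Python's built-in "idna" codec, which implements the 2003 IDNA rules. The domain "bücher.example" and the helper names to_ascii and to_unicode are made up for this example and do not come from the announcement.

```python
# Sketch: round-trip an internationalized domain name through its ASCII form.
# Assumes Python 3; the "idna" codec ships with the standard library.

def to_ascii(domain: str) -> bytes:
    """Encode each label of a Unicode domain name into its ASCII-compatible form."""
    return domain.encode("idna")

def to_unicode(ascii_domain: bytes) -> str:
    """Decode an ASCII-compatible domain name back into Unicode."""
    return ascii_domain.decode("idna")

ascii_form = to_ascii("bücher.example")  # hypothetical domain
print(ascii_form)
print(to_unicode(ascii_form))
```

Labels that contain non-ASCII characters come back with an "xn--" prefix, while plain ASCII labels such as "example" pass through unchanged.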

Version 4.0 strengthens Unicode support for worldwide communication, 
software availability, and publishing. The text has been extensively 
rewritten, and incorporates specifications that were previously only 
available as separate documents. The clarified specification of conformance 
requirements incorporates the most highly developed character encoding 
model in existence, encompassing the wide variety of types of characters 
needed by the world's languages, and permitting compatibility with all 
modern computer architectures.

Record-breaking character content

Version 4.0 encodes over 96,000 characters, twice as many as Version 3.0, 
and includes two record-breaking collections of encoded characters. The 
largest encoded character collection for Chinese characters in the history 
of computing has doubled in size yet again to encompass over 2000 years of 
Chinese, Japanese, Korean, and Vietnamese literary usage, including all the 
main classical dictionaries of these languages. Version 4.0 also encodes 
the largest set of characters for mathematical and technical publishing in 
existence. The character repertoires of Version 4.0 and International 
Standard ISO/IEC 10646 are fully synchronized.

Reducing the digital divide

To meet the needs of all linguistic communities, the Unicode Standard and 
associated standards are continually being extended, not only in terms of 
the addition of characters, but also in specifying *how* those characters 
work, such as:

- how text sorts or matches in different languages
- how text is laid out in East Asian languages (e.g. vertically) or in 
Middle Eastern languages (from right to left)
- how text is converted to uppercase or lowercase
- how text breaks into lines or words
- how text behaves in regular expressions (a key tool used in a vast number 
of web servers)
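A few of the behaviors listed above can be observed from a general-purpose language. The following is a minimal sketch in Python 3, not part of the announcement. It relies on whatever version of the Unicode character database the interpreter ships with, which will be later than Version 4.0, and the sample strings are chosen purely for illustration.

```python
import re
import unicodedata

# Case mapping is not always one-to-one: German sharp s uppercases to "SS".
upper = "straße".upper()

# Caseless matching uses case folding rather than simple lowercasing.
same = "Straße".casefold() == "STRASSE".casefold()

# Matching also depends on normalization: "e" followed by a combining
# acute accent composes to the single precomposed character U+00E9.
composed = unicodedata.normalize("NFC", "e\u0301")

# Directionality is a per-character property: Hebrew alef is right-to-left.
direction = unicodedata.bidirectional("\u05d0")

# In Python 3, \w in a str pattern matches Unicode word characters.
words = re.findall(r"\w+", "naïve café 東京")

# Plain sorted() orders by code point; this is NOT language-aware collation.
by_code_point = sorted(["z", "ä", "a"])

print(upper, same, direction, words, by_code_point)
```

The last line of the sketch shows why a separate sorting specification is needed: ordering by code point places "ä" after "z", which is not the order a German or Swedish reader would expect. Language-aware sorting requires a collation library, which is outside the scope of this sketch.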

Small linguistic communities all over the world have the opportunity to get 
mainstream software working right out of the box, instead of waiting years 
for special adaptations that may never come.

For more information on the scripts encoded in the Unicode Standard, see 

Version 4.0 is published by Addison-Wesley (ISBN 0-321-18578-1), and is 
available from the Unicode Consortium or through the book trade. The text 
and code charts of Version 4.0 are also available on the Consortium's Web 
site www.unicode.org.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, 
extend and promote use of the Unicode Standard, which specifies the 
representation of text in modern software products and standards.

Members of the Consortium are a broad spectrum of corporations and 
organizations in the computer and information technology industry. Full 
members are: Adobe Systems, Apple Computer, Basis Technology, Government of 
India (Ministry of Information Technology), Government of Pakistan 
(National Language Authority), HP, IBM, Justsystem, Microsoft, Oracle, 
PeopleSoft, RLG, SAP, Sun Microsystems, and Sybase.

Membership in the Unicode Consortium is open to organizations and 
individuals anywhere in the world who support the Unicode Standard and wish 
to assist in its extension and implementation.

For additional information on Unicode, contact the Unicode Consortium, 

About Addison-Wesley

Addison-Wesley (www.awprofessional.com) is the leading publisher of quality 
computer science and engineering books and software for technical 
professionals, developed and authored by the world's leading technology 
experts.  It is a unit of Pearson Technology Group, the world's largest 
provider of consumer and professional computer, information technology, 
engineering and reference content.  Pearson Technology Group is an 
operating unit of Pearson Education, the world's leading educational 

Pearson Education is part of Pearson plc (NYSE: PSO), the international 
media company.

For more information on The Unicode Standard, Version 4.0, see:

Corporate Sales and Press Information:
Heather Mullane
Received on Wednesday, 27 August 2003 11:17:51 UTC
