A new draft of the CSS selectors Q&A

Question

How do I apply different styles using CSS for different languages in a 
multilingual document?


Answer:

There are four ways to apply different styles to different languages in 
a multilingual document:

1) a class or id selector
2) the :lang() pseudo-class selector
3) a selector that matches the value of an attribute
4) a selector that matches the beginning of a value of an attribute

Although CSS2 provides language-specific selectors, these selectors are 
not widely supported by web browsers. It is therefore necessary to use 
more generic CSS selectors to apply different styles to different 
languages within an XHTML/HTML document.

Presentation styles are commonly used to control changes in fonts, font 
sizes and line heights when language changes occur in the document.

<h3>Generic selectors</h3>
NNav, Moz, IE, Opera, others

The most efficient method is to use a CSS class or id selector. For 
example, the sentence "The Nuer language is called Thok Nath" could be 
marked up as:

<p>The Nuer language is called <span lang="ssa" xml:lang="ssa" 
class="nuer">Thok Nath</span></p>

A class .nuer could then be defined in the stylesheet as

.nuer {font-style: italic; font-weight: bold;}

or alternatively, as

span.nuer {font-style: italic; font-weight: bold;}
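
The text above mentions id selectors but only shows a class. As a 
minimal sketch, assuming a hypothetical id value "nuer-name" on the span 
(it is not part of the original markup), an equivalent id-based rule 
might look like this:

```css
/* Assumed markup (the id "nuer-name" is hypothetical):
   <span lang="ssa" xml:lang="ssa" id="nuer-name">Thok Nath</span> */

/* An id selector matches the single element carrying that id. */
#nuer-name {font-style: italic; font-weight: bold;}
```

Because an id must be unique within a document, a class is usually the 
more practical choice when the same language occurs more than once.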

Likewise, the HTML segment:

<div xml:lang="en" lang="en" dir="ltr" class="pan">
 <p>It is polite to welcome people in their own language:</p>
 <ul>
  <li xml:lang="zh-CN" lang="zh-CN" class="zhs">欢迎</li>
  <li xml:lang="zh-TW" lang="zh-TW" class="zht">歡迎</li>
  <li xml:lang="el" lang="el">Καλοσωρίσατε</li>
  <li xml:lang="ar" lang="ar" dir="rtl" class="ar">اهلا وسهلا</li>
  <li xml:lang="ru" lang="ru">Добро пожаловать</li>
 </ul>
</div>

could be styled with the following rules:

 li {list-style-type: none; line-height: 1.5em;}
 .pan {font-family: "Times New Roman"; }
 .ar {font-size: 1.2em; text-align: left;}
 .zht {font-family: PMingLiU,MingLiU;}
 .zhs {font-family: SimSun-18030,SimHei;}


<h3>The :lang() pseudo-class selector</h3>
Moz

For those interested, CSS2 provides the language pseudo-class selector 
:lang() and language attribute selectors to allow XHTML/HTML document 
authors to specify rules for language-specific presentation.

The :lang() pseudo-class selector allows authors to specify rules that 
match languages. I could mark up the sentence "The Nuer refer to 
themselves as Naath" as

<p xml:lang="en-AU" lang="en-AU">The Nuer refer to themselves as <span 
xml:lang="ssa" lang="ssa">Naath</span></p>

To display the English text in a normal style and the Nuer text in 
italics, the following rules could be used:

:lang(en-AU) {font-style: normal;}
:lang(ssa) {font-style: italic;}

The selector :lang(en-AU) will only match elements that have a language 
value of “en-AU” or have inherited that language value. If the CSS rule 
specified :lang(en-US) instead, it would not match our sample paragraph.

Alternatively, we could make the language designation more general and 
use the following rules:

:lang(en) {font-style: normal;}
:lang(ssa) {font-style: italic;}

The rule for :lang(en) would match elements with a language value of 
“en”. It would also match more specific language specifications such as 
en-US and en-NZ.

<h3>A selector that matches the value of an attribute</h3>
Opera

The second way of specifying language-specific rules is to use attribute 
selectors. If I mark up “Yeŋö loi rot Aboja” as

<p xml:lang="din" lang="din">Yeŋö loi rot Aboja</p>

I could write a rule matching the language attribute:

*[lang="din"] {font-family: "Doulos SIL";}

This rule will match all elements that have a lang attribute equal to 
“din”.

If the XHTML/HTML markup were

<div xml:lang="en" lang="en">
<p>The first line was <span xml:lang="din" lang="din">Yeŋö loi rot 
Aboja</span></p>
</div>

and I had two rules

p[lang="en"] {font-style: normal;}
*[lang="din"] {font-family: "Doulos SIL"; font-style: italic;}

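Only the second rule would match. The paragraph itself has no lang 
attribute to match: unlike :lang(), an attribute selector does not take 
into account a language value inherited from a parent element such as 
the div.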
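
If the English paragraph in this example also needs to be styled, one 
possibility is sketched below. It assumes the same markup as above, and 
which of the first two rules is usable depends on the browser's support 
for each kind of selector.

```css
/* Attribute selector on the div that carries lang="en", combined
   with a descendant selector to reach the paragraph inside it. */
div[lang="en"] p {font-style: normal;}

/* :lang() takes inherited language into account, so it can match
   the paragraph directly. */
p:lang(en) {font-style: normal;}

/* The span sets its own lang attribute, so this rule still applies. */
*[lang="din"] {font-family: "Doulos SIL"; font-style: italic;}
```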

<h3>A selector that matches the beginning of a value of an attribute</h3>
Opera

There is a significant difference between [lang="en"] and [lang|="en"]. 
The first selector will only match elements with a lang attribute 
exactly equal to “en”, while the second will match any element whose 
lang attribute is “en” or begins with “en-”. Therefore the second 
selector would also match “en-US”, “en-HK” or “en-CA”.
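
As a small illustration of the difference, the two rules below are 
annotated with the attribute values each one is expected to match under 
the CSS2 definition. The font name is simply reused from the other 
examples in this article.

```css
/* Exact match: applies to lang="en" only,
   not to lang="en-US" or lang="en-CA". */
*[lang="en"] {font-family: "Times New Roman";}

/* Hyphen-separated prefix match: applies to lang="en",
   lang="en-US", lang="en-HK" and lang="en-CA", but not to a value
   such as lang="eng", because "en" must be the whole value or be
   followed by a hyphen. */
*[lang|="en"] {font-family: "Times New Roman";}
```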

Using an earlier example:

<div xml:lang="en-NZ" lang="en-NZ" dir="ltr">
 <p>It is polite to welcome people in their own language:</p>
 <ul>
  <li xml:lang="zh-CN" lang="zh-CN">欢迎</li>
  <li xml:lang="zh-TW" lang="zh-TW">歡迎</li>
  <li xml:lang="el" lang="el">Καλοσωρίσατε</li>
  <li xml:lang="ar" lang="ar" dir="rtl">اهلا وسهلا</li>
  <li xml:lang="ru" lang="ru">Добро пожаловать</li>
 </ul>
</div>

The style sheet could be written as:

 li {list-style-type: none;  line-height: 1.5em;}
 *[lang|="en"] {font-family: "Times New Roman";}
 *[lang|="el"] {font-family: "Times New Roman";}
 *[lang|="ru"] {font-family: "Times New Roman";}
 *[lang|="ar"] {font-family: "Simplified Arabic"; font-size: 1.2em;}
 li[lang|="ar"] {text-align: left;}
 *[lang="zh-TW"] {font-family: PMingLiU,MingLiU;}
 *[lang="zh-CN"] {font-family: SimSun-18030,SimHei;}

The selectors for Chinese use specific values and will only match the 
indicated values, while the other language attribute selectors are more 
generic. For instance, [lang|="en"] will successfully match “en-NZ”.

It is important to note that not all web browsers can use language 
selectors and it is best to use more generic selectors in your CSS rules.
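
One way to follow this advice while still using the language selectors 
where they are available is to keep the class-based rule and the 
language-based rule as separate rules, rather than grouping them with a 
comma. Under the CSS2 parsing rules, a browser that cannot parse one 
selector in a comma-separated group is expected to ignore the entire 
rule, so grouping would lose the class-based fallback as well. The 
sketch below reuses the .zht class and font names from the earlier 
examples and is only one possible arrangement.

```css
/* Generic rule: understood by browsers that support class selectors. */
.zht {font-family: PMingLiU,MingLiU;}

/* Language rule: ignored by browsers that do not support
   attribute selectors, without affecting the rule above. */
*[lang="zh-TW"] {font-family: PMingLiU,MingLiU;}

/* Avoid grouping, e.g. ".zht, *[lang="zh-TW"] {...}": a browser
   that cannot parse the second selector drops the whole rule. */
```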


By the way

I have used the ISO-639-2 language code “ssa” for Nuer. Nuer doesn't 
have a unique ISO-639-2 language code. This is the code for a group of 
languages: Nilo-Saharan (Other). Seven Nilo-Saharan languages have 
unique ISO-639-2 codes, while approximately 178 languages share the 
generic “ssa” language code. The Ethnologue lists the languages at 
http://www.ethnologue.com/show_iso639.asp?code=ssa.

I have used the language codes “zh-TW” and “zh-CN”. These language codes 
do not represent specific languages. “zh-TW” would indicate Chinese 
spoken in Taiwan, although there is more than one Chinese language 
spoken in Taiwan. Similarly “zh-CN” represents Chinese spoken in China 
(PRC). This could refer to Mandarin or any other Chinese language.

More often “zh-CN” and “zh-TW” are used to indicate Chinese languages 
written in the simplified script (“zh-CN”) or the traditional script 
(“zh-TW”).

Some of the modern web browsers will use the presence of the language 
tags “zh-CN” and “zh-TW” to select the default fonts to display the text 
when the web page designer does not indicate a font.

If you need to use language tags to differentiate between Chinese 
languages, the IANA language code registry has more precise language 
codes for a range of Chinese languages.

Useful links
The language pseudo-class: :lang
http://www.w3.org/TR/REC-CSS2/selector.html#lang

Attribute selectors
http://www.w3.org/TR/REC-CSS2/selector.html#attribute-selectors

Class selectors
http://www.w3.org/TR/REC-CSS2/selector.html#class-html

ID selectors
http://www.w3.org/TR/REC-CSS2/selector.html#id-selectors


-- 
Andrew Cunningham
Multilingual Technical Officer
Online Projects Team, Vicnet
State Library of Victoria
328 Swanston Street
Melbourne  VIC  3000
Australia

andrewc@vicnet.net.au

Ph. +61-3-8664-7430
Fax: +61-3-9639-2175

http://www.openroad.net.au/
http://www.libraries.vic.gov.au/
http://www.vicnet.net.au/

Received on Wednesday, 6 August 2003 00:49:10 UTC