Re: Comment on working draft "Specifying Language in XHTML and HTML Content"

From: Addison Phillips <addison@yahoo-inc.com>
Date: Fri, 16 Mar 2007 13:10:10 -0700
Message-ID: <45FAF9A2.20208@yahoo-inc.com>
To: CE Whitehead <cewcathar@hotmail.com>
CC: ishida@w3.org, www-international@w3.org

Character set should not be recommended as a proxy for language. 
Specifying the character encoding of a document is important, but, at 
the very best, merely *implies* the language of the document (my mailer 
might send you this message in Big5, but that doesn't make the message 
Chinese). It should not be confused with identifying the language of the 
content, and, in part, this is what the techniques document is trying to 

You can, absolutely, declare the character set (encoding) without saying 
anything about the language of a document. The headers and meta 
information used for language and for encoding are entirely separate. In 
fact, far more documents on the Web have an encoding declaration of one 
form or another than have language declarations.
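
A minimal Python sketch of that separation, assuming HTTP-style headers held 
in a plain dict. The function name declared_encoding_and_language is 
hypothetical and only the standard library is used. It reads the charset from 
Content-Type and the language from Content-Language, and either one can be 
present without the other:

```python
from email.message import Message


def declared_encoding_and_language(headers):
    """Return (charset, language) as declared in a dict of HTTP-style headers.

    Hypothetical helper for illustration: the charset comes from the
    Content-Type header, the language from the separate Content-Language
    header. A missing declaration is reported as None.
    """
    msg = Message()
    for name, value in headers.items():
        msg[name] = value
    charset = msg.get_content_charset()      # lowercased, or None if absent
    language = msg.get("Content-Language")   # unrelated header, or None
    return charset, language


# An encoding declaration with no language declaration at all.
print(declared_encoding_and_language(
    {"Content-Type": "text/html; charset=Big5"}))

# Both declared, independently of each other.
print(declared_encoding_and_language(
    {"Content-Type": "text/html; charset=utf-8", "Content-Language": "fr"}))

# English text survives a round trip through Big5, so the encoding alone
# says nothing definite about the language of the content.
english = "This message is in English."
assert english.encode("big5").decode("big5") == english
```

In the first call the language comes back as None even though the encoding is 
declared, which is the common case described above.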


Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.

CE Whitehead wrote:
> "Internationalization Best Practices"
> (http://www.w3.org/TR/2005/WD-i18n-html-tech-lang-20050224/)
> (I finally got to read through the rewritten sections; I like that the 
> name "Best Practices" has been shortened to "Techniques"; also, I think 
> the rewriting makes some of the early sections 'more accessible'.  Below 
> are all the remaining comments I have--mostly some more thoughts about 
> mul and character sets; also a place where you left off mentioning the 
> xml:lang tag, though maybe you thought it was implicit)
> {Section 3.1 par 4
> Below par 4 you might discuss mul for audience language only (but I 
> would never recommend it for text-processing language!--see note below:}
> "There are also pages where the navigational information, including the 
> page title, is in one language but the real content of the page is in 
> another. While this is not necessarily good practice, it doesn't change 
> the fact that the language of the intended audience is usually that of 
> the content, regardless of the language at the top of the document source."
> {ADD ??
> "A case where the audience and text processing languages differ slightly 
> is an online foreign language lesson (immersion), written in a single 
> language but aimed at speakers of multiple languages;
> for example, the text-processing language might be fr [or en, suit 
> yourself]
> while the audience language might be declared as mul (multiple); or mul, 
> fr (since presumably the audience speaks some French)."}
> {Then instead of recommending mul for text processing, maybe recommend 
> declaring the character set IF IT'S POSSIBLE without mul; you might need 
> to declare a language tag with mul
> --I goofed in rejecting mul--
> at the end of :
> Section 5 Best Practice 2 "Discussion" par 3???}
> "Although we would normally recommended to declare the default 
> text-processing language in the html tag, since only one language can be 
> defined at a time when using attributes, there may appear to be little 
> point in doing so if a document has separate content to support 
> multilingual audiences. It may be more appropriate to begin labeling the 
> language on lower level elements, where the actual text is in one 
> language or another."
> {add?? and to just specify the character set???
> {I'm new to some of this,
> you can declare the character set in the document type declaration
> and you can declare it in the meta tags
> can you declare it without a language tag such as en or mul???
> SORRY!!}
> {Another issue,
> Best Practice 3 "How to" par 2/3 --
> What about the xml:lang attribute??
> it can be used on all HTML elements too; is it also used on XML elements??}
> "The lang attribute can be used on all HTML elements . . . "
>> "The lang and xml:lang attributes can be used on all . . . "
> --C. E. Whitehead
> cewcathar@hotmail.com
Received on Friday, 16 March 2007 20:10:22 UTC