Re: Comment on working draft "Specifying Language in XHTML and HTML Content"

From: CE Whitehead <cewcathar@hotmail.com>
Date: Sat, 17 Mar 2007 14:16:51 -0400
Message-ID: <BAY114-F116D75C61C64EED5D24B02B3700@phx.gbl>
To: addison@yahoo-inc.com
Cc: ishida@w3.org, www-international@w3.org


Hi all, Dr. Ishida, and Addison,

Thanks for your information!

I do know how to declare the character set in the meta tag, but I am 
wondering how to declare it in the html tag without specifying any language. 
This would be for declaring the character set used in a document with 
content in multiple languages.

So could "Internationalization Best Practices" recommend (or remind folks) 
to declare the character set, if that is possible without mul, 
whenever one cannot declare a single text-processing language 
but one character set is still used to encode the text for all the 
languages? 
(This would be the situation for much of Europe; it would also work for 
some other language and script combinations.)

I said:
>>you might need to declare a language tag (with mul)
>>and then the character encoding
>>at the end of:
>>
>>Section 5 Best Practice 2 "Discussion" par 3?
>>
>>"Although we would normally recommended to declare the default 
>>text-processing language in the html tag, since only one language can be 
>>defined at a time when using attributes, there may appear to be little 
>>point in doing so if a document has separate content to support 
>>multilingual audiences. It may be more appropriate to begin labeling the 
>>language on lower level elements, where the actual text is in one language 
>>or another."
>>{add? and to just specify the character set?}

--C. E. Whitehead
cewcathar@hotmail.com

* * *

>Character set should not be recommended as a proxy for language. Specifying 
>the character encoding of a document is important, but, at the very best, 
>merely *implies* the language of the document (my mailer might send you 
>this message in Big5, but that doesn't make the message Chinese). It should 
>not be confused with identifying the language of the content, and, in part 
>this is what the techniques document is trying to convey.
>
>You can, absolutely, declare the character set (encoding) without saying 
>anything about the language of a document. The headers and meta information 
>used for language and for encoding are entirely separate. In fact, far more 
>documents on the Web have an encoding declaration of one form or another 
>than have language declarations.
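
To illustrate the point above, here is a minimal Python sketch. The sample 
page, the DeclarationScanner class, and the scan() helper are all 
hypothetical names made up for this example. It assumes the encoding is 
declared with a meta http-equiv element and that language is labeled only 
on lower-level elements, and it simply reads the two declarations back 
separately.

```python
from html.parser import HTMLParser

# Hypothetical sample page: the encoding is declared in a meta element,
# the html tag carries no lang attribute, and each paragraph is labeled
# with its own language instead.
SAMPLE = """<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Bienvenue / Welcome</title>
</head>
<body>
<p lang="fr">Bonjour tout le monde.</p>
<p lang="en">Hello, everyone.</p>
</body>
</html>"""


class DeclarationScanner(HTMLParser):
    """Collects the encoding declaration and the language labels separately."""

    def __init__(self):
        super().__init__()
        self.charset = None        # from the meta element, if present
        self.root_lang = None      # lang on the html tag, if present
        self.element_langs = []    # (tag, lang) pairs on other elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-type":
            # The content value looks like "text/html; charset=utf-8".
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.lower() == "charset":
                    self.charset = value.strip()
        elif tag == "html":
            self.root_lang = attrs.get("lang")
        elif "lang" in attrs:
            self.element_langs.append((tag, attrs["lang"]))


def scan(markup):
    """Parse the markup and return the scanner holding what was declared."""
    scanner = DeclarationScanner()
    scanner.feed(markup)
    scanner.close()
    return scanner


result = scan(SAMPLE)
print("charset:", result.charset)
print("lang on html tag:", result.root_lang)
print("lang on other elements:", result.element_langs)
```

In this sketch the charset is found even though the html tag says nothing 
about language, which is the separation described in the quoted reply.
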
>
>Addison
>
>--
>Addison Phillips
>Globalization Architect -- Yahoo! Inc.
>
>Internationalization is an architecture.
>It is not a feature.
>
>CE Whitehead wrote:
>>
>>
>>"Internationalization Best Practices"
>>(http://www.w3.org/TR/2005/WD-i18n-html-tech-lang-20050224/)
>>
>>(I finally got to read through the rewritten sections; I like that the 
>>name "Best Practices" has been shortened to "Techniques" also I think the 
>>rewriting makes some of the early sections 'more accessible'.  Below are 
>>all the remaining comments I have--mostly some more thought about mul and 
>>character sets; also a place where you left off mentioning the xml:lang 
>>tag though maybe you thought it was implicit)
>>
>>{Section 3.1 par 4
>>
>>Below par 4 you might discuss mul for audience language only (but I would 
>>never recommend it for text-processing language!--see note below:}
>>
>>"There are also pages where the navigational information, including the 
>>page title, is in one language but the real content of the page is in 
>>another. While this is not necessarily good practice, it doesn't change 
>>the fact that the language of the intended audience is usually that of the 
>>content, regardless of the language at the top of the document source."
>>
>>{ADD ??
>>"A case where the audience and text processing languages differ slightly 
>>is an online foreign language lesson (immersion), written in a single 
>>language but aimed at speakers of multiple languages;
>>for example, the text-processing language might be fr [or en, suit 
>>yourself]
>>while the audience language might be declared as mul (multiple); or mul, 
>>fr (since presumably the audience speaks some French)."}
>>
>>
>>{Then instead of recommending mul for text processing, maybe recommend 
>>declaring the character set IF IT'S POSSIBLE without mul;

when you cannot declare a single text processing language
>>you might need to declare a language tag with mul
>>--I goofed in rejecting mul--
>>at the end of :
>>
>>Section 5 Best Practice 2 "Discussion" par 3???}
>>
>>"Although we would normally recommended to declare the default 
>>text-processing language in the html tag, since only one language can be 
>>defined at a time when using attributes, there may appear to be little 
>>point in doing so if a document has separate content to support 
>>multilingual audiences. It may be more appropriate to begin labeling the 
>>language on lower level elements, where the actual text is in one language 
>>or another."
>>{add?? and to just specify the character set???
>>{I'm new to some of this,
>>you can declare the character set in the document type declaration
>>and you can declare it in the meta tags
>>can you declare it without a language tag such as en or mul???
>>SORRY!!}
>>
>>{Another issue,
>>Best Practice 3 "How to" par 2/3 --
>>What about the xml:lang attribute?
>>It can be used on all HTML elements too; is it also used on XML elements?}
>>
>>"The lang attribute can be used on all HTML elements . . . "
>>
>>>"The lang and xml:lang attributes can be used on all . . . "
>>
>>--C. E. Whitehead
>>cewcathar@hotmail.com

Received on Saturday, 17 March 2007 18:17:04 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 2 June 2009 19:17:09 GMT