
Re: HTML <head> article updated

From: Richard Ishida <ishida@w3.org>
Date: Wed, 10 Aug 2011 13:31:36 +0100
Message-ID: <4E427A28.7010205@w3.org>
To: Chris Mills <cmills@opera.com>
CC: public-evangelist@w3.org, www International <www-international@w3.org>

Hello Chris,

[cc www-international so that they know I have sent feedback, and in 
case others wish to comment]

Here's some feedback on http://www.w3.org/wiki/The_HTML_head_element

"The language codes may be two-letter codes, such as en for English, 
four-letter codes such as en-US for American English, or other, less 
common, codes. The two-letter codes are defined in ISO 639-1, although 
modern best practice dictates that you should use the IANA subtag 
registry for your language code definitions."

I think this paragraph needs a fair bit of attention.

[1] language codes => language tags  (for consistency and clarity: 
'codes' was used in the past to refer to ISO language codes or region 
codes, but something like en-US is two such codes, though only one 
language tag).  Btw, en and US are both 'subtags', so be careful not to 
mix up tags with subtags.

[2] language subtags can be 2 or 3 letters, and region subtags can be 
2 letters or 3 digits, so the opening part of the paragraph is quite 
misleading (en-US, for example, is not a 'four-letter code').
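
To make the tag/subtag distinction in [1] and [2] concrete, here is a 
minimal sketch in Python. It assumes the BCP 47 shapes for the two most 
common subtags only (language: 2 or 3 letters; region: 2 letters or 3 
digits) and deliberately ignores script, variant, extension and 
private-use subtags. The name split_language_tag is made up for 
illustration, and the sketch checks shape only: it does not look 
anything up in the IANA registry.

```python
import re

# Simplified shape check (an assumption of this sketch, not the full
# BCP 47 grammar): a language subtag of 2-3 letters, optionally
# followed by a region subtag of 2 letters or 3 digits.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)

def split_language_tag(tag):
    """Split one language tag into its subtags.

    Returns a dict of the subtags found, or None when the tag falls
    outside the simplified shape handled here.
    """
    m = _TAG.match(tag)
    if m is None:
        return None
    return {name: value for name, value in m.groupdict().items()
            if value is not None}

print(split_language_tag("en"))      # {'language': 'en'}
print(split_language_tag("en-US"))   # {'language': 'en', 'region': 'US'}
print(split_language_tag("es-419"))  # {'language': 'es', 'region': '419'}
```

So en-US is one tag made of two subtags, and a region subtag such as 
419 is not letters at all.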

[3] I strongly urge you not to refer people to ISO 639 - they should use 
the IANA registry to look things up (and you may want to point to 
http://rishida.net/utils/subtags/ which makes lookup a little more user 
friendly).

[4] 'modern best practice': well, actually it's in the standards, so it's 
a little more than best practice

[5] it may be better for this audience to link to 
http://www.w3.org/International/questions/qa-choosing-language-tags 
rather than http://www.w3.org/International/articles/language-tags/



"Don't worry too much about this for now. utf-8 is the universal 
character set, which includes pretty much any character that you might 
want to use on a web page, from any common human language, so it is a 
good idea to declare this to make sure you HTML has full international 
capabilities. In addition, you can avoid a serious Internet Explorer 
security risk by declaring it in the first 512 bytes of the page. So 
just below the <head> tag is fine. This is what all the below examples 
will do."

[6] actually they need to worry about it at least enough to ensure that 
they are actually *saving their document* as UTF-8, not just changing 
the encoding declaration - otherwise, a doc saved as iso-8859-1 for 
example will fail to display properly when it comes to accented 
characters. They also need to be aware that the server may be overriding 
their declaration.
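
A short Python sketch of the mismatch described in [6]. The sample text 
is made up, and decoding with errors="replace" is only an approximation 
of what a browser does when it meets bytes that are invalid in the 
declared encoding.

```python
text = "café au lait"

# Case 1: the editor saved ISO-8859-1 bytes, but the page declares UTF-8.
saved_as_latin1 = text.encode("iso-8859-1")
shown_as_utf8 = saved_as_latin1.decode("utf-8", errors="replace")

# Case 2: the editor saved UTF-8 bytes, but the page declares ISO-8859-1.
saved_as_utf8 = text.encode("utf-8")
shown_as_latin1 = saved_as_utf8.decode("iso-8859-1")

print(shown_as_utf8)    # caf\ufffd au lait (U+FFFD replacement character)
print(shown_as_latin1)  # cafÃ© au lait
```

In both cases the unaccented characters survive, which is why the 
problem often goes unnoticed until accented text appears.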

I recommend that you step back a little in the wiki, add a brief 
description of what an encoding is and why it's important, and add some 
text to say that authors should ensure that their editor *saves the 
text* in utf-8. If it does not, they should ensure that the charset 
attribute indicates the encoding actually used.  We have some articles 
that can help people understand these concepts at
http://www.w3.org/International/questions/qa-choosing-encodings
http://www.w3.org/International/questions/qa-changing-encoding
http://www.w3.org/International/questions/qa-setting-encoding-in-applications
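
As a rough illustration of the check being recommended, the following 
Python sketch tests whether a file's bytes can actually be decoded with 
the encoding named in its <meta charset> declaration. The function 
names are hypothetical, the regular expression is a simplification 
rather than a real HTML parser, and the 512-byte window is simply the 
figure from the quoted wiki text. Note the limits: a single-byte 
encoding such as iso-8859-1 accepts any byte sequence, so the check is 
only really informative when the declaration says utf-8, and it knows 
nothing about an HTTP header that might override the declaration.

```python
import re

# Simplified matcher for <meta charset="..."> (an assumption of this
# sketch; it does not handle the older http-equiv form).
_META = re.compile(rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)',
                   re.IGNORECASE)

def declared_charset(data, limit=512):
    """Return the charset declared in the first `limit` bytes, or None."""
    m = _META.search(data[:limit])
    return m.group(1).decode("ascii").lower() if m else None

def bytes_match_declaration(data):
    """True if the bytes decode cleanly under the declared charset."""
    name = declared_charset(data)
    if name is None:
        return False
    try:
        data.decode(name)
    except (UnicodeDecodeError, LookupError):
        # Invalid bytes for that encoding, or an unknown encoding name.
        return False
    return True

page = '<meta charset="utf-8"><p>café</p>'
print(bytes_match_declaration(page.encode("utf-8")))       # True
print(bytes_match_declaration(page.encode("iso-8859-1")))  # False
```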

Hope that helps,
RI






On 04/08/2011 16:56, Chris Mills wrote:
> UPDATE - 4th August 2011: I've updated http://www.w3.org/wiki/The_HTML_head_element to clean up language, add new HTML5 features, and add in a new section about doctypes, to replace Choosing the right doctype for your HTML documents (http://www.w3.org/wiki/Choosing_the_right_doctype_for_your_HTML_documents). The original article was a bit long winded, and needed a lot of updates to account for new thinking about doctypes, HTML5 doctype, etc.
>
> this is ready for proofing/translation now.
>
> QUESTION - should this big new doctype section be put into a new article? Does it make the article a bit too long?
>
>
>
> --
>
> Chris Mills
> Open standards evangelist and dev.opera.com editor
> Opera Software
>
> * Try our browsers: http://www.opera.com
> * Learn to build a better web, with the Opera web standards curriculum: http://www.opera.com/wsc
> * Learn about the latest open standards technologies and techniques: http://dev.opera.com
>
>
>
>

-- 
Richard Ishida
Internationalization Activity Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/


Register for the W3C MultilingualWeb Workshop!
Limerick, 21-22 September 2011
http://multilingualweb.eu/register
Received on Wednesday, 10 August 2011 12:31:59 GMT
