
RE: declaring language in html/xhtml

From: Richard Ishida <ishida@w3.org>
Date: Tue, 14 Dec 2004 11:01:57 -0000
To: "'Jon Hanna'" <jon@hackcraft.net>, "'Alan Pierce'" <apierce411@hotmail.com>, <www-international@w3.org>
Message-Id: <20041214110153.B8A224F635@homer.w3.org>

See our explanation at
http://www.w3.org/International/geo/html-tech/tech-lang.html#ri20030218.131140352
(and please let me know if this is not clear enough).

RI

============
Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/ 

W3C Internationalization:
http://www.w3.org/International/ 

Publication blog:
http://people.w3.org/rishida/blog/
 
 

> -----Original Message-----
> From: www-international-request@w3.org 
> [mailto:www-international-request@w3.org] On Behalf Of Jon Hanna
> Sent: 11 December 2004 11:01
> To: 'Alan Pierce'; www-international@w3.org
> Subject: RE: declaring language in html/xhtml
> 
> 
> > Does it make any practical difference to serve html with the html tag
> > marked-up as xhtml, like:
> > <html lang="ja-JP" xml:lang="ja_JP"
> > xmlns="http://www.w3.org/1999/xhtml">
> >
> > as opposed to simply
> > <html lang="ja-JP"> ?
> 
> There are a few things here.
> 
> 1. ja-JP means the dialect of Japanese spoken in Japan, as
> opposed to any dialects spoken elsewhere. I've been told that
> no other country has a different form of Japanese, so the
> correct language tag is just "ja". Compare British English,
> "en-GB", which does benefit from the second part of the tag
> because that part distinguishes it from en-IE, en-US etc.
> (I don't know much about Japanese, but I've seen ja-JP used
> as an example of just this sort of mistake by those who know
> more than I do.)
> 
> 2. ja_JP is incorrect syntax: both lang and xml:lang take RFC
> 3066 tags, which separate subtags with hyphens, never
> underscores (a typo?).
> 
> 3. The lang attribute is only in XHTML for backwards
> compatibility, so that when an old HTML tool that doesn't
> grok XHTML sees the XHTML it will treat it as HTML and still
> be able to determine the language. By contrast, general-purpose
> XML tools that don't know anything specific about XHTML (and
> the ability to use such tools is the main practical advantage
> of using XHTML rather than HTML) will understand xml:lang
> but not lang. So xml:lang is the one that you must use, and
> lang is the one that you can use as well.
> 
> <html lang="ja">
> <!-- HTML 4.01 or earlier, Japanese -->
> 
> <blah xml:lang="ja">
> <!-- Some form of XML, Japanese -->
> 
> <html xml:lang="ja">
> <!-- Some form of XML, Japanese (Not XHTML, as there's no 
> namespace) -->
> 
> <html xml:lang="ja" xmlns="http://www.w3.org/1999/xhtml">
> <!-- XHTML, Japanese -->
> 
> <html lang="ja" xmlns="http://www.w3.org/1999/xhtml">
> <!-- XHTML, Japanese, but general XML tools won't realise this. -->
> 
> <html xml:lang="ja" lang="ja" xmlns="http://www.w3.org/1999/xhtml">
> <!-- XHTML, Japanese, backwards compatible with old HTML 
> user-agents -->
> 
> <html xml:lang="ja" lang="en" xmlns="http://www.w3.org/1999/xhtml">
> <!-- Obviously a bug, but the way it's interpreted is worth 
> looking at.
> An XML tool will see it as Japanese.
> An HTML tool will see it as English.
> An XHTML tool will see xml:lang as over-riding lang, since 
> lang is just for backwards-compatibility, and hence see it as 
> being Japanese -->
> 
> Overall, I'd recommend you keep using the fuller form (both
> lang and xml:lang) until the general level of tool support
> means you can drop lang and just use xml:lang.
> 
> Regards,
> Jon Hanna
> Work: <http://www.selkieweb.com/>
> Play: <http://www.hackcraft.net/>
> Chat: <irc://irc.freenode.net/selkie> 
> 
> 
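
The points in the quoted message can be sketched in a few lines of Python. This is only an illustration, not a description of how any real user agent works: the function names and the three "tool" kinds ("html", "xml", "xhtml") are hypothetical labels taken from the wording of the examples above, and the pattern checks only the general shape of an RFC 3066 tag, not whether its subtags are registered.

```python
import re
import xml.etree.ElementTree as ET

# RFC 3066 shape: a primary subtag of 1-8 letters, then zero or more
# hyphen-separated subtags of 1-8 letters or digits. No underscores.
RFC3066_SHAPE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# ElementTree expands the reserved "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def is_wellformed_tag(tag):
    """Syntax check only; does not look subtags up in any registry."""
    return RFC3066_SHAPE.fullmatch(tag) is not None


def declared_language(markup, tool):
    """Return the language a hypothetical tool would read off the root element.

    markup must be a complete, well-formed XML document.
    tool is "html" (reads lang only), "xml" (reads xml:lang only), or
    "xhtml" (lets xml:lang override lang, as described in the message).
    """
    root = ET.fromstring(markup)
    lang = root.get("lang")
    xml_lang = root.get(XML_LANG)
    if tool == "html":
        return lang
    if tool == "xml":
        return xml_lang
    if tool == "xhtml":
        return xml_lang if xml_lang is not None else lang
    raise ValueError("unknown tool kind: " + tool)


# The deliberately inconsistent example from the message.
buggy = '<html xml:lang="ja" lang="en" xmlns="http://www.w3.org/1999/xhtml"/>'

print(is_wellformed_tag("ja-JP"))         # True
print(is_wellformed_tag("ja_JP"))         # False
print(declared_language(buggy, "html"))   # en
print(declared_language(buggy, "xml"))    # ja
print(declared_language(buggy, "xhtml"))  # ja
```

Note that "ja-JP" passes the syntax check even though plain "ja" is the better tag; whether a region subtag is worth including is a judgment the pattern cannot make.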
Received on Tuesday, 14 December 2004 11:02:02 GMT
