- From: CE Whitehead <cewcathar@hotmail.com>
- Date: Tue, 13 Mar 2007 14:23:11 -0500
- To: ishida@w3.org, www-international@w3.org
+1

--C. E. Whitehead
cewcathar@hotmail.com

>This is an attempt to summarise and move forward some ideas in a thread on
>www-international@w3.org by Christophe Strobbe, Martin Duerst, Bjoern
>Hoermann and Tex Texin.
>http://lists.w3.org/Archives/Public/www-international/2005JulSep/0163.html
>
>You should always use the lang and/or xml:lang attributes in HTML or XHTML
>to identify the human language of the content so that applications such as
>voice browsers, style sheets, and the like can process that text. (See
>Declaring Language in XHTML and HTML[1] for the details.)
>
>You can override that language setting for a part of the document that is
>in a different language, e.g. a French quotation in an English document,
>by using the same attribute(s) around the relevant bit of text.
>
>Suppose you have some text that is not in any language, such as type
>samples, part numbers, perhaps program code. How would you say that this
>was no language in particular?
>
>There are a number of possible approaches:
>
> 1. A few years ago we introduced into the XML spec the idea that
>xml:lang="" conveys that ‘there is no language information
>available’. (See 2.12 Language Identification[2])
>
> 2. An alternative is to use the value ‘und’, for
>‘undetermined’.
>
> 3. In the IANA Subtag Registry[3] there is another tag, ‘zxx’, that
>means ‘No linguistic content’. Perhaps this is a better choice. It has
>my vote at the moment.

+1

>[xml:lang=""]
>Is ‘no language information available’ suitable to express ‘this is
>not a language’? My feeling is not.
>
>If it were appropriate, there are some other questions to be answered here.
>With HTML an empty string value for the lang or xml:lang attribute produces
>a validation error.
>
>It seems to me that the validator should not produce an error for
>xml:lang="". It needs to be fixed.
>
>I’m not clear whether the HTML DTD supports an empty string value for
>lang. If so, then presumably the validator needs to be fixed. If not, then
>this is not a viable option, since you’d really want both lang and
>xml:lang to have the same values.
>
>[und]
>Would the description ‘undetermined’ fit this case, given that it is
>not a language at all? Again, it doesn’t seem right to me, since
>‘undetermined’ seems to suggest that it is a language of some sort, but
>we’re not sure which.
>
>[zxx]
>This seems to be the right choice for me. It would produce no validation
>issues. The only issue is perhaps that it’s not terribly memorable.
>
>Thoughts?
>
>RI
>
>[1] http://www.w3.org/International/tutorials/language-decl/
>[2] http://www.w3.org/TR/REC-xml/#sec-lang-tag
>[3] http://www.iana.org/assignments/language-subtag-registry
>
>============
>Richard Ishida
>Internationalization Lead
>W3C (World Wide Web Consortium)
>
>http://www.w3.org/People/Ishida/
>http://www.w3.org/International/
>http://people.w3.org/rishida/blog/
>http://www.flickr.com/photos/ishida/
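
For illustration only, here is a minimal sketch of the three candidate values side by side. It is written in Python with the standard library's html.parser and is not taken from the thread. The sample markup, the part numbers, and the names SAMPLE, MEANINGS, LangCollector and collect_langs are all hypothetical. The descriptions attached to each value are the ones quoted in the message above.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: one span per candidate value from the thread.
SAMPLE = """
<p lang="en">Part numbers:
  <span lang="">AB-1234</span>
  <span lang="und">CD-5678</span>
  <span lang="zxx">EF-9012</span>
</p>
"""

# Descriptions as quoted in the message; any other value is treated
# here as an ordinary language tag (an assumption of this sketch).
MEANINGS = {
    "": "no language information available",
    "und": "undetermined",
    "zxx": "no linguistic content",
}


class LangCollector(HTMLParser):
    """Records (tag, lang value, description) for each start tag with a lang attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("lang", "xml:lang"):
                value = value or ""  # a valueless attribute is reported as None
                description = MEANINGS.get(value.lower(), "a language tag")
                self.found.append((tag, value, description))


def collect_langs(markup):
    """Return every lang declaration found in the markup, in document order."""
    parser = LangCollector()
    parser.feed(markup)
    return parser.found


if __name__ == "__main__":
    for tag, value, description in collect_langs(SAMPLE):
        print(f"<{tag} lang={value!r}>: {description}")
```

The sketch only reads the attribute values. It does not validate the markup, so it says nothing about whether a given validator accepts lang="".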
Received on Tuesday, 13 March 2007 19:23:35 UTC