
‘this is not in any language’ in XHTML/HTML

From: CE Whitehead <cewcathar@hotmail.com>
Date: Tue, 13 Mar 2007 14:23:11 -0500
Message-ID: <BAY114-F15CCAFAB8171A9044F7D7DB37C0@phx.gbl>
To: ishida@w3.org, www-international@w3.org

+1

--C. E. Whitehead
cewcathar@hotmail.com
>
>
>This is an attempt to summarise and move forward some ideas in a thread on 
>www-international@w3.org by Christophe Strobbe, Martin Duerst, Bjoern 
>Hoermann and Tex Texin.
>http://lists.w3.org/Archives/Public/www-international/2005JulSep/0163.html
>
>
>
>You should always use the lang and/or xml:lang attributes in HTML or XHTML 
>to identify the human language of the content so that applications such as 
>voice browsers, style sheets, and the like can process that text. (See 
>Declaring Language in XHTML and HTML[1] for the details.)
>
>You can override that language setting for a part of the document that is
>in a different language, e.g. a French quotation in an English document,
>by using the same attribute(s) around the relevant bit of text.
>
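To make the override behaviour concrete, here is a minimal Python sketch of how a processor might work out the language in effect for each element. The sample markup, the id values and the helper name effective_languages are my own assumptions for illustration, not anything from the tutorial; it only looks at xml:lang and treats the nearest declaration as winning.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: an English page containing a French quotation.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <body>
    <p id="intro">As the saying goes:</p>
    <blockquote id="quote" lang="fr" xml:lang="fr">
      <p id="inner">Bonjour tout le monde.</p>
    </blockquote>
  </body>
</html>"""


def effective_languages(root):
    """Map each element id to the xml:lang value in effect for it."""
    result = {}

    def walk(elem, inherited):
        # A declaration on the element overrides whatever was inherited.
        lang = elem.get(XML_LANG, inherited)
        if elem.get("id") is not None:
            result[elem.get("id")] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result


langs = effective_languages(ET.fromstring(SAMPLE))
print(langs)
```

The paragraph inside the blockquote has no attribute of its own, so it picks up the value declared on the blockquote rather than the one on the html element.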
>Suppose you have some text that is not in any language, such as type 
>samples, part numbers, perhaps program code. How would you say that this 
>was no language in particular?
>
>There are a number of possible approaches:
>
>    1. A few years ago we introduced into the XML spec the idea that
>xml:lang="" conveys that ‘there is no language information
>available’. (See 2.12 Language Identification[2])
>
>    2. An alternative is to use the value ‘und’, for 
>‘undetermined’.
>
>    3. In the IANA Subtag Registry[3] there is another tag, ‘zxx’, that 
>means ‘No linguistic content’. Perhaps this is a better choice. It has 
>my vote at the moment.

+1
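For comparison, the three candidate values could be told apart by a consumer roughly as follows. This is only a sketch: the function name describe_language_value and the wording of the returned strings are my own, taken from the descriptions quoted above, and real language tag processing involves more than looking at the primary subtag.

```python
def describe_language_value(value):
    """Return a rough reading of a lang / xml:lang value (hypothetical helper)."""
    if value is None:
        # Attribute absent: nothing is said here, the parent's value applies.
        return "no declaration; language is inherited"
    tag = value.strip().lower()  # language tags are case-insensitive
    if tag == "":
        return "no language information available"
    primary = tag.split("-")[0]
    if primary == "und":
        return "undetermined"
    if primary == "zxx":
        return "no linguistic content"
    return "language subtag: " + primary


for candidate in ["", "und", "zxx", "fr-CA", None]:
    print(repr(candidate), "->", describe_language_value(candidate))
```

Only zxx says outright that the text is not language at all; the other two leave open that it might be.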
>
>
>
>[xml:lang=""]
>Is ‘no language information available’ suitable to express ‘this is 
>not a language’? My feeling is not.
>
>If it were appropriate, there are some other questions to be answered here. 
>With HTML an empty string value for the lang or xml:lang attribute produces 
>a validation error.
>
>It seems to me that the validator should not produce an error for
>xml:lang="". It needs to be fixed.
>
>I’m not clear whether the HTML DTD supports an empty string value for
>lang. If so, then presumably the validator needs to be fixed. If not, then
>this is not a viable option, since you’d really want both lang and
>xml:lang to have the same values.
>
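On the point about keeping lang and xml:lang in step, a quick check along these lines would flag elements where the two disagree or where only one is present. It is a sketch under my own assumptions: the sample fragment and the name mismatched_language_attributes are made up for illustration, and it says nothing about what the validator or the DTD actually accept.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment: a part number, a mismatch, and an empty xml:lang.
FRAGMENT = (
    '<div>'
    '<code lang="zxx" xml:lang="zxx">AB-1234</code>'
    '<span lang="fr" xml:lang="en">bonjour</span>'
    '<em xml:lang="">sample</em>'
    '</div>'
)


def mismatched_language_attributes(root):
    """List (tag, lang, xml:lang) for elements whose two attributes differ."""
    problems = []
    for elem in root.iter():
        lang = elem.get("lang")
        xml_lang = elem.get(XML_LANG)
        if lang is None and xml_lang is None:
            continue  # nothing declared on this element
        # Compare case-insensitively; a missing partner also counts as a mismatch.
        if lang is None or xml_lang is None or lang.lower() != xml_lang.lower():
            problems.append((elem.tag, lang, xml_lang))
    return problems


problems = mismatched_language_attributes(ET.fromstring(FRAGMENT))
print(problems)
```

The code element passes because both attributes carry zxx; the other two are reported.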
>[und]
>Would the description ‘undetermined’ fit this case, given that it is 
>not a language at all? Again, it doesn’t seem right to me, since 
>‘undetermined’ seems to suggest that it is a language of some sort, but 
>we’re not sure which.
>
>[zxx]
>This seems to be the right choice to me. It would produce no validation
>issues. The only issue is perhaps that it’s not terribly memorable.
>
>Thoughts?
>
>RI
>
>
>[1] http://www.w3.org/International/tutorials/language-decl/
>
>[2] http://www.w3.org/TR/REC-xml/#sec-lang-tag
>
>[3] http://www.iana.org/assignments/language-subtag-registry
>
>============
>Richard Ishida
>Internationalization Lead
>W3C (World Wide Web Consortium)
>
>http://www.w3.org/People/Ishida/
>http://www.w3.org/International/
>http://people.w3.org/rishida/blog/
>http://www.flickr.com/photos/ishida/
>
>
>
>
>

Received on Tuesday, 13 March 2007 19:23:35 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 2 June 2009 19:17:09 GMT