- From: Deborah Cawkwell <deborah.cawkwell@bbc.co.uk>
- Date: Fri, 18 Jun 2004 00:31:27 +0100
- To: <public-i18n-geo@w3.org>
QUESTION

Why should I use the language attribute in web pages?

ANSWER

The language attribute unambiguously specifies the natural language of web page content. HTML, XHTML and XML differ in how the language attribute is specified. See http://www.w3.org/International/O-HTML-tags.html

Applications exist that can use natural language metadata about content to deliver the most relevant information to users based on their language preferences. The more content that is tagged, and tagged correctly, the more useful and pervasive such applications will become.

Metadata that indicates content language can assist many audiences for a particular page or section of a page. For example, authoring tools can supply appropriate spelling and grammar checking based on the language of a segment. Translation tools can use the tags to help recognize sections of text in a particular language. Search engines can group or filter results based on the user's preferences. And user agents can (and do) use the content language to select language-appropriate fonts, which improves the overall user experience of the page.

The language attribute should always be used to indicate the primary language of the web page (on the main page container element). If the language changes within the main page container element, this should be marked on a sub-container element, eg, span, div, td, p, etc.

Applications

Accessibility

The 'lang' attribute assists speech synthesizers and Braille translators; it is required by the W3C Web Accessibility Initiative (WAI) and by government policies enforced in some countries, eg, the Disability Discrimination Act in the UK.

Page rendering

CSS2 makes powerful use of the language attribute through the :lang() pseudo-class (http://www.w3.org/International/questions/qa-css-lang.html).
For example, you might want to use a different font size depending on the language:

  <style type="text/css">
  :lang(ar) { font-family: "Traditional Arabic", serif; font-size: 125%; }
  :lang(fr) { font-family: Arial; font-size: 100%; }
  </style>

Currently, this does not work in Microsoft Internet Explorer.

Search

A common use for meta is to specify keywords that a search engine may use to improve the quality of search results. When several meta elements provide language-dependent information about a document, search engines may filter on the xml:lang attribute to display search results using the language preferences of the user. (http://www.w3.org/TR/2002/WD-xhtml2-20020805/mod-meta.html)

XML

The 'xml:lang' attribute is the standard way to identify language information in XML.

[Information about tasks] cf Google

Processing

eg XSLT

BY THE WAY

You might think information about natural language could be inferred from the character encoding. However, character encoding does not enable unambiguous identification of a natural language: there *must* be a 1:1 mapping between encoding and language for this inference to work... and there isn't one.

For example, a single character encoding can be used for many languages, eg, Latin 1 (iso-8859-1) can encode both French and English, as well as a great many other languages. In addition, the character encoding can vary for a single language, eg, Arabic can be encoded with 'windows-1256' or 'iso-8859-6' or 'utf-8' (or another Unicode encoding).

USEFUL LINKS

- FAQ: HTTP and meta language information - http://www.w3.org/International/questions/qa-http-and-lang.html
- Character encoding - http://www.w3.org/TR/2004/WD-charmod-20040225/#sec-Digital
- XML 1.0 - http://www.w3.org/TR/REC-xml/#sec-lang-tag

[Will check following - from previous]

- HTML 4.01 Specification, W3C Recommendation 24 December 1999 - http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.3
- XHTML 2.0, W3C Working Draft 5 August 2002 - http://www.w3.org/TR/2002/WD-xhtml2-20020805/mod-meta.html
- Web Accessibility Initiative: lang attribute - http://www.w3.org/TR/WCAG10/#gl-abbreviated-and-foreign
- Tutorial: Language markup in XHTML and CSS (DRAFT) - http://www.w3.org/International/tutorials/tutorial-lang.html
- Authoring Techniques for XHTML & HTML Internationalization: Specifying the language of content 1.0 - http://www.w3.org/International/geo/html-tech/tech-lang.html
- FAQ: Styling using the lang attribute - http://www.w3.org/International/questions/qa-css-lang.html
- FAQ: Two-letter or three-letter language codes - http://www.w3.org/International/questions/qa-lang-2or3.html
- From the usability perspective - http://diveintoaccessibility.org/day_7_identifying_your_language.html
- An interesting view on Google usage across cultures - http://www.google.com/press/zeitgeist2003.html and http://www.google.com/press/zeitgeist.html
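
As a small illustration of the BY THE WAY point, the sketch below shows why an encoding label cannot stand in for a language label. It is written in Python purely for demonstration; the sample strings and variable names are assumptions of this sketch, not part of the FAQ text. It encodes an English and a French word with the same encoding (iso-8859-1), then encodes one Arabic word with the three encodings named above.

```python
# Hypothetical sample strings, chosen only for illustration.
samples = {"en": "language", "fr": "français"}

# One encoding, many languages: both strings encode cleanly as ISO-8859-1,
# so the encoding label alone cannot say which language the text is in.
latin1_bytes = {lang: text.encode("iso-8859-1") for lang, text in samples.items()}

# One language, many encodings: the same Arabic word in three encodings.
arabic = "العربية"
arabic_encodings = ["windows-1256", "iso-8859-6", "utf-8"]
arabic_bytes = {enc: arabic.encode(enc) for enc in arabic_encodings}

for enc, data in arabic_bytes.items():
    # Each encoding round-trips to the same text, yet the byte sequences differ.
    assert data.decode(enc) == arabic
    print(enc, data.hex())
```

In both directions the mapping fails to be 1:1, which is why the language has to be declared explicitly with the language attribute.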
Received on Thursday, 17 June 2004 19:31:31 UTC