Feedback on "Best Practice 11: Specify the language of the content"

Hi Yves, all,

Since I wasn't able to attend the call, here are some observations ...

Hopefully, the level of granularity is alright.

Cheers,

Christian

===

Best Practice 13: Specify the language of the content <http://www.w3.org/International/its/techniques/its-techniques.html#AuthLang> 

Make sure to indicate in what language a document is.
 
CL-RephraseProposal> For any element or attribute in your content, make sure to indicate the language.
CL-Rationale> We want people to indicate more than just the overall "document language".

The normal way of specifying the language of a document is to declare it at the root element and, if needed, to override that initial declaration for parts of the document in a different language.

CL-RephraseProposal> When using the recommended xml:lang for identifying language, the easiest way ... The inheritance rules of xml:lang take care of the rest.
CL-Rationale> This once again emphasizes "xml:lang" and mentions that its value propagates to child elements through inheritance.

Example 14: 

In this example, the main content of the document is in English, while a short citation is identified as being in French Canadian.

<document xml:lang="en">
 <para>The motto of Québec is the short phrase:
   <q xml:lang="fr-ca">Je me souviens</q>. It is chiseled on 
   the front of the Parliament Building.</para>
</document>
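
To make the inheritance behaviour concrete, here is a minimal Python sketch (not part of the draft) that resolves the effective language of each element in the example above. It assumes the standard-library ElementTree parser; the helper name effective_languages is hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOC = """<document xml:lang="en">
 <para>The motto of Québec is the short phrase:
   <q xml:lang="fr-ca">Je me souviens</q>. It is chiseled on
   the front of the Parliament Building.</para>
</document>"""


def effective_languages(element, inherited=None):
    """Yield (tag, language) pairs (hypothetical helper).

    An element without its own xml:lang takes the value of its
    nearest ancestor that declares one.
    """
    lang = element.get(XML_LANG, inherited)
    yield element.tag, lang
    for child in element:
        yield from effective_languages(child, lang)


root = ET.fromstring(DOC)
for tag, lang in effective_languages(root):
    print(tag, lang)
```

Only the root and the q element carry a declaration; para picks up "en" from document, which is the point of declaring the language once at the root and overriding it where needed.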

Your format should provide the xml:lang <http://www.w3.org/TR/REC-xml/#sec-lang-tag>  attribute for this purpose. See Best Practice 1: Provide xml:lang to specify natural language content <http://www.w3.org/International/its/techniques/its-techniques.html#DevLang>  for more information.

Having information about what is the language of the content is very important in many situations. Some of them are:

CL-Note> I wonder if we should mention that in the world of Web 2.0, with its reuse and repurposing of content (think of RSS), you very often don't know in advance what use your content will be put to. It is quite likely that all of the processes mentioned below will become important in the long run.

CL-Note> I would favour mentioning a technology (such as the CSS mechanisms for text wrapping) as an example for each of the items below.

*	selection of a proper font (e.g. for traditional or simplified Chinese)

*	processing of the text for wrapping and hyphenation

*	spell-checking the text

*	selecting proper formatting properties for data such as date, numbers, etc.

*	selecting proper automated text such as quotation marks or other punctuation signs
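
As one illustration of the last item, the following Python sketch picks quotation marks from the language tag of the content. It is only a sketch under stated assumptions: the QUOTES table and the lookup and quote helpers are hypothetical, the table covers just three languages, and the fallback simply drops trailing subtags (so "fr-ca" falls back to "fr"), which is a simplified version of the lookup idea in RFC 4647.

```python
# Hypothetical table: opening and closing quotation marks per language.
QUOTES = {
    "en": ("\u201c", "\u201d"),              # English: curly double quotes
    "fr": ("\u00ab\u00a0", "\u00a0\u00bb"),  # French: guillemets with no-break spaces
    "de": ("\u201e", "\u201c"),              # German: low-high double quotes
}


def lookup(tag, table, default="en"):
    """Return the entry for a language tag, dropping trailing subtags
    until a match is found; fall back to the default entry otherwise."""
    parts = tag.lower().split("-")
    while parts:
        key = "-".join(parts)
        if key in table:
            return table[key]
        parts.pop()
    return table[default]


def quote(text, lang):
    """Wrap text in the quotation marks chosen for the given language."""
    open_q, close_q = lookup(lang, QUOTES)
    return f"{open_q}{text}{close_q}"


print(quote("Je me souviens", "fr-ca"))
print(quote("I remember", "en"))
```

Without a language declaration on the q element of the earlier example, a processor of this kind would have no basis for choosing the French marks over the English ones.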

Resources:

Reference links


*	The values to use with xml:lang to specify a language.
	http://www.ietf.org/rfc/rfc4646.txt 
*	Description of the language identification mechanism in the XML specification.
	http://www.w3.org/TR/REC-xml/#sec-lang-tag 
*	Internationalization FAQ: xml:lang in XML document schemas.
	http://www.w3.org/International/questions/qa-when-xmllang


More resources

Technique index <http://www.w3.org/International/technique-index>  - Topic index <http://www.w3.org/International/resource-index> 
 
CL-Note> Hmm, should we have these generic pointers? Shouldn't we at least say where they lead?
 
=========================================================================================================
Christian Lieske
MultiLingual Technology Solutions (MLT)
SAP Language Services (SLS)
SAP Globalization Services
SAP AG
Dietmar-Hopp-Allee 16
D-69190 Walldorf
Germany
T   +49 (62 27) 7 - 6 13 03
F   +49 (62 27) 7 - 2 54 18
christian.lieske@sap.com
http://www.sap.com


Received on Tuesday, 6 March 2007 14:58:02 UTC