Possible Solutions for "Indicator of Translatability"

Requirement Information:

Wiki page: http://esw.w3.org/topic/its0505Translatability
Requirement document: http://www.w3.org/International/its/requirements/Overview.html#transinfo

1. Overview

There are two main aspects in indicating translatable parts of an XML document.

An item is translatable - From a format viewpoint, the item contains text that is normally translatable. This property is expected to hold for all occurrences of the given type of item. For example: the <p> element of HTML is translatable.
An item is "to be translated" - A translatable item is not flagged as do-not-translate. This case is relevant only at the document instance level, when specifying an exception to the general rule. For example: a given <p> element in one specific HTML file is not to be translated.

Indicators of translatability can be defined using two main methods:

Using the existing markup of the original format - Elements in the original format may serve the same function as the ITS indicator. They can be mapped to the relevant ITS function. The mapping mechanism between original markup and the ITS functions is specified here: [<Ref to mapping mechanism>].
Direct definition - The original format does not provide markup corresponding to the needed ITS functions. In this case, when possible, the ITS namespace is used.

2. Solution at the Schema Level

2.1. XML Schema

When XML Schema is used, the indicator can be set on the element declaration:

<xs:schema version="1.0" xml:lang="en"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.w3.org/1999/xhtml"
 xmlns="The-original-format"
 xmlns:itsdef="The-ITS-definition-namespace"
 xmlns:xml="http://www.w3.org/XML/1998/namespace"
 elementFormDefault="qualified">
...
<xs:element name="para" itsdef:translatable="yes">
 <xs:complexType mixed="true">
  <xs:complexContent>
   <xs:extension base="Inline">
    <xs:attributeGroup ref="attrs"/>
   </xs:extension>
  </xs:complexContent>
 </xs:complexType>
</xs:element>
...
</xs:schema>

The same indicator can be used for attributes. However, be aware that using translatable attributes is not recommended, as it causes various problems for localization, in particular in several languages. See [<Ref to Attributes section>] for more information.
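
As an illustration of how a tool might consume such annotations, the following is a minimal Python sketch, not a defined ITS processing model. It assumes the placeholder namespace URI used in the example above and a hypothetical helper named collect_translatability; the sample schema, including the attribute declarations, is invented for the demonstration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ITSDEF = "The-ITS-definition-namespace"  # placeholder URI, as in the examples

# Invented sample schema: two element and two attribute declarations.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:itsdef="The-ITS-definition-namespace">
 <xs:element name="para" itsdef:translatable="yes"/>
 <xs:element name="fileref" itsdef:translatable="no"/>
 <xs:attribute name="title" itsdef:translatable="yes"/>
 <xs:attribute name="id"/>
</xs:schema>
"""


def collect_translatability(schema_text):
    """Return {(kind, name): bool} for declarations that carry the indicator."""
    root = ET.fromstring(schema_text)
    result = {}
    for kind in ("element", "attribute"):
        for decl in root.iter(f"{{{XS}}}{kind}"):
            flag = decl.get(f"{{{ITSDEF}}}translatable")
            name = decl.get("name")
            # Declarations without the indicator (or without a name) are skipped.
            if flag is not None and name is not None:
                result[(kind, name)] = (flag == "yes")
    return result


if __name__ == "__main__":
    for key, value in sorted(collect_translatability(SCHEMA).items()):
        print(key, value)
```

Declarations that carry no indicator are left out of the result, so the caller decides what default applies to them.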

2.2. Relax-NG

The same namespace can be used in Relax-NG (XML form):

<grammar xmlns="http://relaxng.org/ns/structure/1.0"
 xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
 xmlns:itsdef="The-ITS-definition-namespace">
...
<define name="title">
 <element name="title" itsdef:translatable="yes">
  <ref name="title.attlist" /> 
  <zeroOrMore>
   <ref name="title.char.mix" /> 
  </zeroOrMore>
 </element>
</define>
...
</grammar>

3. Solution at the Document Instance Level

3.1. For the whole document

Some documents may not be associated with a formal schema (XML Schema, DTD, etc.), or may be associated with schemas that carry no localization information and over which the content developers have no control. In such cases, while this is not the preferred way, it is important to allow the localizable parts to be defined in a generic way, rather than flagging each and every instance.

Such a definition would be stated at the beginning of the document:

<doc xmlns:its="The-ITS-instance-namespace">
<its:globalProperties>
 <its:defaultForElements translatable="no"/>
 <its:defaultForAttributes translatable="no"/>
 <its:rule source="//para" translatable="yes"/>
 <its:rule source="@title" translatable="yes"/>
</its:globalProperties>
<item type="2">
 <name>ItemABC</name>
 <file>data.img</file>
 <fileref>9E5DDF49-0876-4b59-965B-38336DC4C366</fileref>
</item>
<item type="1">
 <name>Item1</name>
 <data title="This is translatable">
  <para>This is to translate.</para>
 </data>
</item>
...
</doc>
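
The following Python sketch shows one way a tool could apply such a block of definitions. It is an illustration under stated assumptions, not a specified algorithm: rule sources are read only in the two simple forms "//name" (element) and "@name" (attribute), full XPath is not handled, and the function names load_rules and translatable_items are hypothetical.

```python
import xml.etree.ElementTree as ET

ITS = "The-ITS-instance-namespace"  # placeholder URI, as in the examples

DOC = """\
<doc xmlns:its="The-ITS-instance-namespace">
<its:globalProperties>
 <its:defaultForElements translatable="no"/>
 <its:defaultForAttributes translatable="no"/>
 <its:rule source="//para" translatable="yes"/>
 <its:rule source="@title" translatable="yes"/>
</its:globalProperties>
<item type="1">
 <name>Item1</name>
 <data title="This is translatable">
  <para>This is to translate.</para>
 </data>
</item>
</doc>
"""


def load_rules(props):
    """Read defaults and per-name rules from the children of the block."""
    rules = {"element_default": False, "attribute_default": False,
             "elements": {}, "attributes": {}}
    for child in props:
        local = child.tag.split("}")[-1]
        flag = child.get("translatable") == "yes"
        if local == "defaultForElements":
            rules["element_default"] = flag
        elif local == "defaultForAttributes":
            rules["attribute_default"] = flag
        elif local == "rule":
            source = child.get("source", "")
            if source.startswith("@"):       # assumed form: @name
                rules["attributes"][source[1:]] = flag
            elif source.startswith("//"):    # assumed form: //name
                rules["elements"][source[2:]] = flag
    return rules


def translatable_items(root):
    """List (location, text) pairs that the rules mark as translatable."""
    props = root.find(f"{{{ITS}}}globalProperties")
    rules = load_rules(props if props is not None else [])
    items = []
    for elem in root.iter():
        if elem.tag.startswith(f"{{{ITS}}}"):
            continue  # the ITS block itself is not content
        if rules["elements"].get(elem.tag, rules["element_default"]):
            text = (elem.text or "").strip()
            if text:
                items.append((elem.tag, text))
        for name, value in elem.attrib.items():
            if rules["attributes"].get(name, rules["attribute_default"]):
                items.append((f"{elem.tag}/@{name}", value))
    return items


if __name__ == "__main__":
    for location, text in translatable_items(ET.fromstring(DOC)):
        print(location, "->", text)
```

With both defaults set to "no", only the items named by a rule are reported.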

An alternative, and more modular, way of specifying top-level definitions is to use a pointer to the definitions.

<doc xmlns:its="The-ITS-instance-namespace"
 its:fileProperties="DocITSProperties.xml">
<item type="2">
 <name>ItemABC</name>
 <file>data.img</file>
 <fileref>9E5DDF49-0876-4b59-965B-38336DC4C366</fileref>
</item>
<item type="1">
 <name>Item1</name>
 <data title="This is translatable">
  <para>This is to translate.</para>
 </data>
</item>
...
</doc>

This second method has several advantages over embedding the definitions within the original document itself:

The definitions can be re-used for all files in the original format.
The ITS instruction, a simple extra attribute, does not impair the structure, and therefore the processing, of the original files.
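
A tool could follow the pointer roughly as in the Python sketch below. Several points are assumptions, since they are not defined here: the external file is taken to have the its:globalProperties element as its root, a relative pointer is resolved against the directory of the document, and the names load_external_properties and demo are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

ITS = "The-ITS-instance-namespace"  # placeholder URI, as in the examples

# Assumed layout of the external file: the block of definitions is the root.
PROPERTIES = """\
<its:globalProperties xmlns:its="The-ITS-instance-namespace">
 <its:defaultForElements translatable="no"/>
 <its:rule source="//para" translatable="yes"/>
 <its:rule source="@title" translatable="yes"/>
</its:globalProperties>
"""

DOC = """\
<doc xmlns:its="The-ITS-instance-namespace"
 its:fileProperties="DocITSProperties.xml">
 <item type="1"><name>Item1</name></item>
</doc>
"""


def load_external_properties(doc_path):
    """Return the root of the referenced definitions file, or None."""
    root = ET.parse(doc_path).getroot()
    pointer = root.get(f"{{{ITS}}}fileProperties")
    if pointer is None:
        return None
    # Assumption: relative pointers are resolved against the document folder.
    folder = os.path.dirname(os.path.abspath(doc_path))
    return ET.parse(os.path.join(folder, pointer)).getroot()


def demo():
    """Write both sample files to a temporary folder and read the rules back."""
    with tempfile.TemporaryDirectory() as folder:
        props_path = os.path.join(folder, "DocITSProperties.xml")
        with open(props_path, "w", encoding="utf-8") as handle:
            handle.write(PROPERTIES)
        doc_path = os.path.join(folder, "doc.xml")
        with open(doc_path, "w", encoding="utf-8") as handle:
            handle.write(DOC)
        props = load_external_properties(doc_path)
        return [(rule.get("source"), rule.get("translatable"))
                for rule in props.iter(f"{{{ITS}}}rule")]


if __name__ == "__main__":
    print(demo())
```

Because the definitions live in one file, any number of documents in the same format can point to it.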

3.2. For specific parts of the document

Whether the general specification of what is translatable comes from a schema or from a block of information at the top of the document, some parts of the document may be exceptions to it. One can imagine, for example, a simple XML format that allows UI text and programming strings to be mixed within the same document. The general localization properties would specify that a given element is translatable, but within some files, some occurrences of the element will contain non-translatable data.

3.2.1. Elements

The translatability of an element's content can be set using an ITS attribute if no equivalent attribute exists in the original format:

<doc xmlns="The-original-namespace"
 xmlns:its="The-ITS-instance-namespace">
...
<para its:translate="no">Some text content
that is not translatable.</para>
...
</doc>

3.2.2. Part of an element content

For mixed content elements, when a portion of the text needs to be labeled for translatability, one can use an existing span-like element, or the one provided by the ITS namespace.

<para>Some text content that is <inline-format its:translate="no">not translatable</inline-format>.</para>

<para>Some text content that is <its:span translate="no">not translatable</its:span>.</para>
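
The Python sketch below illustrates how these local flags could be read when extracting text. It is only a sketch under two assumptions that are not stated above: content is translatable unless flagged otherwise, and a flag on an element also applies to its descendants until another flag overrides it. The names explicit_flag and translatable_text are hypothetical.

```python
import xml.etree.ElementTree as ET

ITS = "The-ITS-instance-namespace"  # placeholder URI, as in the examples
TRANSLATE = f"{{{ITS}}}translate"
SPAN = f"{{{ITS}}}span"

DOC = """\
<doc xmlns:its="The-ITS-instance-namespace">
<para>Some text content that is <inline-format its:translate="no">not translatable</inline-format>.</para>
<para>Some text content that is <its:span translate="no">not translatable</its:span>.</para>
<para its:translate="no">Some text content that is not translatable.</para>
</doc>
"""


def explicit_flag(elem):
    """Return True or False for an explicit flag, None when there is none."""
    value = elem.get(TRANSLATE)
    if value is None and elem.tag == SPAN:
        # On its:span the attribute appears without a prefix.
        value = elem.get("translate")
    return None if value is None else value == "yes"


def translatable_text(elem, inherited=True):
    """Collect the text runs under elem that remain translatable."""
    flag = explicit_flag(elem)
    current = inherited if flag is None else flag
    runs = []
    if current and elem.text and elem.text.strip():
        runs.append(elem.text.strip())
    for child in elem:
        runs.extend(translatable_text(child, current))
        # Tail text follows the child but belongs to elem's own content.
        if current and child.tail and child.tail.strip():
            runs.append(child.tail.strip())
    return runs


if __name__ == "__main__":
    print(translatable_text(ET.fromstring(DOC)))
```

Runs are stripped of surrounding whitespace for readability; a real extraction tool would preserve it.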

3.2.3. Attributes

Dealing with metadata for attributes is more difficult, because the information cannot be attached to the attribute itself as it can for an element. The solution is to use an ITS attribute that enumerates the attributes to which a given localization property applies:

<doc xmlns="The-original-namespace"
 xmlns:its="The-ITS-instance-namespace">
...
<para id="102" title="To translate"
 index="Not to translate"
 comment="Not to translate"
 its:notTranslatableAttributes="comment index"
 its:translatableAttributes="title">Text to localize.</para>
...
</doc>
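
A tool could interpret the two enumerating attributes as in the following Python sketch. It assumes that the values are whitespace-separated lists of attribute names, that an unlisted attribute falls back to a caller-supplied default, and that "not translatable" wins if a name appears in both lists. None of these points is settled here, and the function name attribute_translatability is hypothetical.

```python
import xml.etree.ElementTree as ET

ITS = "The-ITS-instance-namespace"  # placeholder URI, as in the examples

PARA = """\
<para xmlns:its="The-ITS-instance-namespace"
 id="102" title="To translate"
 index="Not to translate"
 comment="Not to translate"
 its:notTranslatableAttributes="comment index"
 its:translatableAttributes="title">Text to localize.</para>
"""


def attribute_translatability(elem, default=False):
    """Map each non-namespaced attribute of elem to True or False."""
    yes = set(elem.get(f"{{{ITS}}}translatableAttributes", "").split())
    no = set(elem.get(f"{{{ITS}}}notTranslatableAttributes", "").split())
    result = {}
    for name in elem.attrib:
        if name.startswith("{"):
            continue  # simplification: skip namespaced attributes, ITS ones included
        if name in no:        # assumption: the negative list takes precedence
            result[name] = False
        elif name in yes:
            result[name] = True
        else:
            result[name] = default
    return result


if __name__ == "__main__":
    print(attribute_translatability(ET.fromstring(PARA)))
```

The default parameter stands in for whatever the schema or the global definitions say about attributes that are not listed.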

Note that ITS-instance level markup can be used in XML Schema or the XML syntax of Relax-NG as well, since both formats are XML documents themselves:

<xs:schema version="1.0" xml:lang="en"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.w3.org/1999/xhtml"
 xmlns="The-original-format"
 xmlns:itsdef="The-ITS-definition-namespace"
 xmlns:its="The-ITS-instance-namespace"
 xmlns:xml="http://www.w3.org/XML/1998/namespace"
 elementFormDefault="qualified">
...
<xs:element name="para" itsdef:translatable="yes">
 <xs:annotation>
  <xs:documentation its:translate="no">This comment 
is not to be translated.</xs:documentation>
 </xs:annotation>
 <xs:complexType mixed="true">
  <xs:complexContent>
   <xs:extension base="Inline">
    <xs:attributeGroup ref="attrs"/>
   </xs:extension>
  </xs:complexContent>
 </xs:complexType>
</xs:element>
...
</xs:schema>

4. Summary

The different levels of translatability can be summarized as follows:

Use the mapping mechanism to associate markup of the original format with ITS functions.
Assign translatability information in the schema of the original format.
If neither of the previous actions is possible, associate the elements and attributes of the original format with ITS properties using an external definition file, or insert such a block of definitions at the top of the document.
If the general translatability information for the given format needs to be overridden for specific instances, use in-document markup: use the original format's equivalent to ITS when available, otherwise use the ITS namespace.