5.9. ITS Module

5.9.1. Introduction

This section defines how the data categories of the Internationalization Tag Set 2.0 [ITS] are represented in XLIFF.

For guidelines on how to extract original data annotated with ITS (e.g. an HTML5 or an XML file), please see the appendix [B. Guidelines for Extraction with ITS].

ITS 2.0 is composed of 19 data categories, which are represented in XLIFF in different ways.

The namespace for the ITS module is urn:oasis:names:tc:xliff:itsm:2.1 and the recommended prefix is itsm.

The fragment identification prefix for the ITS module is: itsm.
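As an illustration, the module namespace could be declared on the root element as sketched below. The core namespace, the version and srcLang values, and the sample content are assumptions made for this example only; they are not taken from this section.

```xml
<!-- Sketch only: the itsm prefix is bound to the module namespace -->
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       xmlns:itsm="urn:oasis:names:tc:xliff:itsm:2.1"
       version="2.1" srcLang="en">
  <file id="f1">
    <unit id="1">
      <segment>
        <source>Sample text.</source>
      </segment>
    </unit>
  </file>
</xliff>
```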

The elements and attributes defined in the ITS module are equivalent to their counterparts in the W3C ITS namespace when these counterparts exist. They use the same names and values, and they have the same semantics. The main difference between its and itsm attributes is that itsm attributes can also apply to non-well-formed spans delimited by the empty boundary markers <sm/> and <em/>.
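The following sketch shows one way such a span might look: the annotated span opens inside a <pc> element and closes outside it, so it cannot be expressed with a single <mrk> element. The ids, the text, and the example.com URL are hypothetical.

```xml
<unit id="1">
  <segment>
    <!-- The span from sm (id m1) to its em crosses the end of pc -->
    <source>Click <pc id="1">the <sm id="m1" type="term"
      itsm:termInfoRef="http://example.com/terms/start-button"/>Start</pc>
      button<em startRef="m1"/> to begin.</source>
  </segment>
</unit>
```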

5.9.2 Annotators Reference

ITS 2.0 provides a [tools annotation mechanism]. It identifies the processor that generates ITS information. This information is mandatory for the [MT Confidence] data category, as well as for [Terminology] and [Text Analysis] if they provide confidence information. It is optional for other data categories.

In XLIFF, the tools annotation is represented using the itsm:annotatorsRef attribute. The attribute is allowed on the <xliff>, <file>, <group>, <unit>, <mrk> and <sm/> elements. Its values and semantics are the same as those of its:annotatorsRef, with the addition that it can also be set on <sm/>.
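A minimal sketch of how the attribute might be used at two levels is given below. The value syntax (a data category identifier, a vertical bar, then an IRI) follows its:annotatorsRef in ITS 2.0; the example.com IRIs and the ids are hypothetical.

```xml
<!-- File level: identifies a hypothetical terminology tool -->
<file id="f1" itsm:annotatorsRef="terminology|http://example.com/term-tool">
  <unit id="1">
    <segment>
      <!-- Span level: a different hypothetical tool for this one term -->
      <source>Text with a <mrk id="m1" type="term"
        itsm:annotatorsRef="terminology|http://example.com/other-term-tool"
        itsm:termConfidence="0.8">term</mrk>.</source>
    </segment>
  </unit>
</file>
```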

5.9.3 Data Category Representation

5.9.3.1 Translate

The [Translate data category] indicates whether content is translatable or not.

It is represented with the [translate] attribute of the Core.
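For illustration, a non-translatable span could be marked with the Core translate attribute as sketched here. The ids and the text are hypothetical.

```xml
<unit id="1">
  <segment>
    <!-- The key name is protected from translation -->
    <source>Press <mrk id="m1" translate="no">Ctrl+S</mrk> to save.</source>
  </segment>
</unit>
```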

5.9.3.2 Localization Note

// Defines how localization note is represented in XLIFF

5.9.3.3 Terminology

The [Terminology data category] is used to mark terms and, optionally, to associate them with information such as definitions.

It is represented with the ITS Terminology annotation:

Usage:

Constraints:

If the annotation has an itsm:termConfidence attribute, it must be within the scope of an itsm:annotatorsRef with the terminology annotator set.

Example:

<unit id='1' itsm:annotatorsRef='terminology|http://www.cngl.ie/termchecker'>
  <segment>
    <source>Text with a <pc id='1'><mrk id='m1' type='term'
      itsm:termInfoRef='http://en.wikipedia.org/wiki/Terminology'
      itsm:termConfidence='0.9'>term</mrk></pc>.</source>
  </segment>
</unit>

5.9.3.N Etc...

// Etc...

5.9.4 Processing XLIFF with ITS processors

// Describes how the content should be transformed and provides the rules file.

// Felix has drafted text for this in the wiki

// ...

Appendix B. Extracting Data with ITS (Informative)

B.1 Translate

B.2 Localization Note

B.3 Terminology

Etc...