- From: CVS User fsasaki <cvsmail@w3.org>
- Date: Tue, 11 Jun 2013 21:55:28 +0000
- To: public-multilingualweb-lt-commits@w3.org
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20 In directory gil:/tmp/cvs-serv31374 Modified Files: its20-for-editing-sec1-sec2.html its20-for-editing-sec1-sec2.odd Log Message: more sec1-2 edits, ready for group review --- /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20-for-editing-sec1-sec2.html 2013/06/11 08:26:03 1.19 +++ /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20-for-editing-sec1-sec2.html 2013/06/11 21:55:27 1.20 @@ -28,9 +28,9 @@ </div> <div class="toc2">2.3 <a href="#basic-concepts-overinher" shape="rect">Overriding, Inheritance and Defaults</a></div> <div class="toc2">2.4 <a href="#basic-concepts-addingpointing" shape="rect">Adding Information or Pointing to Existing Information</a></div> -<div class="toc2">2.5 <a href="#specific-HTML-support" shape="rect">Specific HTML support</a><div class="toc3">2.5.1 <a href="#html5-reference-global-rules" shape="rect">Referencing global rules</a></div> -<div class="toc3">2.5.2 <a href="#html5-its-local-markup" shape="rect">Specifities of inserting local ITS 2.0 data categories</a></div> -<div class="toc3">2.5.3 <a href="#html5-existing-markup-versus-its" shape="rect">Relation between HTML markup and ITS 2.0 data categories</a></div> +<div class="toc2">2.5 <a href="#specific-HTML-support" shape="rect">Specific HTML support</a><div class="toc3">2.5.1 <a href="#html5-global-approach" shape="rect">Global approach in HTML5</a></div> +<div class="toc3">2.5.2 <a href="#html5-its-local-markup" shape="rect">Local approach</a></div> +<div class="toc3">2.5.3 <a href="#html5-existing-markup-versus-its" shape="rect">HTML markup with ITS 2.0 counterparts</a></div> <div class="toc3">2.5.4 <a href="#html5-standoff-markup-explanation" shape="rect">Standoff Markup in HTML5</a></div> <div class="toc3">2.5.5 <a href="#usage-in-legacy-html" shape="rect">Version of HTML</a></div> </div> @@ -38,8 +38,7 @@ <div class="toc2">2.7 <a href="#mapping-conversion" shape="rect">Mapping and 
conversion</a><div class="toc3">2.7.1 <a href="#mapping-NIF" shape="rect">ITS and RDF/NIF</a></div> <div class="toc3">2.7.2 <a href="#mapping-XLIFF" shape="rect">ITS and XLIFF</a></div> </div> -<div class="toc2">2.8 <a href="#datacategories-summary" shape="rect">Summary: ITS 2.0 data categories</a></div> -<div class="toc2">2.9 <a href="#implementing-its20" shape="rect">Implementing ITS 2.0</a></div> +<div class="toc2">2.8 <a href="#implementing-its20" shape="rect">ITS 2.0 Implementations and Conformance</a></div> </div> <div class="toc1">3 <a href="#notation-terminology" shape="rect">Notation and Terminology</a><div class="toc2">3.1 <a href="#notation" shape="rect">Notation</a></div> <div class="toc2">3.2 <a href="#def-datacat" shape="rect">Data category</a></div> @@ -60,9 +59,9 @@ <div class="toc3">5.2.2 <a href="#selection-local" shape="rect">Local Selection in an XML Document</a></div> </div> <div class="toc2">5.3 <a href="#selectors" shape="rect">Query Language of Selectors</a><div class="toc3">5.3.1 <a href="#queryLanguage" shape="rect">Choosing Query Language</a></div> -<div class="toc3">5.3.2 <a href="#d0e2457" shape="rect">XPath 1.0</a></div> +<div class="toc3">5.3.2 <a href="#d0e2527" shape="rect">XPath 1.0</a></div> <div class="toc3">5.3.3 <a href="#css-selectors" shape="rect">CSS Selectors</a></div> -<div class="toc3">5.3.4 <a href="#d0e2702" shape="rect">Additional query languages</a></div> +<div class="toc3">5.3.4 <a href="#d0e2772" shape="rect">Additional query languages</a></div> <div class="toc3">5.3.5 <a href="#its-param" shape="rect">Variables in selectors</a></div> </div> <div class="toc2">5.4 <a href="#link-external-rules" shape="rect">Link to External Rules</a></div> @@ -375,7 +374,7 @@ <a href="http://www.w3.org/TR/2007/REC-its-20070403/#introduction" shape="rect">introduction</a> states: “ITS is a technology to easily create XML which is internationalized and can be localized effectively”. 
In order to make this tangible, ITS 1.0 provided examples for <a href="http://www.w3.org/TR/2007/REC-its-20070403/#users-usage" shape="rect">users and usages</a>. Implicitly, these examples carried the information that ITS covers two areas: one that is related to the static dimension of mono-lingual content, and one that is related to the dynamic dimension of multi-lingual production.</p><ul><li><p>Static mono-lingual (the area for example of content authors): This part of the content has the directionality “right-to-left”.</p></li><li><p>Dynamic multi-lingual: (the area for example of machine translation systems): This part of the content must not be translated.</p></li></ul><p>Although ITS 1.0 made no assumptions about possible phases in a multilingual production process chain, it was slanted towards a simple three-phase “write->internationalize->translate” model. Even a bird's-eye view of ITS 2.0 shows that ITS 2.0 explicitly targets a much more comprehensive model for multi-lingual content production. The model comprises support for multi-lingual content production phases such as:</p><ul><li><p>Internationalization</p></li><li><p>Pre-production (e.g. related to marking terminology)</p></li><li><p>Automated content enrichment (e.g. automatic hyperlinking for entities)</p></li><li><p>Extraction/filtering of translation-relevant content</p></li><li><p>Segmentation</p></li><li><p>Leveraging (e.g. of existing translation-related assets such as translation memories)</p></li><li><p>Machine Translation (e.g. geared towards a specific domain)</p></li><li><p>Quality assessment or control of source language or target language content</p></li><li><p>Generation of translation kits (e.g. 
packages based on XLIFF)</p></li><li><p>Post-production</p></li><li><p>Publishing</p></li></ul><p>The document <a title="Metadata for the Multilingual Web - Usage Scenarios and Implementations " href="#mlw-metadata-us-impl" shape="rect">[MLW US IMPL]</a> lists a large variety of usage scenarios for ITS 2.0. Most of them are composed of several of the aforementioned phases.</p><p>In a similar vein, ITS 2.0 takes a much more comprehensive view on the actors that may participate in a multi-lingual content production process. ITS 1.0 annotations (e.g. local markup for the <a href="#terminology" shape="rect">Terminology</a> data category) most of the time were conceived as being closely tied to human actors such as content authors or information architects. ITS 2.0 raises non-human actors such as word processors/editors, content management systems, machine translation systems, term candidate generators, entity identifiers/disambiguators to the same level. This change amongst others is reflected by the ITS 2.0 <a href="#its-tool-annotation" shape="rect">Tool Annotation</a> which allows systems to record that they have processed a certain part of content.</p></div><div class="div2"> <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="high-level-differences-between-1.0-and-2.0" id="high-level-differences-between-1.0-and-2.0" shape="rect"/>1.4 High-level differences between ITS 1.0 and ITS 2.0</h3><p>The differences between ITS 1.0 and ITS 2.0 can be summarized as follows.</p><p> <em>Coverage of <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>: </em>ITS 1.0 can be applied to XML content. ITS 2.0 extends the coverage to <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>. 
Explanatory details about ITS 2.0 and <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> are given in <a class="section-ref" href="#specific-HTML-support" shape="rect">Section 2.5: Specific HTML support</a>.</p><p> - <em>Addition of data categories</em>: ITS 2.0 provides additional data categories and modifies existing ones. A summary of all ITS 2.0 data categories is given in <a class="section-ref" href="#datacategories-summary" shape="rect">Section 2.8: Summary: ITS 2.0 data categories</a>.</p><p> + <em>Addition of data categories</em>: ITS 2.0 provides additional data categories and modifies existing ones. A summary of all ITS 2.0 data categories is given in <a class="section-ref" href="#basic-concepts-datacategories" shape="rect">Section 2.1: Data Categories</a>.</p><p> <em>Modification of data categories</em>:</p><ul><li><p id="ruby-in-its2">ITS 1.0 provided the <a href="http://www.w3.org/TR/2007/REC-its-20070403/#ruby-annotation" shape="rect">Ruby data category</a>. ITS 2.0 does not provide ruby because, at the time of writing, the <a href="http://www.w3.org/TR/html51/text-level-semantics.html#the-ruby-element" shape="rect">ruby model in HTML5</a> was still under development. Once these discussions are settled, the Ruby data category possibly will be re-introduced, in a subsequent version of ITS.</p></li><li><p>The <a href="#directionality" shape="rect">Directionality</a> data category reflects directionality markup in <a title="HTML 4.01" href="#html4" shape="rect">[HTML 4.01]</a>. The reason is that enhancements are being discussed in the context of HTML5 that are expected to change the approach to marking up directionality, in particular to support content whose directionality needs to be isolated from that of surrounding content. However, these enhancements are not finalized yet. 
They will be reflected in a future revision of ITS.</p></li></ul><p> <em>Additional or modified mechanisms:</em> The following mechanisms from ITS 1.0 have been modified or added to ITS 2.0.</p><ul><li><p id="query-language-on-rules-element">ITS 1.0 used only XPath as the mechanism for selecting nodes in <a href="#basic-concepts-selection-global" shape="rect">global rules</a>. ITS 2.0 allows for choosing the <a href="#selectors" shape="rect">query language of selectors</a>. The default is XPath 1.0. An ITS 2.0 processor is free to support other selection mechanisms, like CSS selectors or other versions of XPath.</p></li><li><p id="parameters-in-selector">In global rules it is now possible to set <a href="#its-param" shape="rect">variables for the selectors</a> (XPath expression). The <code class="its-elem-markup">param</code> element serves this purpose.</p></li><li><p>ITS 2.0 has an <a href="#its-tool-annotation" shape="rect">ITS Tools Annotation</a> mechanism to associate processor information with the use of individual data categories. See <a class="section-ref" href="#traceability" shape="rect">Section 2.6: Traceability</a> for details.</p></li></ul><p> <em>Mappings:</em> ITS 2.0 provides a normative algorithm to convert ITS 2.0 information into <a title="" href="#nif-reference" shape="rect">[NIF]</a> and links to guidance about how to relate ITS 2.0 to XLIFF. See <a class="section-ref" href="#mapping-conversion" shape="rect">Section 2.7: Mapping and conversion</a> for details.</p><p> @@ -385,7 +384,9 @@ <em>This section is informative.</em> </p><p>The purpose of this section is to provide basic knowledge about how ITS 2.0 “works”. Detailed knowledge (including formal definitions) is given in the subsequent sections.</p><div class="div2"> <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="basic-concepts-datacategories" id="basic-concepts-datacategories" shape="rect"/>2.1 Data Categories</h3><p>A key concept of ITS is the abstract notion of <a href="#def-datacat" shape="rect">data categories</a>. Data categories define the information that can be conveyed via ITS. An example is the <a href="#trans-datacat" shape="rect">Translate</a> data category. It conveys information about translatability of content.</p><p> - <a class="section-ref" href="#datacategory-description" shape="rect">Section 8: Description of Data Categories</a> defines data categories. It also describes their implementation, that is: ways to use them for example in an XML context. The motivation for separating data category definitions from their implementation is that only this way the reality can be reflected since data categories can be implemented</p><ul><li><p>In various types of content (XML in general or <a href="#specific-HTML-support" shape="rect">HTML</a>).</p></li><li><p>For a single piece of content, e.g. a <code>p</code> element. This is the so-called <a href="#basic-concepts-selection-local" shape="rect">local approach</a>.</p></li><li><p>for several pieces of content in one document or even a set of documents. This is the so-called <a href="#basic-concepts-selection-global" shape="rect">global approach</a>.</p></li><li><p>For a complete markup vocabulary. This is done by adding <a href="#its-schemas" shape="rect">ITS markup declarations</a> to the schema for the vocabulary.</p></li></ul></div><div class="div2"> + <a class="section-ref" href="#datacategory-description" shape="rect">Section 8: Description of Data Categories</a> defines data categories. It also describes their implementation, that is: ways to use them for example in an XML context. 
The motivation for separating data category definitions from their implementation is that only this way the reality can be reflected since data categories can be implemented</p><ul><li><p>In various types of content (XML in general or <a href="#specific-HTML-support" shape="rect">HTML</a>).</p></li><li><p>For a single piece of content, e.g. a <code>p</code> element. This is the so-called <a href="#basic-concepts-selection-local" shape="rect">local approach</a>.</p></li><li><p>for several pieces of content in one document or even a set of documents. This is the so-called <a href="#basic-concepts-selection-global" shape="rect">global approach</a>.</p></li><li><p>For a complete markup vocabulary. This is done by adding <a href="#its-schemas" shape="rect">ITS markup declarations</a> to the schema for the vocabulary.</p></li></ul><p>ITS 2.0 provides the following data categories, using most of the existing ITS 1.0 data categories and adding new ones. Modifications of existing ITS 1.0 data categories are summarized in <a class="section-ref" href="#high-level-differences-between-1.0-and-2.0" shape="rect">Section 1.4: High-level differences between ITS 1.0 and ITS 2.0</a>.</p><ul><li><p><a href="#trans-datacat" shape="rect">Translate</a>: express information about whether a selected piece of content should be translated or not.</p></li><li><p><a href="#locNote-datacat" shape="rect">Localization Note</a>: communicate notes to localizers about a particular item of content.</p></li><li><p><a href="#terminology" shape="rect">Terminology</a>: mark terms and optionally associate them with information, such as definitions or references to a term data base.</p></li><li><p><a href="#directionality" shape="rect">Directionality</a>: specify the base writing direction of blocks, embeddings and overrides for the Unicode bidirectional algorithm.</p></li><li><p><a href="#language-information" shape="rect">Language Information</a>: express the language of a given piece of 
content.</p></li><li><p><a href="#elements-within-text" shape="rect">Elements Within Text</a>: express how content of an element is related to the text flow (constitute its own segment like paragraphs, be part of a segment like emphasis marker etc.).</p></li><li><p><a href="#domain" shape="rect">Domain</a>: identify the topic or subject of the annotated content for translation-related applications.</p></li><li><p><a href="#textanalysis" shape="rect">Text Analysis</a>: annotate content with lexical or conceptual information (e.g. for the purpose of contextual disambiguation).</p></li><li><p><a href="#LocaleFilter" shape="rect">Locale Filter</a>: specify that a piece of content is only applicable to certain locales. </p></li><li><p><a href="#provenance" shape="rect">Provenance</a>: communicate the identity of agents that have been involved in processing content.</p></li><li><p><a href="#externalresource" shape="rect">External Resource</a>: indicate reference points in a resource outside the document that need to be considered during localization or translation. Examples of such resources are external images and audio or video files.</p></li><li><p><a href="#target-pointer" shape="rect">Target Pointer</a>: associate the markup node of a given source content (i.e. the content to be translated) and the markup node of its corresponding target content (i.e. the source content translated into a given target language). This is relevant for formats that hold the same content in different languages inside a single document.</p></li><li><p><a href="#idvalue" shape="rect">Id Value</a>: identify a value that can be used as unique identifier for a given part of the content. 
+ </p></li><li><p><a href="#preservespace" shape="rect">Preserve Space</a>: indicate how whitespace should be handled in content.</p></li><li><p><a href="#lqissue" shape="rect">Localization Quality Issue</a>: describe the nature and severity of an error detected during a language-oriented quality assurance (QA) process.</p></li><li><p><a href="#lqrating" shape="rect">Localization Quality Rating</a>: express an overall measurement of the localization quality of a document or an item in a document.</p></li><li><p><a href="#mtconfidence" shape="rect">MT Confidence</a>: indicate the confidence that MT systems provide about their translation. + </p></li><li><p><a href="#allowedchars" shape="rect">Allowed Characters</a>: specify the characters that are permitted in a given piece of content.</p></li><li><p><a href="#storagesize" shape="rect">Storage Size</a>: specify the maximum storage size of a given content.</p></li></ul></div><div class="div2"> <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="basic-concepts-selection" id="basic-concepts-selection" shape="rect"/>2.2 Selection</h3><p>Information (e.g. “translate this”) captured by an ITS data category always pertains to one or more XML or HTML nodes, primarily element and attribute nodes. In a sense, the relevant node(s) get “selected”. Selection may be explicit or implicit. ITS distinguishes two mechanisms for explicit selection: (1) local approach, and (2) global approach (via <code class="its-elem-markup">rules</code>). Both local and global approach can interact with each other, and with additional ITS dimensions such as inheritance and defaults.</p><p>The mechanisms defined for ITS selection resemble those defined in <a title="Cascading Style Sheets,
 level 2 revision 1 CSS 2.1 Specification" href="#css2-1" shape="rect">[CSS 21]</a>. The local approach can be compared to the <code>style</code> attribute in HTML/XHTML, and the global approach is similar to the <code>style</code> element in HTML/XHTML.</p><ul><li><p>the local approach puts ITS markup in the relevant element of the host vocabulary (e.g. the <code>author</code> element in DocBook)</p></li><li><p>the global, <a href="#selection-global" shape="rect">rule-based approach</a> puts the ITS markup in elements defined by ITS itself (namely the <code class="its-elem-markup">rules</code> element)</p></li></ul><p>ITS usually uses XPath in rules for identifying nodes although CSS Selectors and other query languages can in addition be implemented by applications.</p><p>ITS 2.0 can be used with XML documents (e.g. a DocBook article), HTML documents, or schemas (e.g. an XML Schema document for a proprietary document format).</p><p>The following two examples provide more details about the distinction between the local and global approach, using the <a href="#trans-datacat" shape="rect">Translate</a> data category as example.</p><div class="div3"> @@ -422,7 +423,7 @@ <strong class="hl-tag" style="color: #000096"></body></strong> <strong class="hl-tag" style="color: #000096"></myTopic></strong> -</pre></div><p>[Source file: <a href="examples/xml/EX-basic-concepts-2.xml" shape="rect">examples/xml/EX-basic-concepts-2.xml</a>]</p></div><p>For the global approach (and <a href="#EX-basic-concepts-2" shape="rect">Example 4</a>) to work, a schema developer may need to add a <code class="its-elem-markup">rules</code> element and associated markup to the schema. In some cases, global rules may be sufficient and other ITS markup (such as an <code class="its-attr-markup">translate</code> attribute on the elements and attributes) may not be needed in the schema. 
However, it is likely that authors may need the local approach from time to time to override the general rule.</p><p>For specification of the <a href="#trans-datacat" shape="rect">Translate</a> data category information, the contents of the <code class="its-elem-markup">translateRule</code> element would normally be designed by an information architect familiar with the document format and familiar with, or working with someone familiar with, the needs of the localization/translation group.</p><p>The global, rule-based approach has the following benefits:</p><ul><li><p>Content authors do not have to concern themselves with creating additional +</pre></div><p>[Source file: <a href="examples/xml/EX-basic-concepts-2.xml" shape="rect">examples/xml/EX-basic-concepts-2.xml</a>]</p></div><p>For the global approach (and <a href="#EX-basic-concepts-2" shape="rect">Example 4</a>) to work, a schema developer may need to add a <code class="its-elem-markup">rules</code> element and associated markup to the schema. In some cases, global rules may be sufficient and other ITS markup (such as an <code class="its-attr-markup">translate</code> attribute on the elements and attributes) may not be needed in the schema. However, it is likely that authors may need the local approach from time to time to override the general rule.</p><p>For specification of the <a href="#trans-datacat" shape="rect">Translate</a> data category information, the contents of the <code class="its-elem-markup">translateRule</code> element would normally be designed by an information architect familiar with the document format and familiar with, or working with someone familiar with, the needs of localization/translation.</p><p>The global, rule-based approach has the following benefits:</p><ul><li><p>Content authors do not have to concern themselves with creating additional markup or verifying that the markup was applied correctly. 
ITS data categories are associated with sets of nodes (for example all <code>p</code> elements in an XML instance)</p></li><li><p>Changes can be made in a single location, rather than by searching and modifying @@ -431,15 +432,7 @@ <code>term</code> element in DITA)</p></li></ul><p>The commonality in both examples above is the markup <code>translate='no'</code>. This piece of ITS markup can be interpreted as follows:</p><ul><li><p>it pertains to the <a href="#trans-datacat" shape="rect">Translate</a> data category </p></li><li><p>the attribute <code class="its-attr-markup">translate</code> holds a value of "no"</p></li></ul></div></div><div class="div2"> <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="basic-concepts-overinher" id="basic-concepts-overinher" shape="rect"/>2.3 Overriding, Inheritance and Defaults</h3><p>The power of the ITS selection mechanisms comes at a price: rules related to <a href="#selection-precedence" shape="rect">overriding/precedence</a>, and <a href="#datacategories-defaults-etc" shape="rect">inheritance</a>, have to be established.</p><p>The document in <a href="#EX-basic-concepts-3" shape="rect">Example 5</a> shows how inheritance - and overriding work for the <a href="#trans-datacat" shape="rect">Translate</a> data category. - By default elements are translatable. Here, the <code class="its-elem-markup">translateRule</code> element declared - in the header overrides the default for the <code>head</code> element inside - <code>text</code> and for all its children. Because the <code>title</code> element is - actually translatable, the global rule needs to be overridden by a local - <code>its:translate="yes"</code>. Note that the global rule is processed first, - regardless of its position inside the document. 
In the main body of the document, the - default applies, and here it is <code>its:translate="no"</code> that is used to set - “faux pas” as non-translatable.</p><div class="exampleOuter"><div class="exampleHeader"><a name="EX-basic-concepts-3" id="EX-basic-concepts-3" shape="rect"/>Example 5: Overriding and Inheritance</div><div class="exampleInner"><pre xml:space="preserve"><strong class="hl-tag" style="color: #000096"><text</strong> <span class="hl-attribute" style="color: #F5844C">xmlns:its</span>=<span class="hl-value" style="color: #993300">"http://www.w3.org/2005/11/its"</span><strong class="hl-tag" style="color: #000096">></strong> + and overriding work for the <a href="#trans-datacat" shape="rect">Translate</a> data category:</p><ul><li><p>The ITS default is that all elements are translatable.</p></li><li><p>The <code class="its-elem-markup">translateRule</code> element declared in the header overrides the default for the <code>head</code> element inside text and for all its children.</p></li><li><p>Because the <code>title</code> element is actually translatable, the global rule needs to be overridden by a local <code>its:translate="yes"</code>.</p></li><li><p>In the body of the document the default applies, and <code>its:translate="no"</code> is used to set "faux pas" as non-translatable.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-basic-concepts-3" id="EX-basic-concepts-3" shape="rect"/>Example 5: Overriding and Inheritance</div><div class="exampleInner"><pre xml:space="preserve"><strong class="hl-tag" style="color: #000096"><text</strong> <span class="hl-attribute" style="color: #F5844C">xmlns:its</span>=<span class="hl-value" style="color: #993300">"http://www.w3.org/2005/11/its"</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><head></strong> <strong class="hl-tag" style="color: #000096"><revision></strong>Sep-10-2006 v5<strong class="hl-tag" style="color: 
#000096"></revision></strong> <strong class="hl-tag" style="color: #000096"><author></strong>Ealasaidh McIan<strong class="hl-tag" style="color: #000096"></author></strong> @@ -458,20 +451,18 @@ <strong class="hl-tag" style="color: #000096"></div></strong> <strong class="hl-tag" style="color: #000096"></body></strong> <strong class="hl-tag" style="color: #000096"></text></strong> -</pre></div><p>[Source file: <a href="examples/xml/EX-basic-concepts-3.xml" shape="rect">examples/xml/EX-basic-concepts-3.xml</a>]</p></div><p>For XML content, <a href="#datacategories-overview" shape="rect">data category specific defaults</a> are provided. These are independent of the actual XML markup vocabulary. For <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>, several HTML5 elements and attributes map exactly to ITS 2.0 data categories. Hence, that HTML markup is normatively interpreted as ITS 2.0 data category information: See <a class="section-ref" href="#html5-existing-markup-versus-its" shape="rect">Section 2.5.3: Relation between HTML markup and ITS 2.0 data categories</a> for more information.</p></div><div class="div2"> -<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="basic-concepts-addingpointing" id="basic-concepts-addingpointing" shape="rect"/>2.4 Adding Information or Pointing to Existing Information</h3><p>For some data categories, special attributes add or point to information about the - selected nodes. For example, the <a href="#locNote-datacat" shape="rect">Localization Note</a> +</pre></div><p>[Source file: <a href="examples/xml/EX-basic-concepts-3.xml" shape="rect">examples/xml/EX-basic-concepts-3.xml</a>]</p></div><p>For XML content, <a href="#datacategories-overview" shape="rect">data category specific defaults</a> are provided. These are independent of the actual XML markup vocabulary. 
Example for the <a href="#trans-datacat" shape="rect">Translate</a> data category: <code>translate="yes"</code> for elements, and <code>translate="no"</code> for attributes.</p><p>For <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>, several HTML5 elements and attributes map exactly to ITS 2.0 data categories. Hence, that HTML markup is normatively interpreted as ITS 2.0 data category information (see <a class="section-ref" href="#html5-existing-markup-versus-its" shape="rect">Section 2.5.3: HTML markup with ITS 2.0 counterparts</a> for more information).</p></div><div class="div2"> +<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="basic-concepts-addingpointing" id="basic-concepts-addingpointing" shape="rect"/>2.4 Adding Information or Pointing to Existing Information</h3><p>Data categories can add information or point to information for the selected nodes. For example, the <a href="#locNote-datacat" shape="rect">Localization Note</a> data category can add information to selected nodes (using a <code class="its-elem-markup">locNote</code> element), or point to existing information elsewhere in the document (using a - <code class="its-attr-markup">locNotePointer</code> attribute).</p><p>The <a href="#datacategories-overview" shape="rect">data category overview table</a>, in <a class="section-ref" href="#datacategories-defaults-etc" shape="rect">Section 8.1: Position, Defaults, Inheritance and Overriding of Data Categories</a>, provides an overview of what - data categories allow to point to existing information or to add information.</p><p>The functionalities of adding information and pointing to existing information are - <em>mutually exclusive</em>. 
That is to say, attributes for pointing and adding + <code class="its-attr-markup">locNotePointer</code> attribute).</p><p>The <a href="#datacategories-overview" shape="rect">data category overview table</a>, in <a class="section-ref" href="#datacategories-defaults-etc" shape="rect">Section 8.1: Position, Defaults, Inheritance and Overriding of Data Categories</a>, provides an overview of which + data categories allow to add information, and which allow to point to existing information.</p><p>Adding information and pointing to existing information are + <em>mutually exclusive</em>: attributes for adding information and attributes for pointing to the same information must not appear at the same rule element.</p></div><div class="div2"> -<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="specific-HTML-support" id="specific-HTML-support" shape="rect"/>2.5 Specific HTML support</h3><p>For applying ITS 2.0 data categories to HTML, five aspects must be considered:</p><ol class="depth1"><li><p>referencing global rules</p></li><li><p>specifities of inserting local ITS 2.0 data categories</p></li><li><p>relationship between HTML markup and data categories,</p></li><li><p>standoff markup in HTML5</p></li><li><p>HTML version.</p></li></ol><p>In the following sections these aspects are briefly discussed.</p><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="html5-reference-global-rules" id="html5-reference-global-rules" shape="rect"/>2.5.1 Referencing global rules</h4><p>To account for the so-called “<a href="#basic-concepts-selection-global" shape="rect">global - approach</a>” in HTML, this specification (see <a class="section-ref" href="#html5-global-rules" shape="rect">Section 6.2: Global rules</a>) defines a link type for referring to external files - with global rules and an approach to have inline global rules in the HTML <code>script</code> element. - It is preferred to use external global rules linked via the <code>link</code> element than to have inline global rules in the HTML document.</p><div class="exampleOuter"><div class="exampleHeader"><a name="EX-translate-html5-global-1" id="EX-translate-html5-global-1" shape="rect"/>Example 6: Using ITS global rules in HTML</div><p>The <code>link</code> element points to the rules file +<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="specific-HTML-support" id="specific-HTML-support" shape="rect"/>2.5 Specific HTML support</h3><p>For applying ITS 2.0 data categories to HTML, five aspects must be considered:</p><ol class="depth1"><li><p>global approach</p></li><li><p>local approach</p></li><li><p>HTML markup with ITS 2.0 counterparts</p></li><li><p>standoff markup in HTML5</p></li><li><p>HTML version</p></li></ol><p>In the following sections these aspects are briefly discussed.</p><div class="div3"> +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="html5-global-approach" id="html5-global-approach" shape="rect"/>2.5.1 Global approach in HTML5</h4><p>To account for the so-called <a href="#basic-concepts-selection-global" shape="rect">global + approach</a> in HTML, this specification (see <a class="section-ref" href="#html5-global-rules" shape="rect">Section 6.2: Global rules</a>) defines + </p><ul><li><p>a link type for referring to external files with global rules from a <code>link</code> element</p></li><li><p>an approach to have inline global rules in the HTML <code>script</code> element.</p></li></ul><p>It is preferred to use external global rules linked via the <code>link</code> element than to have inline global rules in the HTML document.</p><div class="exampleOuter"><div class="exampleHeader"><a name="EX-translate-html5-global-1" id="EX-translate-html5-global-1" shape="rect"/>Example 6: Using ITS global rules in HTML</div><p>The <code>link</code> element points to the rules file <code>EX-translateRule-html5-1.xml</code> The <code>rel</code> attribute identifies the ITS specific link relation <code>its-rules</code>.</p><div class="exampleInner"><pre xml:space="preserve"><strong class="hl-tag" style="color: blue"><!DOCTYPE html></strong> <strong class="hl-tag" style="color: #000096"><html></strong> @@ -490,32 +481,31 @@ <strong class="hl-tag" style="color: #000096"><its:translateRule</strong> <span class="hl-attribute" style="color: #F5844C">translate</span>=<span class="hl-value" style="color: #993300">"no"</span> <span class="hl-attribute" style="color: #F5844C">selector</span>=<span class="hl-value" style="color: #993300">"//h:code"</span><strong class="hl-tag" style="color: #000096">/></strong> <strong class="hl-tag" style="color: #000096"></its:rules></strong> </pre></div><p>[Source file: <a href="examples/html5/EX-translateRule-html5-1.xml" shape="rect">examples/html5/EX-translateRule-html5-1.xml</a>]</p></div></div><div class="div3"> -<h4><a 
href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-its-local-markup" id="html5-its-local-markup" shape="rect"/>2.5.2 Specifities of inserting local ITS 2.0 data categories</h4><p>In HTML, an ITS 2.0 local data category is realized with the specific prefix <code>its-*</code>. +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-its-local-markup" id="html5-its-local-markup" shape="rect"/>2.5.2 Local approach</h4><p>In HTML, an ITS 2.0 local data category is realized with the prefix <code>its-*</code>. The general mapping of the XML based ITS 2.0 attributes to their HTML <code>its-*</code> counterparts is defined in <a class="section-ref" href="#html5-local-attributes" shape="rect">Section 6.1: Mapping of Local Data Categories to HTML</a>. An informative table in <a class="section-ref" href="#list-of-elements-and-attributes" shape="rect">Appendix G: List of ITS 2.0 Global Elements and Local Attributes</a> provides an overview of the mapping for all data categories.</p></div><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-existing-markup-versus-its" id="html5-existing-markup-versus-its" shape="rect"/>2.5.3 Relation between HTML markup and ITS 2.0 data categories</h4><p>There are four ITS 2.0 data categories, which have direct counterparts - in HTML markup. 
For theses data categories, ITS 2.0 defines the following specific - behaviour:</p><ul><li><p>The <a href="#language-information" shape="rect">Language Information</a> data category has the HTML <code>lang</code> - attribute counterpart; in XHTML this is the <code>xml:lang</code> attribute. These attributes act as +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-existing-markup-versus-its" id="html5-existing-markup-versus-its" shape="rect"/>2.5.3 HTML markup with ITS 2.0 counterparts</h4><p>There are four ITS 2.0 data categories, which have counterparts in HTML markup. Put differently: native HTML markup reveals information for some ITS 2.0 data categories. For these data categories, ITS 2.0 defines the following:</p><ul><li><p>The <a href="#language-information" shape="rect">Language Information</a> data category has the HTML <code>lang</code> + attribute as counterpart. In XHTML the counterpart is the <code>xml:lang</code> attribute. These attributes act as local markup for the <a href="#language-information" shape="rect">Language Information</a> data category in HTML and - take <a href="#selection-precedence" shape="rect">precedence</a> over language information conveyed via a global <code class="its-elem-markup">langRule</code>.</p></li><li><p>The <a href="#idvalue" shape="rect">Id Value</a> data category has the HTML or XHTML <code>id</code> attribute. 
- This attribute acts as local markup for the <a href="#idvalue" shape="rect">Id Value</a> data category in HTML and take <a href="#selection-precedence" shape="rect">precedence</a> over - id information conveyed via a global <code class="its-elem-markup">idValueRule</code>.</p></li><li><p>The <a href="#elements-within-text" shape="rect">Elements within Text</a> data category has a set of HTML - elements defined as <a href="http://www.w3.org/TR/html51/dom.html#phrasing-content-1" shape="rect">phrasing content</a>. In the absence of an + take <a href="#selection-precedence" shape="rect">precedence</a> over language information conveyed via a global <code class="its-elem-markup">langRule</code>.</p></li><li><p>The <a href="#idvalue" shape="rect">Id Value</a> data category has the HTML or XHTML <code>id</code> attribute as counterpart. + This attribute acts as local markup for the <a href="#idvalue" shape="rect">Id Value</a> data category in HTML and takes <a href="#selection-precedence" shape="rect">precedence</a> over + identifier information conveyed via a global <code class="its-elem-markup">idValueRule</code>.</p></li><li><p>The <a href="#elements-within-text" shape="rect">Elements within Text</a> data category has a set of HTML + elements (the so-called <a href="http://www.w3.org/TR/html51/dom.html#phrasing-content-1" shape="rect">phrasing content</a>) as counterpart. 
+ In the absence of an <a href="#elements-within-text" shape="rect">Elements within Text</a> local attribute or global rules selecting the - element in question, these elements are always interpreted as - <code>withinText="yes"</code> by default, except for the elements <code class="its-elem-markup">iframe</code>, <code class="its-elem-markup">noscript</code>, <code class="its-elem-markup">script</code> - and <code class="its-elem-markup">textarea</code> which are interpreted as <code>withinText="nested"</code>.</p></li><li><p>The <a href="#trans-datacat" shape="rect">Translate</a> data category has a direct counterpart in - <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>, namely the HTML5 - <code>translate</code> attribute. ITS 2.0 does not define its own behaviour for HTML5 <code>translate</code>, but just refers to <a href="http://www.w3.org/TR/html51/dom.html#the-translate-attribute" shape="rect">the HTML5 definition</a>. The <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> definition also applies to nodes selected via global rules. That is, a <code class="its-elem-markup">translateRule</code> like <code><its:translateRule selector=""//h:img" translate="yes"/></code> will set the <code>img</code> element and its translatable attributes like <code>alt</code> to "yes".</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-its-and-existing-HTML5-markup" id="EX-its-and-existing-HTML5-markup" shape="rect"/>Example 8: The <a href="#language-information" shape="rect">Language Information</a>, <a href="#idvalue" shape="rect">Id Value</a>, + element in question, most of the phrasing content elements are interpreted as + <code>withinText="yes"</code> by default. 
The phrasing content elements <code class="its-elem-markup">iframe</code>, <code class="its-elem-markup">noscript</code>, <code class="its-elem-markup">script</code> + and <code class="its-elem-markup">textarea</code> are interpreted as <code>withinText="nested"</code>.</p></li><li><p>The <a href="#trans-datacat" shape="rect">Translate</a> data category has a direct counterpart in + <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>, namely the <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> + <code>translate</code> attribute. ITS 2.0 does not define its own behavior for <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> + <code>translate</code>, but just refers to <a href="http://www.w3.org/TR/html51/dom.html#the-translate-attribute" shape="rect">the HTML5 definition</a>. That definition also applies to nodes selected via global rules. That is, a <code class="its-elem-markup">translateRule</code> like <code><its:translateRule selector="//h:img" translate="yes"/></code> will set the <code>img</code> element and its translatable attributes like <code>alt</code> to "yes".</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-its-and-existing-HTML5-markup" id="EX-its-and-existing-HTML5-markup" shape="rect"/>Example 8: The <a href="#language-information" shape="rect">Language Information</a>, <a href="#idvalue" shape="rect">Id Value</a>, <a href="#elements-within-text" shape="rect">Elements within Text</a> and <a href="#trans-datacat" shape="rect">Translate</a> - ITS 2.0 data categories used with - HTML native markup.</div><p>The <code>html</code> element is interpreted to convey the + ITS 2.0 data categories expressed by native HTML markup.</div><p>The <code>lang</code> attribute of the <code>html</code> element conveys the <a href="#language-information" shape="rect">Language Information</a> value "en". - The <code>p</code> element is interpreted to - convey the <a href="#idvalue" shape="rect">Id Value</a> of "p1".
The elements <code>em</code> and <code>img</code> are interpreted to be <code>withinText="yes"</code>. The <code>p</code> element and its children is set to be non-translatable via an <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> - <code>translate</code> attribute. Here the <code>alt</code> attribute, normally translatable by default, will also be non-translatable.</p><div class="exampleInner"><pre xml:space="preserve"><strong class="hl-tag" style="color: blue"><!DOCTYPE html></strong> + The <code>id</code> attribute of the <code>p</code> element conveys the <a href="#idvalue" shape="rect">Id Value</a> + "p1". The elements <code>em</code> and <code>img</code> are interpreted to be <code>withinText="yes"</code>. The <code>p</code> element and its children are set to be non-translatable via an <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> + <code>translate</code> attribute. Via inheritance, the <code>alt</code> attribute, normally translatable by default, also is non-translatable.</p><div class="exampleInner"><pre xml:space="preserve"><strong class="hl-tag" style="color: blue"><!DOCTYPE html></strong> <strong class="hl-tag" style="color: #000096"><html</strong> <span class="hl-attribute" style="color: #F5844C">lang</span>=<span class="hl-value" style="color: #993300">en</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><head></strong> <strong class="hl-tag" style="color: #000096"><meta</strong> <span class="hl-attribute" style="color: #F5844C">charset</span>=<span class="hl-value" style="color: #993300">utf-8</span><strong class="hl-tag" style="color: #000096">></strong> @@ -525,29 +515,20 @@ <strong class="hl-tag" style="color: #000096"><p</strong> <span class="hl-attribute" style="color: #F5844C">id</span>=<span class="hl-value" style="color: #993300">"p1"</span> <span class="hl-attribute" style="color: #F5844C">translate</span>=<span class="hl-value" style="color: 
#993300">"no"</span><strong class="hl-tag" style="color: #000096">></strong>This is a <strong class="hl-tag" style="color: #000096"><em></strong>motherboard<strong class="hl-tag" style="color: #000096"></em></strong> and image: <strong class="hl-tag" style="color: #000096"><img</strong> <span class="hl-attribute" style="color: #F5844C">src</span>=<span class="hl-value" style="color: #993300">"http://example.com/myimg.png"</span> <span class="hl-attribute" style="color: #F5844C">alt</span>=<span class="hl-value" style="color: #993300">"My image"</span><strong class="hl-tag" style="color: #000096">/></strong>.<strong class="hl-tag" style="color: #000096"></p></strong> <strong class="hl-tag" style="color: #000096"></body></strong> -<strong class="hl-tag" style="color: #000096"></html></strong></pre></div><p>[Source file: <a href="examples/html5/EX-its-and-existing-HTML5-markup.html" shape="rect">examples/html5/EX-its-and-existing-HTML5-markup.html</a>]</p></div><p>There are also some HTML markup elements that have similar, but not always identical, roles and behaviour than certain ITS 2.0 data categories. - For example, the HTML <code>dfn</code> element - could be used to identify a term in the sense of the <a href="#terminology" shape="rect">Terminology</a> data - category. However, this is not always the case and it depends on the - intentions of the content author. To accomodate this situation, users - of ITS 2.0 are encouraged to specifiy the association of existing HTML - markup with a dedicated global rules file. For an example rules file see the - <a href="http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/#relating-its-plus-xhtml" shape="rect">XML I18N Best Practices</a> document.</p></div><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="html5-standoff-markup-explanation" id="html5-standoff-markup-explanation" shape="rect"/>2.5.4 Standoff Markup in HTML5</h4><p>The <a href="#provenance" shape="rect">Provenance</a> and the <a href="#lqissue" shape="rect">Localization Quality Issue</a> data categories allow for using standoff markup. In HTML such standoff markup is put into a <code>script</code> element. The constraints for <a href="#provenance-records-in-html5-constraint" shape="rect">Provenance standoff</a> markup in HTML and <a href="#loc-quality-issues-in-html5-constraint" shape="rect">Localization quality issue</a> markup in HTML need to be taken into account. Examples of standoff markup in HTML for the two data categories are <a href="#EX-provenance-html5-local-2" shape="rect">Example 61</a> and <a href="#EX-locQualityIssue-html5-local-2" shape="rect">Example 76</a>.</p></div><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="usage-in-legacy-html" id="usage-in-legacy-html" shape="rect"/>2.5.5 Version of HTML</h4><p>ITS 2.0 does not define how to use ITS in HTML versions prior version 5. Users are - encouraged to migrate their content to HTML5 or XHTML. While it is possible to use +<strong class="hl-tag" style="color: #000096"></html></strong></pre></div><p>[Source file: <a href="examples/html5/EX-its-and-existing-HTML5-markup.html" shape="rect">examples/html5/EX-its-and-existing-HTML5-markup.html</a>]</p></div><p>There are also some HTML markup elements that have or can have roles and behavior similar, but not necessarily identical, to those of certain ITS 2.0 data categories. For example, the HTML <code>dfn</code> element could be used to identify a term in the sense of the <a href="#terminology" shape="rect">Terminology</a> data category.
However, this is not always the case and it depends on the intentions of the HTML content author. To accommodate this situation, users of ITS 2.0 are encouraged to specify the semantics of existing HTML markup in an ITS 2.0 context with a dedicated global rules file. Example: use a rule to define that the HTML <code>dfn</code> element has the semantics of ITS <code>term="yes"</code>. For additional examples see the <a href="http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/#relating-its-plus-xhtml" shape="rect">XML I18N Best Practices</a> document.</p></div><div class="div3"> +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-standoff-markup-explanation" id="html5-standoff-markup-explanation" shape="rect"/>2.5.4 Standoff Markup in HTML5</h4><p>The <a href="#provenance" shape="rect">Provenance</a> and the <a href="#lqissue" shape="rect">Localization Quality Issue</a> data categories allow for using so-called standoff markup (see the XML <a href="#EX-provenance-global-1" shape="rect">Example 58</a>). In HTML such standoff markup is placed into a <code>script</code> element. If this is done, the constraints for <a href="#provenance-records-in-html5-constraint" shape="rect">Provenance standoff</a> markup in HTML and <a href="#loc-quality-issues-in-html5-constraint" shape="rect">Localization quality issue</a> markup in HTML need to be taken into account. Examples of standoff markup in HTML for the two data categories are <a href="#EX-provenance-html5-local-2" shape="rect">Example 61</a> and <a href="#EX-locQualityIssue-html5-local-2" shape="rect">Example 76</a>.</p></div><div class="div3"> +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents."
alt="Go to the table of contents."/></a><a name="usage-in-legacy-html" id="usage-in-legacy-html" shape="rect"/>2.5.5 Version of HTML</h4><p>ITS 2.0 does not define how to use ITS in HTML versions prior to version 5. Users are + thus encouraged to migrate their content to <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> or XHTML. While it is possible to use <code>its-*</code> attributes introduced for <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> in older versions of HTML (such as 3.2 or 4.01) and pages using these attributes will work without any problems, - <code>its-*</code> attributes will be marked as invalid in validators.</p></div></div><div class="div2"> -<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="traceability" id="traceability" shape="rect"/>2.6 Traceability</h3><p>The <a href="#its-tool-annotation" shape="rect">ITS Tools Annotation</a> mechanism allows to associate processor information with the use of individual data categories in a document, independently from data category annotations themselves. The mechanism associates identifiers for tools and data categories via the <code class="its-attr-markup">annotatorsRef</code> attribute (or <code class="its-attr-markup">annotators-ref</code> in <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>) and is mandatory for the <a href="#mtconfidence" shape="rect">MT Confidence</a> data category. For the <a href="#terminology" shape="rect">Terminology</a> and <a href="#textanalysis" shape="rect">Text Analysis</a> data categories it is mandatory if they provide confience information, that is always tool related. Nevertheless, <a href="#its-tool-annotation" shape="rect">ITS Tools Annotation</a> can be used for all data categories. 
<a href="#EX-its-tool-annotation-2" shape="rect">Example 23</a> demonstrates the usage including several data categories.</p></div><div class="div2"> + <code>its-*</code> attributes will be marked as invalid by validators.</p></div></div><div class="div2"> +<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="traceability" id="traceability" shape="rect"/>2.6 Traceability</h3><p>The <a href="#its-tool-annotation" shape="rect">ITS Tools Annotation</a> mechanism allows associating processor information with individual data categories in a document, independently from data category annotations themselves (e.g. the Entity Type related to Text Analysis). The mechanism associates identifiers for tools with data categories via the <code class="its-attr-markup">annotatorsRef</code> attribute (or <a href="" shape="rect">annotators-ref</a> in <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>) and is mandatory for the <a href="#mtconfidence" shape="rect">MT Confidence</a> data category. For the <a href="#terminology" shape="rect">Terminology</a> and <a href="#textanalysis" shape="rect">Text Analysis</a> data categories the ITS Tools Annotation is mandatory if the data categories provide confidence information. Nevertheless, <a href="#its-tool-annotation" shape="rect">ITS Tools Annotation</a> can be used for all data categories. <a href="#EX-its-tool-annotation-2" shape="rect">Example 23</a> demonstrates the usage in the context of several data categories. + </p></div><div class="div2"> <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents."
alt="Go to the table of contents."/></a><a name="mapping-conversion" id="mapping-conversion" shape="rect"/>2.7 Mapping and conversion</h3><div class="div3"> <h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="mapping-NIF" id="mapping-NIF" shape="rect"/>2.7.1 ITS and RDF/NIF</h4><p>ITS 2.0 defines an algorithm to convert XML or HTML documents (or their DOM - representations) that contain ITS metadata to the RDF-based format based on <a title="" href="#nif-reference" shape="rect">[NIF]</a>. NIF is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations.</p><p>The conversion <a href="#conversion-to-nif" shape="rect">ITS 2.0 to NIF</a> results in RDF triples that represent the textual content of the original document as RDF typed information and the ITS annotation as properties of those nodes defined in an <a href="http://www.w3.org/2005/11/its/rdf#" shape="rect">ITS RDF vocabulary</a>.</p><p>The backconversion <a href="#nif-backconversion" shape="rect">NIF to ITS 2.0</a> is defined informatively; it exemplifies a roundtripping involving automatic enrichment of HTML documents with linked information.</p></div><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="mapping-XLIFF" id="mapping-XLIFF" shape="rect"/>2.7.2 ITS and XLIFF</h4><p>The XML Localization Interchange File Format <a title="XLIFF Version 1.2" href="#xliff1.2" shape="rect">[XLIFF 1.2]</a> is an OASIS standard that enables translatable source text and its translation to be passed between different tools within localisation and translation workflows. 
<a title="XLIFF Version 2.0" href="#xliff2.0" shape="rect">[XLIFF 2.0]</a> is the successor of <a title="XLIFF Version 1.2" href="#xliff1.2" shape="rect">[XLIFF 1.2]</a> and under development. <a title="XLIFF Version 1.2" href="#xliff1.2" shape="rect">[XLIFF 1.2]</a> has been widely implemented in translation management systems, computer supported translation tools and in utilities for extracting translatable content from source documents. The mapping between ITS andXLIFF therefore unpins several important ITS 2.0 usage scenarios <a title="Metadata for the Multilingual Web - Usage Scenarios and Implementations " href="#mlw-metadata-us-impl" shape="rect">[MLW US IMPL]</a>. These usage scenarios involve: 1) the extraction of ITS meta-data from a source language file into XLIFF; 2) the addition of ITS meta-data into an XLIFF file by translation tools; and 3) the mapping of ITS meta-data in an XLIFF file into ITS meta-data in the resulting target language files. ITS 2.0 has no normative dependency on XLIFF, however a <a href="http://www.w3.org/International/its/wiki/XLIFF_Mapping" shape="rect">non-normative definition of how to represent ITS 2.0 data categories in XLIFF 1.2 or XLIFF 2.0</a> is being defined within the <a href="http://www.w3.org/International/its/ig/" shape="rect">Internationalization Tag Set Interest Group</a>.</p></div></div><div class="div2"> -<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="datacategories-summary" id="datacategories-summary" shape="rect"/>2.8 Summary: ITS 2.0 data categories</h3><p>ITS 2.0 provides the following data categories, using most of the existing ITS 1.0 data categories and adding new ones. 
Modifications of existing ITS 1.0 data categories are summarized in <a class="section-ref" href="#high-level-differences-between-1.0-and-2.0" shape="rect">Section 1.4: High-level differences between ITS 1.0 and ITS 2.0</a>.</p><ul><li><p><a href="#trans-datacat" shape="rect">Translate</a>: express information about whether a selected piece of content should be translated or not.</p></li><li><p><a href="#locNote-datacat" shape="rect">Localization Note</a>: communicate notes to localizers about a particular item of content.</p></li><li><p><a href="#terminology" shape="rect">Terminology</a>: mark terms and optionally associate them with information, such as definitions or references to a term data base.</p></li><li><p><a href="#directionality" shape="rect">Directionality</a>: specify the base writing direction of blocks, embeddings and overrides for the Unicode bidirectional algorithm.</p></li><li><p><a href="#language-information" shape="rect">Language Information</a>: express the language of a given piece of content.</p></li><li><p><a href="#elements-within-text" shape="rect">Elements Within Text</a>: express how content of an element is related to the text flow (constitute its own segment like paragraphs, be part of a segment like emphasis markers, etc.).</p></li><li><p><a href="#domain" shape="rect">Domain</a>: identify the topic or subject of the annotated content for translation related applications.</p></li><li><p><a href="#textanalysis" shape="rect">Text Analysis</a>: annotate content with lexical or conceptual information for the purpose of contextual disambiguation.</p></li><li><p><a href="#LocaleFilter" shape="rect">Locale Filter</a>: specify that a piece of content is only applicable to certain locales.
</p></li><li><p><a href="#provenance" shape="rect">Provenance</a>: communicate the identity of agents that have been involved in the translation of the content or the revision of the translated content.</p></li><li><p><a href="#externalresource" shape="rect">External Resource</a>: indicate that a reference points to potentially translatable data in a resource outside the document. Examples of such resources are external images and audio or video files.</p></li><li><p><a href="#target-pointer" shape="rect">Target Pointer</a>: associate a given piece of source content (i.e. the content to be translated) and its corresponding target content (i.e. the source content translated into a given target language).</p></li><li><p><a href="#idvalue" shape="rect">Id Value</a>: identify a value that can be used as unique identifier for a given part of the content. - </p></li><li><p><a href="#preservespace" shape="rect">Preserve Space</a>: indicate how whitespace should be handled in content.</p></li><li><p><a href="#lqissue" shape="rect">Localization Quality Issue</a>: describe the nature and severity of an error detected during a language-oriented quality assurance (QA) process.</p></li><li><p><a href="#lqrating" shape="rect">Localization Quality Rating</a>: express an overall measurement of the localization quality of a document or an item in a document.</p></li><li><p><a href="#mtconfidence" shape="rect">MT Confidence</a>: indicate the confidence that MT systems provide about their translation. - </p></li><li><p><a href="#allowedchars" shape="rect">Allowed Characters</a>: specify the characters that are permitted in a given piece of content.</p></li><li><p><a href="#storagesize" shape="rect">Storage Size</a>: specify the maximum storage size of a given content.</p></li></ul></div><div class="div2"> -<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="implementing-its20" id="implementing-its20" shape="rect"/>2.9 Implementing ITS 2.0</h3><p>What does it mean to implement ITS 2.0? This specification provides several conformance clauses as the normative answer, see <a class="section-ref" href="#conformance" shape="rect">Section 4: Conformance</a>, targeted at different types of implementers.</p><ul><li><p>Conformance clauses in <a class="section-ref" href="#conformance-product-schema" shape="rect">Section 4.1: Conformance Type 1: ITS Markup Declarations</a> tell markup vocabulary developers how to add ITS 2.0 markup declarations to their schemas.</p></li><li><p>Conformance clauses in <a class="section-ref" href="#conformance-product-processing-expectations" shape="rect">Section 4.2: Conformance Type 2: The Processing Expectations for ITS Markup</a> tell implementers how to process XML content applying ITS 2.0 data categories.</p></li><li><p>Conformance clauses in <a class="section-ref" href="#conformance-product-html-processing-expectations" shape="rect">Section 4.3: Conformance Type 3: Processing Expectations for ITS Markup in HTML</a> tell implementers how to process <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> content.</p></li><li><p>Conformance clauses in <a class="section-ref" href="#conformance-class-html5-its" shape="rect">Section 4.4: Conformance Class for HTML5+ITS documents</a> tell implementers how ITS 2.0 markup is integrated into <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>.</p></li></ul><p>The conformance clauses in <a class="section-ref" href="#conformance-product-processing-expectations" shape="rect">Section 4.2: Conformance Type 2: The Processing Expectations for ITS Markup</a> and <a class="section-ref" href="#conformance-product-html-processing-expectations" shape="rect">Section 4.3: Conformance Type 3: Processing Expectations for ITS Markup in HTML</a> make clear: what information needs to be made available for given pieces of
markup when processing a dedicated ITS 2.0 data category? To allow for flexibility, an implementation can choose whether it wants to process only ITS 2.0 global or local information, or XML or HTML content. These choices are reflected in separate conformance clauses and also in the <a href="@@@@" shape="rect">ITS 2.0 test suite</a>.</p><p>ITS 2.0 processing expectations cover only the above aspect, that is: what information needs to be made available. They do not define how that information actually should be applied. This is due to the fact that there is a huge variety of usage scenarios of ITS 2.0, and a huge variety of tools for working with ITS 2.0. Each of these tools has its own way of using ITS 2.0 data categories. See <a title="Metadata for the Multilingual Web - Usage Scenarios and Implementations " href="#mlw-metadata-us-impl" shape="rect">[MLW US IMPL]</a> for more information.</p><span class="editor-note">[Ed. note: Add link to test suite]</span></div></div><div class="div1"> + representations) that contain ITS metadata to the RDF-based format based on <a title="" href="#nif-reference" shape="rect">[NIF]</a>. NIF is an RDF/OWL-based format that aims at interoperability between Natural Language Processing (NLP) tools, language resources and annotations.</p><p>The conversion <a href="#conversion-to-nif" shape="rect">ITS 2.0 to NIF</a> results in RDF triples. These triples represent the textual content of the original document as RDF typed information. The ITS annotation is represented as properties of content related triples and relies on an <a href="http://www.w3.org/2005/11/its/rdf#" shape="rect">ITS RDF vocabulary</a>.</p><p>The back conversion from <a href="#nif-backconversion" shape="rect">NIF to ITS 2.0</a> is defined informatively.
One motivation for the back conversion is a round-tripping workflow such as: 1) conversion to NIF, 2) detection of named entities in the NIF representation using NLP tools, 3) back conversion to HTML and generation of <a href="#textanalysis" shape="rect">Text Analysis</a> markup. The outcome is a set of HTML documents with linked information, see <a href="#EX-text-analysis-html5-local-1" shape="rect">Example 52</a>.</p></div><div class="div3"> +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="mapping-XLIFF" id="mapping-XLIFF" shape="rect"/>2.7.2 ITS and XLIFF</h4><p>The XML Localization Interchange File Format <a title="XLIFF Version 1.2" href="#xliff1.2" shape="rect">[XLIFF 1.2]</a> is an OASIS standard that enables translatable source text and its translation to be passed between different tools within localization and translation workflows. <a title="XLIFF Version 2.0" href="#xliff2.0" shape="rect">[XLIFF 2.0]</a> is the successor of <a title="XLIFF Version 1.2" href="#xliff1.2" shape="rect">[XLIFF 1.2]</a> and is under development. XLIFF has been widely implemented in various translation management systems, computer supported translation tools and in utilities for extracting translatable content from source documents.</p><p>The mapping between ITS and XLIFF therefore underpins several important ITS 2.0 usage scenarios <a title="Metadata for the Multilingual Web - Usage Scenarios and Implementations " href="#mlw-metadata-us-impl" shape="rect">[MLW US IMPL]</a>. 
These usage scenarios involve:</p><ul><li><p>the extraction of ITS meta-data from a source language file into XLIFF</p></li><li><p>the addition of ITS meta-data into an XLIFF file by translation tools</p></li><li><p>the mapping of ITS meta-data in an XLIFF file into ITS meta-data in the resulting target language files.</p></li></ul><p>ITS 2.0 has no normative dependency on XLIFF, however a <a href="http://www.w3.org/International/its/wiki/XLIFF_Mapping" shape="rect">non-normative definition of how to represent ITS 2.0 data categories in XLIFF 1.2 or XLIFF 2.0</a> is being defined within the <a href="http://www.w3.org/International/its/ig/" shape="rect">Internationalization Tag Set Interest Group</a>.</p></div></div><div class="div2"> +<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="implementing-its20" id="implementing-its20" shape="rect"/>2.8 ITS 2.0 Implementations and Conformance</h3><p>What does it mean to implement ITS 2.0? This specification provides several conformance clauses as the normative answer (see <a class="section-ref" href="#conformance" shape="rect">Section 4: Conformance</a>). 
The clauses are targeted at different types of implementers.</p><ul><li><p>Conformance clauses in <a class="section-ref" href="#conformance-product-schema" shape="rect">Section 4.1: Conformance Type 1: ITS Markup Declarations</a> tell markup vocabulary developers how to add ITS 2.0 markup declarations to their schemas.</p></li><li><p>Conformance clauses in <a class="section-ref" href="#conformance-product-processing-expectations" shape="rect">Section 4.2: Conformance Type 2: The Processing Expectations for ITS Markup</a> tell implementers how to process XML content according to ITS 2.0 data categories.</p></li><li><p>Conformance clauses in <a class="section-ref" href="#conformance-product-html-processing-expectations" shape="rect">Section 4.3: Conformance Type 3: Processing Expectations for ITS Markup in HTML</a> tell implementers how to process <a title="HTML5" href="#html5" shape="rect">[HTML5]</a> content.</p></li><li><p>Conformance clauses in <a class="section-ref" href="#conformance-class-html5-its" shape="rect">Section 4.4: Conformance Class for HTML5+ITS documents</a> tell implementers how ITS 2.0 markup is integrated into <a title="HTML5" href="#html5" shape="rect">[HTML5]</a>.</p></li></ul><p>The conformance clauses in <a class="section-ref" href="#conformance-product-processing-expectations" shape="rect">Section 4.2: Conformance Type 2: The Processing Expectations for ITS Markup</a> and <a class="section-ref" href="#conformance-product-html-processing-expectations" shape="rect">Section 4.3: Conformance Type 3: Processing Expectations for ITS Markup in HTML</a> clarify which information needs to be made available for given pieces of markup when processing a dedicated ITS 2.0 data category. To allow for flexibility, an implementation can choose whether it wants to support only ITS 2.0 global or local information, or XML or HTML content. 
These choices are reflected in separate conformance clauses and also in the <a href="@@@@" shape="rect">ITS 2.0 test suite</a>.</p><p>ITS 2.0 processing expectations only define which information needs to be made available. They do not define how that information actually should be used. This is due to the fact that there is a wide variety of usage scenarios for ITS 2.0, and a wide variety of tools for working with ITS 2.0 is possible. Each of these tools may have its own way of using ITS 2.0 data categories (see <a title="Metadata for the Multilingual Web - Usage Scenarios and Implementations " href="#mlw-metadata-us-impl" shape="rect">[MLW US IMPL]</a> for more information).</p><span class="editor-note">[Ed. note: Add link to test suite]</span></div></div><div class="div1"> <h2><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="notation-terminology" id="notation-terminology" shape="rect"/>3 Notation and Terminology</h2><p> <em>This section is normative.</em> </p><div class="div2"> @@ -568,7 +549,7 @@ and localization of XML schemas and documents.] The concept of a data category is independent of its implementation in an XML and HTML environment (e.g. 
using an element or attribute).</p><p>For each data category, ITS distinguishes between the following:</p><ul><li><p>the prose description, see <a class="section-ref" href="#datacategory-description" shape="rect">Section 8: Description of Data Categories</a></p></li><li><p>schema language independent formalization, see the "implementation" subsections in - <a class="section-ref" href="#datacategory-description" shape="rect">Section 8: Description of Data Categories</a></p></li><li><p>schema language specific implementations, see <a class="section-ref" href="#its-schemas" shape="rect">Appendix D: Schemas for ITS</a></p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="d0e1546" id="d0e1546" shape="rect"/>Example 9: A data category and its implementation</div><p>The <a href="#trans-datacat" shape="rect">Translate</a> data category conveys information as + <a class="section-ref" href="#datacategory-description" shape="rect">Section 8: Description of Data Categories</a></p></li><li><p>schema language specific implementations, see <a class="section-ref" href="#its-schemas" shape="rect">Appendix D: Schemas for ITS</a></p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="d0e1616" id="d0e1616" shape="rect"/>Example 9: A data category and its implementation</div><p>The <a href="#trans-datacat" shape="rect">Translate</a> data category conveys information as to whether a piece of content should be translated or not.</p><p>The simplest formalization of this prose description on a schema language independent level is a <code class="its-attr-markup">translate</code> attribute with two possible values: "yes" and "no". An implementation on a schema language specific @@ -833,9 +814,9 @@ actual query language. The query language is set by <code class="its-attr-markup">queryLanguage</code> attribute on <code class="its-elem-markup">rules</code> element. 
If <code class="its-attr-markup">queryLanguage</code> is not specified, XPath 1.0 is used as the default query language.</p></div><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2457" id="d0e2457" shape="rect"/>5.3.2 XPath 1.0</h4><p>XPath 1.0 is identified by <code>xpath</code> value in <code class="its-attr-markup">queryLanguage</code> +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2527" id="d0e2527" shape="rect"/>5.3.2 XPath 1.0</h4><p>XPath 1.0 is identified by <code>xpath</code> value in <code class="its-attr-markup">queryLanguage</code> attribute.</p><div class="div4"> -<h5><a name="d0e2468" id="d0e2468" shape="rect"/>5.3.2.1 Absolute selector</h5><p>The absolute selector <a href="#rfc-keywords" shape="rect">MUST</a> be an XPath expression +<h5><a name="d0e2538" id="d0e2538" shape="rect"/>5.3.2.1 Absolute selector</h5><p>The absolute selector <a href="#rfc-keywords" shape="rect">MUST</a> be an XPath expression which starts with "<code>/</code>". That is, it must be an <a href="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath" shape="rect"> AbsoluteLocationPath</a> or union of <a href="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath" shape="rect"> AbsoluteLocationPath</a>s as described in <a href="#xpath" shape="rect">XPath 1.0</a>. 
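Editorial illustration (not part of the commit): the two statements in the hunk above, XPath 1.0 as the default query language and absolute selectors starting with "/", can be sketched in Python roughly as follows. The helper names query_language and looks_absolute are hypothetical and are not defined by ITS 2.0. The union check is deliberately naive: it assumes that "|" never occurs inside a string literal in the selector, and it does not validate the expression against the XPath grammar.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI


def query_language(rules):
    """Return the query language declared on an its:rules element.

    Falls back to "xpath" (XPath 1.0) when queryLanguage is absent.
    """
    return rules.get("queryLanguage", "xpath")


def looks_absolute(selector):
    """Naive check that every branch of a union starts with "/".

    Assumption: "|" never occurs inside a string literal in the selector.
    """
    branches = [branch.strip() for branch in selector.split("|")]
    return all(branch.startswith("/") for branch in branches)


# A small rules element without a queryLanguage attribute.
rules = ET.fromstring(
    f'<its:rules xmlns:its="{ITS_NS}" version="2.0">'
    '<its:translateRule selector="/doc/head | //title" translate="no"/>'
    "</its:rules>"
)
rule = rules.find(f"{{{ITS_NS}}}translateRule")

print(query_language(rules))                 # default applies, no attribute present
print(looks_absolute(rule.get("selector")))  # both union branches start with "/"
```

A real ITS processor would parse the selector with a full XPath 1.0 implementation rather than rely on a string prefix test; the sketch only mirrors the wording of the two rules.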
@@ -880,14 +861,14 @@ implementations can be used.</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p id="css-selectors-and-attributes">CSS selectors have no ability to point to attributes.</p></div><p>CSS Selectors are identified by <code>css</code> value in <code class="its-attr-markup">queryLanguage</code> attribute.</p><div class="div4"> -<h5><a name="d0e2679" id="d0e2679" shape="rect"/>5.3.3.1 Absolute selector</h5><p>Absolute selector <a href="#rfc-keywords" shape="rect">MUST</a> be interpreted as selector +<h5><a name="d0e2749" id="d0e2749" shape="rect"/>5.3.3.1 Absolute selector</h5><p>Absolute selector <a href="#rfc-keywords" shape="rect">MUST</a> be interpreted as selector as defined in <a title="Selectors Level
 3" href="#css3-selectors" shape="rect">[Selectors Level 3]</a>. Both simple selectors and groups of selectors can be used.</p></div><div class="div4"> -<h5><a name="d0e2689" id="d0e2689" shape="rect"/>5.3.3.2 Relative selector</h5><p>Relative selector <a href="#rfc-keywords" shape="rect">MUST</a> be interpreted as selector +<h5><a name="d0e2759" id="d0e2759" shape="rect"/>5.3.3.2 Relative selector</h5><p>Relative selector <a href="#rfc-keywords" shape="rect">MUST</a> be interpreted as selector as defined in <a title="Selectors Level
 3" href="#css3-selectors" shape="rect">[Selectors Level 3]</a>. Selector is not evaluated against the complete document tree but only against subtrees rooted at nodes selected by selector in the <code class="its-attr-markup">selector</code> attribute.</p></div></div><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2702" id="d0e2702" shape="rect"/>5.3.4 Additional query languages</h4><p>ITS processors <a href="#rfc-keywords" shape="rect">MAY</a> support additional query +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2772" id="d0e2772" shape="rect"/>5.3.4 Additional query languages</h4><p>ITS processors <a href="#rfc-keywords" shape="rect">MAY</a> support additional query languages. 
For each additional query language the processor <a href="#rfc-keywords" shape="rect">MUST</a> define:</p><ul><li><p>identifier of query language used in <code class="its-attr-markup">queryLanguage</code>;</p></li><li><p>rules for evaluating absolute selector to collection of nodes;</p></li><li><p>rules for evaluating relative selector to collection of nodes.</p></li></ul><p>Because future versions of this specification are likely to define additional query languages, the following query language identifiers are reserved: <code>xpath</code>, <code>css</code>, <code>xpath2</code>, <code>xpath3</code>, <code>xquery</code>, --- /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20-for-editing-sec1-sec2.odd 2013/06/11 08:26:03 1.21 +++ /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20-for-editing-sec1-sec2.odd 2013/06/11 21:55:27 1.22 @@ -455,7 +455,7 @@ <p>The differences between ITS 1.0 and ITS 2.0 can be summarized as follows.</p> <p><emph>Coverage of <ptr type="bibref" target="#html5"/>: </emph>ITS 1.0 can be applied to XML content. ITS 2.0 extends the coverage to <ptr target="#html5" type="bibref"/>. Explanatory details about ITS 2.0 and <ptr target="#html5" type="bibref"/> are given in <ptr target="#specific-HTML-support" type="specref"/>.</p> - <p><emph>Addition of data categories</emph>: ITS 2.0 provides additional data categories and modifies existing ones. A summary of all ITS 2.0 data categories are given in <ptr target="#datacategories-summary" type="specref"/>.</p> + <p><emph>Addition of data categories</emph>: ITS 2.0 provides additional data categories and modifies existing ones. A summary of all ITS 2.0 data categories are given in <ptr target="#basic-concepts-datacategories" type="specref"/>.</p> <p><emph>Modification of data categories</emph>:</p> <list><item> <p xml:id="ruby-in-its2">ITS 1.0 provided the <ref target="http://www.w3.org/TR/2007/REC-its-20070403/#ruby-annotation">Ruby data category</ref>. 
ITS 2.0 does not provide ruby because, at the time of writing, the <ref target="http://www.w3.org/TR/html51/text-level-semantics.html#the-ruby-element">ruby model in HTML5</ref> was still under development. Once these discussions are settled, the Ruby data category may be re-introduced in a subsequent version of ITS.</p></item> @@ -496,6 +496,62 @@ <item>for several pieces of content in one document or even a set of documents. This is the so-called <ref target="#basic-concepts-selection-global">global approach</ref>.</item> <item>For a complete markup vocabulary. This is done by adding <ref target="#its-schemas">ITS markup declarations</ref> to the schema for the vocabulary.</item> </list> + + + <p>ITS 2.0 provides the following data categories, using most of the existing ITS 1.0 data categories and adding new ones. Modifications of existing ITS 1.0 data categories are summarized in <ptr target="#high-level-differences-between-1.0-and-2.0" type="specref"/>.</p> + + + <list type="unordered"> + <item><ref target="#trans-datacat">Translate</ref>: express information about whether a selected piece of content should be translated or not.</item> + <item><ref target="#locNote-datacat">Localization Note</ref>: communicate notes to localizers about a particular item of content.</item> + <item><ref target="#terminology">Terminology</ref>: mark terms and optionally associate them with information, such as definitions or references to a term data base.</item> + <item><ref target="#directionality">Directionality</ref>: specify the base writing direction of blocks, embeddings and overrides for the Unicode bidirectional algorithm.</item> + <item><ref target="#language-information">Language Information</ref>: express the language of a given piece of content.</item> + <item><ref target="#elements-within-text">Elements Within Text</ref>: express how content of an element is related to the text flow (constitute its own segment like paragraphs, be part of a 
segment like emphasis marker etc.).</item> + <item><ref target="#domain">Domain</ref>: identify the topic or subject of the annotated content for translation-related applications.</item> + + + + <item><ref target="#textanalysis">Text Analysis</ref>: annotate content with lexical or conceptual information (e.g. for the purpose of contextual disambiguation).</item> + + + + <item><ref target="#LocaleFilter">Locale Filter</ref>: specify that a piece of content is only applicable to certain locales. </item> + + + + <item><ref target="#provenance">Provenance</ref>: communicate the identity of agents that have been involved processing content.</item> + + + + <item><ref target="#externalresource">External Resource</ref>: indicate reference points in a resource outside the document that need to be considered during localization or translation. Examples of such resources are external images and audio or video files.</item> + + + + <item><ref target="#target-pointer">Target Pointer</ref>: associate the markup node of a given source content (i.e. the content to be translated) and the markup node of its corresponding target content (i.e. the source content translated into a given target language). This is relevant for formats that hold the same content in different languages inside a single document.</item> + + + + <item><ref target="#idvalue">Id Value</ref>: identify a value that can be used as unique identifier for a given part of the content. 
+ </item> + + + <item><ref target="#preservespace">Preserve Space</ref>: indicate how whitespace should be handled in content.</item> + + + <item><ref target="#lqissue">Localization Quality Issue</ref>: describe the nature and severity of an error detected during a language-oriented quality assurance (QA) process.</item> + + + <item><ref target="#lqrating">Localization Quality Rating</ref>: express an overall measurement of the localization quality of a document or an item in a document.</item> + + + <item><ref target="#mtconfidence">MT Confidence</ref>: indicate the confidence that MT systems provide about their translation. + </item> + + + <item> <ref target="#allowedchars">Allowed Characters</ref>: specify the characters that are permitted in a given piece of content.</item> + + <item><ref target="#storagesize">Storage Size</ref>: specify the maximum storage size of a given content.</item></list> </div> <div xml:id="basic-concepts-selection"> @@ -547,7 +603,7 @@ </exemplum> <p>For the global approach (and <ptr target="#EX-basic-concepts-2" type="exref"/>) to work, a schema developer may need to add a <gi>rules</gi> element and associated markup to the schema. In some cases, global rules may be sufficient and other ITS markup (such as an <att>translate</att> attribute on the elements and attributes) may not be needed in the schema. 
However, it is likely that authors may need the local approach from time to time to override the general rule.</p> - <p>For specification of the <ref target="#trans-datacat">Translate</ref> data category information, the contents of the <gi>translateRule</gi> element would normally be designed by an information architect familiar with the document format and familiar with, or working with someone familiar with, the needs of the localization/translation group.</p> + <p>For specification of the <ref target="#trans-datacat">Translate</ref> data category information, the contents of the <gi>translateRule</gi> element would normally be designed by an information architect familiar with the document format and familiar with, or working with someone familiar with, the needs of localization/translation.</p> <p>The global, rule-based approach has the following benefits:</p> <list> @@ -583,35 +639,33 @@ target="#selection-precedence">overriding/precedence</ref>, and <ref target="#datacategories-defaults-etc">inheritance</ref>, have to be established.</p> <p>The document in <ptr target="#EX-basic-concepts-3" type="exref"/> shows how inheritance - and overriding work for the <ref target="#trans-datacat">Translate</ref> data category. - By default elements are translatable. Here, the <gi>translateRule</gi> element declared - in the header overrides the default for the <code>head</code> element inside - <code>text</code> and for all its children. Because the <code>title</code> element is - actually translatable, the global rule needs to be overridden by a local - <code>its:translate="yes"</code>. Note that the global rule is processed first, - regardless of its position inside the document. 
In the main body of the document, the - default applies, and here it is <code>its:translate="no"</code> that is used to set - “faux pas” as non-translatable.</p> + and overriding work for the <ref target="#trans-datacat">Translate</ref> data category:</p> + <list type="unordered"> + <item>The ITS default is that all elements are translatable.</item> + <item>The <gi>translateRule</gi> element declared in the header overrides the default for the <code>head</code> element inside text and for all its children.</item> + <item>Because the <code>title</code> element is actually translatable, the global rule needs to be overridden by a local <code>its:translate="yes"</code>.</item> + <item>In the body of the document the default applies, and <code>its:translate="no"</code> is used to set <val>faux pas</val> as non-translatable.</item> + </list> <exemplum xml:id="EX-basic-concepts-3"> <head>Overriding and Inheritance</head> <egXML xmlns="http://www.tei-c.org/ns/Examples" target="examples/xml/EX-basic-concepts-3.xml"/> </exemplum> - <p>For XML content, <ref target="#datacategories-overview">data category specific defaults</ref> are provided. These are independent of the actual XML markup vocabulary. For <ptr target="#html5" type="bibref"/>, several HTML5 elements and attributes map exactly to ITS 2.0 data categories. Hence, that HTML markup is normatively interpreted as ITS 2.0 data category information: See <ptr target="#html5-existing-markup-versus-its" type="specref"/> for more information.</p> + <p>For XML content, <ref target="#datacategories-overview">data category specific defaults</ref> are provided. These are independent of the actual XML markup vocabulary. Example for the <ref target="#trans-datacat">Translate</ref> data category: <code>translate="yes"</code> for elements, and <code>translate="no"</code> for attributes.</p> + <p>For <ptr target="#html5" type="bibref"/>, several HTML5 elements and attributes map exactly to ITS 2.0 data categories. 
Hence, that HTML markup is normatively interpreted as ITS 2.0 data category information (see <ptr target="#html5-existing-markup-versus-its" type="specref"/> for more information).</p> </div> <div xml:id="basic-concepts-addingpointing"> <head>Adding Information or Pointing to Existing Information</head> - <p>For some data categories, special attributes add or point to information about the - selected nodes. For example, the <ref target="#locNote-datacat">Localization Note</ref> + <p>Data categories can add information or point to information for the selected nodes. For example, the <ref target="#locNote-datacat">Localization Note</ref> data category can add information to selected nodes (using a <gi>locNote</gi> element), or point to existing information elsewhere in the document (using a <att>locNotePointer</att> attribute).</p> <p>The <ref target="#datacategories-overview">data category overview table</ref>, in <ptr - target="#datacategories-defaults-etc" type="specref"/>, provides an overview of what - data categories allow to point to existing information or to add information.</p> - <p>The functionalities of adding information and pointing to existing information are - <emph>mutually exclusive</emph>. 
That is to say, attributes for pointing and adding + target="#datacategories-defaults-etc" type="specref"/>, provides an overview of which + data categories allow to add information, and which allow to point to existing information.</p> + <p>Adding information and pointing to existing information are + <emph>mutually exclusive</emph>: attributes for adding information and attributes for pointing to the same information must not appear at the same rule element.</p> </div> @@ -619,18 +673,20 @@ <p>For applying ITS 2.0 data categories to HTML, five aspects must be considered:</p> <list type="ordered"> - <item>referencing global rules</item> - <item>specifities of inserting local ITS 2.0 data categories</item> - <item>relationship between HTML markup and data categories,</item> + <item>global approach</item> + <item>local approach</item> + <item>HTML markup with ITS 2.0 counterparts</item> <item>standoff markup in HTML5</item> - <item>HTML version.</item> + <item>HTML version</item> </list> <p>In the following sections these aspects are briefly discussed.</p> - <div xml:id="html5-reference-global-rules"><head>Referencing global rules</head> - <p>To account for the so-called “<ref target="#basic-concepts-selection-global">global - approach</ref>” in HTML, this specification (see <ptr target="#html5-global-rules" type="specref"/>) defines a link type for referring to external files - with global rules and an approach to have inline global rules in the HTML <code>script</code> element. 
- It is preferred to use external global rules linked via the <code>link</code> element than to have inline global rules in the HTML document.</p> + <div xml:id="html5-global-approach"><head>Global approach in HTML5</head> + <p>To account for the so-called <ref target="#basic-concepts-selection-global">global + approach</ref> in HTML, this specification (see <ptr target="#html5-global-rules" type="specref"/>) defines + </p> + <list><item>a link type for referring to external files with global rules from a <code>link</code> element</item> + <item>an approach to have inline global rules in the HTML <code>script</code> element.</item></list> + <p>It is preferred to use external global rules linked via the <code>link</code> element than to have inline global rules in the HTML document.</p> <exemplum xml:id="EX-translate-html5-global-1"> <head>Using ITS global rules in HTML</head> <p>The <code>link</code> element points to the rules file @@ -647,69 +703,61 @@ target="examples/html5/EX-translateRule-html5-1.xml"/> </exemplum> </div> - <div xml:id="html5-its-local-markup"><head>Specifities of inserting local ITS 2.0 data categories</head> - <p>In HTML, an ITS 2.0 local data category is realized with the specific prefix <code>its-*</code>. + <div xml:id="html5-its-local-markup"><head>Local approach</head> + <p>In HTML, an ITS 2.0 local data category is realized with the prefix <code>its-*</code>. The general mapping of the XML based ITS 2.0 attributes to their HTML <code>its-*</code> counterparts is defined in <ptr target="#html5-local-attributes" type="specref"/>. An informative table in <ptr target="#list-of-elements-and-attributes" type="specref"/> provides an overview of the mapping for all data categories.</p> </div> - <div xml:id="html5-existing-markup-versus-its"><head>Relation between HTML markup and ITS 2.0 data categories</head> - <p>There are four ITS 2.0 data categories, which have direct counterparts - in HTML markup. 
For theses data categories, ITS 2.0 defines the following specific - behaviour:</p> + <div xml:id="html5-existing-markup-versus-its"><head>HTML markup with ITS 2.0 counterparts</head> + <p>There are four ITS 2.0 data categories, which have counterparts in HTML markup. Put differently: native HTML markup reveals information for some ITS 2.0 data categories. For these data categories, ITS 2.0 defines the following:</p> <list type="unordered"> <item><p>The <ref target="#language-information">Language Information</ref> data category has the HTML <code>lang</code> - attribute counterpart; in XHTML this is the <code>xml:lang</code> attribute. These attributes act as + attribute as counterpart. In XHTML the counterpart is the <code>xml:lang</code> attribute. These attributes act as local markup for the <ref target="#language-information">Language Information</ref> data category in HTML and take <ref target="#selection-precedence">precedence</ref> over language information conveyed via a global <gi>langRule</gi>.</p></item> - <item><p>The <ref target="#idvalue">Id Value</ref> data category has the HTML or XHTML <code>id</code> attribute. - This attribute acts as local markup for the <ref target="#idvalue">Id Value</ref> data category in HTML and take <ref target="#selection-precedence">precedence</ref> over - id information conveyed via a global <gi>idValueRule</gi>.</p></item> + <item><p>The <ref target="#idvalue">Id Value</ref> data category has the HTML or XHTML <code>id</code> attribute as counterpart. 
+ This attribute acts as local markup for the <ref target="#idvalue">Id Value</ref> data category in HTML and takes <ref target="#selection-precedence">precedence</ref> over + identifier information conveyed via a global <gi>idValueRule</gi>.</p></item> <item><p>The <ref target="#elements-within-text">Elements within Text</ref> data category has a set of HTML - elements defined as <ref target="http://www.w3.org/TR/html51/dom.html#phrasing-content-1">phrasing content</ref>. In the absence of an + elements (the so-called <ref target="http://www.w3.org/TR/html51/dom.html#phrasing-content-1">phrasing content</ref>) as counterpart. + In the absence of an <ref target="#elements-within-text">Elements within Text</ref> local attribute or global rules selecting the - element in question, these elements are always interpreted as - <code>withinText="yes"</code> by default, except for the elements <gi>iframe</gi>, <gi>noscript</gi>, <gi>script</gi> - and <gi>textarea</gi> which are interpreted as <code>withinText="nested"</code>.</p></item> + element in question, most of the phrasing content elements are interpreted as + <code>withinText="yes"</code> by default. The phrasing content elements <gi>iframe</gi>, <gi>noscript</gi>, <gi>script</gi> + and <gi>textarea</gi> are interpreted as <code>withinText="nested"</code>.</p></item> <item xml:id="translate-in-html5"><p>The <ref target="#trans-datacat">Translate</ref> data category has a direct counterpart in - <ptr target="#html5" type="bibref"/>, namely the HTML5 - <code>translate</code> attribute. ITS 2.0 does not define its own behaviour for HTML5 <code>translate</code>, but just refers to <ref target="http://www.w3.org/TR/html51/dom.html#the-translate-attribute">the HTML5 definition</ref>. The <ptr target="#html5" type="bibref"/> definition also applies to nodes selected via global rules. 
That is, a <gi>translateRule</gi> like <code><its:translateRule selector=""//h:img" translate="yes"/></code> will set the <code>img</code> element and its translatable attributes like <code>alt</code> to <val>yes</val>.</p></item> + <ptr target="#html5" type="bibref"/>, namely the <ptr target="#html5" type="bibref"/> + <code>translate</code> attribute. ITS 2.0 does not define its own behavior for <ptr target="#html5" type="bibref"/> <code>translate</code>, but just refers to <ref target="http://www.w3.org/TR/html51/dom.html#the-translate-attribute">the HTML5 definition</ref>. That definition also applies to nodes selected via global rules. That is, a <gi>translateRule</gi> like <code><its:translateRule selector=""//h:img" translate="yes"/></code> will set the <code>img</code> element and its translatable attributes like <code>alt</code> to <val>yes</val>.</p></item> </list> <exemplum xml:id="EX-its-and-existing-HTML5-markup"> <head>The <ref target="#language-information">Language Information</ref>, <ref target="#idvalue">Id Value</ref>, <ref target="#elements-within-text">Elements within Text</ref> and <ref target="#trans-datacat">Translate</ref> - ITS 2.0 data categories used with - HTML native markup.</head> - <p>The <code>html</code> element is interpreted to convey the + ITS 2.0 data categories expressed by native HTML markup.</head> + <p>The <code>lang</code> attribute of the <code>html</code> element conveys the <ref target="#language-information">Language Information</ref> value <val>en</val>. - The <code>p</code> element is interpreted to - convey the <ref target="#idvalue">Id Value</ref> of <val>p1</val>. The elements <code>em</code> and <code>img</code> are interpreted to be <code>withinText="yes"</code>. The <code>p</code> element and its children is set to be non-translatable via an <ptr target="#html5" type="bibref"/> <code>translate</code> attribute. 
Here the <code>alt</code> attribute, normally translatable by default, will also be non-translatable.</p> + The <code>id</code> attribute of the <code>p</code> element conveys the <ref target="#idvalue">Id Value</ref> <val>p1</val>. The elements <code>em</code> and <code>img</code> are interpreted to be <code>withinText="yes"</code>. The <code>p</code> element and its children are set to be non-translatable via an <ptr target="#html5" type="bibref"/> <code>translate</code> attribute. Via inheritance, the <code>alt</code> attribute, normally translatable by default, also is non-translatable.</p> <egXML xmlns="http://www.tei-c.org/ns/Examples" target="examples/html5/EX-its-and-existing-HTML5-markup.html"/> </exemplum> - <p>There are also some HTML markup elements that have similar, but not always identical, roles and behaviour than certain ITS 2.0 data categories. - For example, the HTML <code>dfn</code> element - could be used to identify a term in the sense of the <ref target="#terminology">Terminology</ref> data - category. However, this is not always the case and it depends on the - intentions of the content author. To accomodate this situation, users - of ITS 2.0 are encouraged to specifiy the association of existing HTML - markup with a dedicated global rules file. For an example rules file see the - <ref target="http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/#relating-its-plus-xhtml">XML I18N Best Practices</ref> document.</p> + <p>There are also some HTML markup elements that have or can have similar, but not necessarily identical, roles and behavior than certain ITS 2.0 data categories. For example, the HTML <code>dfn</code> element could be used to identify a term in the sense of the <ref target="#terminology">Terminology</ref> data category. However, this is not always the case and it depends on the intentions of the HTML content author. 
To accommodate this situation, users of ITS 2.0 are encouraged to specify the semantics of existing HTML markup in an ITS 2.0 context with a dedicated global rules file. Example: use a rule to define that the HTML <code>dfn</code> has the semantics of ITS <code>term="yes"</code>. For additional examples see the <ref target="http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/#relating-its-plus-xhtml">XML I18N Best Practices</ref> document.</p> + </div> <div xml:id="html5-standoff-markup-explanation"><head>Standoff Markup in HTML5</head> - <p>The <ref target="#provenance">Provenance</ref> and the <ref target="#lqissue">Localization Quality Issue</ref> data categories allow for using standoff markup. In HTML such standoff markup is put into a <code>script</code> element. The constraints for <ref target="#provenance-records-in-html5-constraint">Provenance standoff</ref> markup in HTML and <ref target="#loc-quality-issues-in-html5-constraint">Localization quality issue</ref> markup in HTML need to be taken into account. Examples of standoff markup in HTML for the two data categories are <ptr target="#EX-provenance-html5-local-2" type="exref"/> and <ptr target="#EX-locQualityIssue-html5-local-2" type="exref"/>.</p></div> + <p>The <ref target="#provenance">Provenance</ref> and the <ref target="#lqissue">Localization Quality Issue</ref> data categories allow for using so-called standoff markup, see the XML <ptr target="#EX-provenance-global-1" type="exref"/>. In HTML such standoff markup is placed into a <code>script</code> element. If this is done, the constraints for <ref target="#provenance-records-in-html5-constraint">Provenance standoff</ref> markup in HTML and <ref target="#loc-quality-issues-in-html5-constraint">Localization quality issue</ref> markup in HTML need to be taken into account.
Examples of standoff markup in HTML for the two data categories are <ptr target="#EX-provenance-html5-local-2" type="exref"/> and <ptr target="#EX-locQualityIssue-html5-local-2" type="exref"/>.</p></div> <div xml:id="usage-in-legacy-html"> <head>Version of HTML</head> - <p>ITS 2.0 does not define how to use ITS in HTML versions prior version 5. Users are - encouraged to migrate their content to HTML5 or XHTML. While it is possible to use + <p>ITS 2.0 does not define how to use ITS in HTML versions prior to version 5. Users are + thus encouraged to migrate their content to <ptr type="bibref" target="#html5"/> or XHTML. While it is possible to use <code>its-*</code> attributes introduced for <ptr target="#html5" type="bibref"/> in older versions of HTML (such as 3.2 or 4.01) and pages using these attributes will work without any problems, - <code>its-*</code> attributes will be marked as invalid in validators.</p> + <code>its-*</code> attributes will be marked as invalid by validators.</p> </div> </div> <div xml:id="traceability"><head>Traceability</head> - <p>The <ref target="#its-tool-annotation">ITS Tools Annotation</ref> mechanism allows to associate processor information with the use of individual data categories in a document, independently from data category annotations themselves. The mechanism associates identifiers for tools and data categories via the <att>annotatorsRef</att> attribute (or <att>annotators-ref</att> in <ptr target="#html5" type="bibref"/>) and is mandatory for the <ref target="#mtconfidence">MT Confidence</ref> data category. For the <ref target="#terminology">Terminology</ref> and <ref target="#textanalysis">Text Analysis</ref> data categories it is mandatory if they provide confidence information, that is always tool related. Nevertheless, <ref target="#its-tool-annotation">ITS Tools Annotation</ref> can be used for all data categories. 
<ptr target="#EX-its-tool-annotation-2" type="exref"/> demonstrates the usage including several data categories.</p></div> + <p>The <ref target="#its-tool-annotation">ITS Tools Annotation</ref> mechanism allows associating processor information with individual data categories in a document, independently from data category annotations themselves (e.g. the Entity Type related to Text Analysis). The mechanism associates identifiers for tools with data categories via the <att>annotatorsRef</att> attribute (or <ref>annotators-ref</ref> in <ptr type="bibref" target="#html5"/>) and is mandatory for the <ref target="#mtconfidence">MT Confidence</ref> data category. For the <ref target="#terminology">Terminology</ref> and <ref target="#textanalysis">Text Analysis</ref> data categories the ITS Tools Annotation is mandatory if the data categories provide confidence information. Nevertheless, <ref target="#its-tool-annotation">ITS Tools Annotation</ref> can be used for all data categories. <ptr target="#EX-its-tool-annotation-2" type="exref"/> demonstrates the usage in the context of several data categories. + </p></div> <div xml:id="mapping-conversion"> <head>Mapping and conversion</head> @@ -717,88 +765,36 @@ <div xml:id="mapping-NIF"><head>ITS and RDF/NIF</head> <p>ITS 2.0 defines an algorithm to convert XML or HTML documents (or their DOM representations) that contain ITS metadata to the RDF-based format based on <ptr - target="#nif-reference" type="bibref"/>. 
NIF is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations.</p> - <p>The conversion <ref target="#conversion-to-nif">ITS 2.0 to NIF</ref> results in RDF triples that represent the textual content of the original document as RDF typed information and the ITS annotation as properties of those nodes defined in an <ref target="http://www.w3.org/2005/11/its/rdf#">ITS RDF vocabulary</ref>.</p> + target="#nif-reference" type="bibref"/>. NIF is an RDF/OWL-based format that aims at interoperability between Natural Language Processing (NLP) tools, language resources and annotations.</p> + <p>The conversion <ref target="#conversion-to-nif">ITS 2.0 to NIF</ref> results in RDF triples. These triples represent the textual content of the original document as RDF typed information. The ITS annotation is represented as properties of content related triples and relies on an <ref target="http://www.w3.org/2005/11/its/rdf#">ITS RDF vocabulary</ref>.</p> - <p>The backconversion <ref target="#nif-backconversion">NIF to ITS 2.0</ref> is defined informatively; it exemplifies a roundtripping involving automatic enrichment of HTML documents with linked information.</p></div> + <p>The back conversion from <ref target="#nif-backconversion">NIF to ITS 2.0</ref> is defined informatively. One motivation for the back conversion is a round tripping work flow like: 1) conversion to NIF 2) in NIF representation detection of named entities using NLP tools 3) back conversion to HTML and generation of <ref target="#textanalysis">Text Analysis</ref> markup. 
The outcome is HTML documents with linked information, see <ptr target="#EX-text-analysis-html5-local-1" type="exref"/>.</p></div> <div xml:id="mapping-XLIFF"><head>ITS and XLIFF</head> - <p>The XML Localization Interchange File Format <ptr target="#xliff1.2" type="bibref"/> is an OASIS standard that enables translatable source text and its translation to be passed between different tools within localisation and translation workflows. <ptr target="#xliff2.0" type="bibref"/> is the successor of <ptr target="#xliff1.2" type="bibref"/> and under development. <ptr target="#xliff1.2" type="bibref"/> has been widely implemented in translation management systems, computer supported translation tools and in utilities for extracting translatable content from source documents. The mapping between ITS and XLIFF therefore unpins several important ITS 2.0 usage scenarios <ptr target="#mlw-metadata-us-impl" type="bibref"/>. These usage scenarios involve: 1) the extraction of ITS meta-data from a source language file into XLIFF; 2) the addition of ITS meta-data into an XLIFF file by translation tools; and 3) the mapping of ITS meta-data in an XLIFF file into ITS meta-data in the resulting target lanuage files. ITS 2.0 has no normative dependency on XLIFF, however a <ref target="http://www.w3.org/International/its/wiki/XLIFF_Mapping">non-normative definition of how to represent ITS 2.0 data categories in XLIFF 1.2 or XLIFF 2.0</ref> is being defined within the <ref target="http://www.w3.org/International/its/ig/">Internationalization Tag Set Interest Group</ref>.</p> + <p>The XML Localization Interchange File Format <ptr target="#xliff1.2" type="bibref"/> is an OASIS standard that enables translatable source text and its translation to be passed between different tools within localization and translation workflows. <ptr target="#xliff2.0" type="bibref"/> is the successor of <ptr target="#xliff1.2" type="bibref"/> and under development.
XLIFF has been widely implemented in various translation management systems, computer supported translation tools and in utilities for extracting translatable content from source documents.</p> + + + <p>The mapping between ITS and XLIFF therefore underpins several important ITS 2.0 usage scenarios <ptr target="#mlw-metadata-us-impl" type="bibref"/>. These usage scenarios involve:</p> + <list type="unordered"> + <item>the extraction of ITS meta-data from a source language file into XLIFF</item> + <item>the addition of ITS meta-data into an XLIFF file by translation tools</item> + <item>the mapping of ITS meta-data in an XLIFF file into ITS meta-data in the resulting target language files.</item> </list> + <p>ITS 2.0 has no normative dependency on XLIFF, however a <ref target="http://www.w3.org/International/its/wiki/XLIFF_Mapping">non-normative definition of how to represent ITS 2.0 data categories in XLIFF 1.2 or XLIFF 2.0</ref> is being defined within the <ref target="http://www.w3.org/International/its/ig/">Internationalization Tag Set Interest Group</ref>.</p> </div> </div> - - - <div xml:id="datacategories-summary"><head>Summary: ITS 2.0 data categories</head> - - <p>ITS 2.0 provides the following data categories, using most of the existing ITS 1.0 data categories and adding new ones.
Modifications of existing ITS 1.0 data categories are summarized in <ptr target="#high-level-differences-between-1.0-and-2.0" type="specref"/>.</p> - - - <list type="unordered"> - <item><ref target="#trans-datacat">Translate</ref>: express information about whether a selected piece of content should be translated or not.</item> - <item><ref target="#locNote-datacat">Localization Note</ref>: communicate notes to localizers about a particular item of content.</item> - <item><ref target="#terminology">Terminology</ref>: mark terms and optionally associate them with information, such as definitions or references to a term data base.</item> - <item><ref target="#directionality">Directionality</ref>: specify the base writing direction of blocks, embeddings and overrides for the Unicode bidirectional algorithm.</item> - <item><ref target="#language-information">Language Information</ref>: express the language of a given piece of content.</item> - <item><ref target="#elements-within-text">Elements Witin Text:</ref> express how content of an element is related to the text flow (constitute its own segment like paragraphs, be part of a segment like emphasis marker etc).</item> - <item><ref target="#domain">Domain</ref>: identify the topic or subject of the annotated content for translation related applications.</item> - - - - <item><ref target="#textanalysis">Text Analysis</ref>: annotate content with lexical or conceptual information for the purpose of contextual disambiguation.</item> - - - - <item><ref target="#LocaleFilter">Locale Filter</ref>: specify that a piece of content is only applicable to certain locales. 
</item> - - - - <item><ref target="#provenance">Provenance</ref>: communicate the identity of agents that have been involved in the translation of the content or the revision of the translated content.</item> - - - - <item><ref target="#externalresource">External Resource</ref>: indicate that a reference points to potentially translatable data in a resource outside the document. Examples of such resources are external images and audio or video files.</item> - - - - <item><ref target="#target-pointer">Target Pointer</ref>: associate a given piece of source content (i.e. the content to be translated) and its corresponding target content (i.e. the source content translated into a given target language).</item> - - - - <item><ref target="#idvalue">Id Value</ref>: identify a value that can be used as unique identifier for a given part of the content. - </item> - - - <item><ref target="#preservespace">Preserve Space</ref>: indicate how whitespace should be handled in content.</item> - - - <item><ref target="#lqissue">Localization Quality Issue</ref>: describe the nature and severity of an error detected during a language-oriented quality assurance (QA) process.</item> - - - <item><ref target="#lqrating">Localization Quality Rating</ref>: express an overall measurement of the localization quality of a document or an item in a document.</item> - - - <item><ref target="#mtconfidence">MT Confidence</ref>: indicate the confidence that MT systems provide about their translation. - </item> - - - <item> <ref target="#allowedchars">Allowed Characters</ref>: specify the characters that are permitted in a given piece of content.</item> - - <item><ref target="#storagesize">Storage Size</ref>: specify the maximum storage size of a given content.</item></list> - </div> - <div xml:id="implementing-its20"> - <head>Implementing ITS 2.0</head> - <p>What does it mean to implement ITS 2.0? 
This specification provides several conformance clauses as the normative answer, see <ptr type="specref" target="#conformance"/>, targeted at different types of implementers.</p> + <head>ITS 2.0 Implementations and Conformance</head> + <p>What does it mean to implement ITS 2.0? This specification provides several conformance clauses as the normative answer (see <ptr type="specref" target="#conformance"/>). The clauses are targeted at different types of implementers.</p> <list type="unordered"> <item>Conformance clauses in <ptr type="specref" target="#conformance-product-schema"/> tell markup vocabulary developers how to add ITS 2.0 markup declarations to their schemas.</item> - <item>Conformance clauses in <ptr target="#conformance-product-processing-expectations" type="specref"/> tell implementers how to process XML content applying ITS 2.0 data categories.</item> + <item>Conformance clauses in <ptr target="#conformance-product-processing-expectations" type="specref"/> tell implementers how to process XML content according to ITS 2.0 data categories.</item> <item>Conformance clauses in <ptr type="specref" target="#conformance-product-html-processing-expectations"/> tell implementers how to process <ptr target="#html5" type="bibref"/> content.</item> <item>Conformance clauses in <ptr target="#conformance-class-html5-its" type="specref"/> tell implementers how ITS 2.0 markup is integrated into <ptr type="bibref" target="#html5"/>.</item> </list> - <p>The conformance clauses in <ptr target="#conformance-product-processing-expectations" type="specref"/> and <ptr type="specref" target="#conformance-product-html-processing-expectations"/> make clear: what information needs to be made available for given pieces of markup then processing a dedicated ITS 2.0 data category? To allow for flexibility, an implementation can choose whether it wants to process only ITS 2.0 global or local information, or XML or HTML content. 
These choices are reflected in seperate conformances clauses and also in the <ref target="@@@@">ITS 2.0 test suite</ref>.</p> - <p>ITS 2.0 processing expectations only cover above aspect, that is: what information needs to be made available. They do not define how that information actually should be applied. This is due to the fact that there is a huge variety of usage scenarios of ITS 2.0, and a huge variety of tools for working with ITS 2.0. Each of these tools has their own way of using ITS 2.0 data categories. See <ptr type="bibref" target="#mlw-metadata-us-impl"/> for more information.</p> + <p>The conformance clauses in <ptr target="#conformance-product-processing-expectations" type="specref"/> and <ptr type="specref" target="#conformance-product-html-processing-expectations"/> clarify how information needs to be made available for given pieces of markup when processing a dedicated ITS 2.0 data category. To allow for flexibility, an implementation can choose whether it wants to support only ITS 2.0 global or local information, or XML or HTML content. These choices are reflected in separate conformance clauses and also in the <ref target="@@@@">ITS 2.0 test suite</ref>.</p> + <p>ITS 2.0 processing expectations only define which information needs to be made available. They do not define how that information actually should be used. This is due to the fact that there is a wide variety of usage scenarios for ITS 2.0, and a wide variety of tools for working with ITS 2.0 is possible. Each of these tools may have its own way of using ITS 2.0 data categories (see <ptr type="bibref" target="#mlw-metadata-us-impl"/> for more information).</p> <note type="ed">Add link to test suite</note> </div> </div>
Received on Tuesday, 11 June 2013 21:55:32 UTC