- From: CVS User fsasaki <cvsmail@w3.org>
- Date: Thu, 21 Mar 2013 13:53:59 +0000
- To: public-multilingualweb-lt-commits@w3.org
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/TR-version
In directory gil:/tmp/cvs-serv17977/TR-version
Modified Files: Overview.html
Log Message: Proposal edits for issue-70, see http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Mar/0163.html
--- /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/TR-version/Overview.html 2012/12/03 19:51:16 1.72
+++ /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/TR-version/Overview.html 2013/03/21 13:53:59 1.73
@@ -11,18 +11,21 @@ </dd><dt>Latest version:</dt><dd> <a href="http://www.w3.org/TR/its20/">http://www.w3.org/TR/its20/</a> </dd><dt>Previous version:</dt><dd><a href="http://www.w3.org/TR/2012/WD-its20-20121023/"> - http://www.w3.org/TR/2012/WD-its20-20121023/</a></dd><dt>Editors:</dt><dd>Shaun McCane, Invited Expert</dd><dd>Dave Lewis, TCD</dd><dd>Arle Lommel, DFKI</dd><dd>Jirka Kosek, UEP</dd><dd>Felix Sasaki, DFKI / W3C Fellow</dd><dd>Yves Savourel, ENLASO</dd></dl><p>This document is also available in these non-normative formats: <a href="its20.odd">ODD/XML document</a>, <a href="itstagset20.zip">self-contained zipped archive</a>, <a href="diffs/diff-wd20120626-its10-20070403.html">XHTML Diff markup between publication - 2012-06-26 and ITS 1.0 Recommendation 2007-04-03</a>, <a href="diffs/diff-wd20120731-wd20120626.html">XHTML Diff markup publication 2012-07-31 and - publication 2012-06-26</a>, <a href="diffs/diff-wd20120829-wd20120731.html">XHTML Diff markup publication 2012-08-29 and - publication 2012-07-31</a>, <a href="diffs/diff-wd20121023-wd20120829.html">XHTML Diff markup publication 2012-10-23 and - publication 2012-08-29</a>, and <a href="diffs/diff-wd20121206-wd20121023.html">XHTML Diff markup publication 2012-12-06 and - publication 2012-10-23</a>.</p><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2012 <a href="http://www.w3.org/"><acronym title="World Wide Web 
Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.eu/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p></div><hr/><div> -<h2><a name="abstract" id="abstract"></a>Abstract</h2><p>This document defines data categories and their implementation as a set of elements and - attributes called the <em>Internationalization Tag Set (ITS)</em> 2.0. ITS 2.0 is the - successor of <a href="http://www.w3.org/TR/2007/REC-its-20070403/">ITS 1.0</a>; it is - designed to foster the creation of multilingual Web content, focusing on HTML, XML based - formats in general, and to leverage localization workflows based on the XML Localization - Interchange File Format (XLIFF).</p></div><div> + http://www.w3.org/TR/2012/WD-its20-20121023/</a></dd><dt>Editors:</dt><dd>Shaun McCane, Invited Expert</dd><dd>Dave Lewis, TCD</dd><dd>Arle Lommel, DFKI</dd><dd>Jirka Kosek, UEP</dd><dd>Felix Sasaki, DFKI / W3C Fellow</dd><dd>Yves Savourel, ENLASO</dd></dl><p>This document is also available in these non-normative formats: <a href="its20.odd">ODD/XML document</a>, <a href="itstagset20.zip">self-contained zipped archive</a>, and <a href="diffs/diff-wd20121206-wd20121023.html">XHTML Diff markup to previous publication 2012-10-23</a>.</p><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2012 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym 
title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.eu/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.eio.ac.jp/">Keio</a>, <a href="http://ev.buaa.edu.cn/">Beihang</a>). All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p></div><hr/><div> +<h2><a name="abstract" id="abstract"></a>Abstract</h2><p> + The technology described in this document - the + <em>Internationalization Tag Set (ITS) 2.0</em> + - enhances the foundation to integrate automated processing of human + language into core Web technologies. ITS 2.0 bears many commonalities + with is predecessor, + <a href="http://www.w3.org/TR/2007/REC-its-20070403/">ITS 1.0</a> + but provides additional concepts that are designed to foster the + automated creation and processing of multilingual Web content. ITS + 2.0 focuses on HTML, XML-based formats in general, and can leverage + processing based on the XML Localization Interchange File Format + (XLIFF), as well as the Natural Language Processing Interchange + Format (NIF). + </p></div><div> <h2><a name="status" id="status"></a>Status of this Document</h2><p> <em>This section describes the status of this document at the time of its publication. Other documents may supersede this document. 
A list of current W3C publications and the @@ -90,9 +93,9 @@ <div class="toc3">5.2.2 <a href="#selection-local">Local Selection in an XML Document</a></div> </div> <div class="toc2">5.3 <a href="#selectors">Query Language of Selectors</a><div class="toc3">5.3.1 <a href="#queryLanguage">Choosing Query Language</a></div> -<div class="toc3">5.3.2 <a href="#d0e2071">XPath 1.0</a></div> -<div class="toc3">5.3.3 <a href="#d0e2265">CSS Selectors</a></div> -<div class="toc3">5.3.4 <a href="#d0e2304">Additional query languages</a></div> +<div class="toc3">5.3.2 <a href="#d0e2033">XPath 1.0</a></div> +<div class="toc3">5.3.3 <a href="#d0e2224">CSS Selectors</a></div> +<div class="toc3">5.3.4 <a href="#d0e2263">Additional query languages</a></div> <div class="toc3">5.3.5 <a href="#its-param">Variables in selectors</a></div> </div> <div class="toc2">5.4 <a href="#link-external-rules">Link to External Rules</a></div> @@ -175,11 +178,10 @@ <div class="toc1">C <a href="#lqissue-typevalues">Values for the Localization Quality Issue Type</a></div> <div class="toc1">D <a href="#its-schemas">Schemas for ITS</a></div> <div class="toc1">E <a href="#informative-references">References</a> (Non-Normative)</div> -<div class="toc1">F <a href="#its-schematron-constraints">Checking ITS Markup Constraints With Schematron</a> (Non-Normative)</div> -<div class="toc1">G <a href="#nif-backconversion">Conversion NIF2ITS</a> (Non-Normative)</div> -<div class="toc1">H <a href="#list-of-elements-and-attributes">List of ITS 2.0 Global Elements and Local Attributes</a> (Non-Normative)</div> -<div class="toc1">I <a href="#revisionlog">Revision Log</a> (Non-Normative)</div> -<div class="toc1">J <a href="#acknowledgements">Acknowledgements</a> (Non-Normative)</div> +<div class="toc1">F <a href="#nif-backconversion">Conversion NIF2ITS</a> (Non-Normative)</div> +<div class="toc1">G <a href="#list-of-elements-and-attributes">List of ITS 2.0 Global Elements and Local Attributes</a> (Non-Normative)</div> +<div 
class="toc1">H <a href="#revisionlog">Revision Log</a> (Non-Normative)</div> +<div class="toc1">I <a href="#acknowledgements">Acknowledgements</a> (Non-Normative)</div> </div><hr/><div class="body"><div class="div1"> <h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="introduction" id="introduction"></a>1 Introduction</h2><p> <em>This section is informative.</em> @@ -261,11 +263,11 @@ <strong class="hl-tag" style="color: #000096"><rsrc</strong> <span class="hl-attribute" style="color: #F5844C">id</span>=<span class="hl-value" style="color: #993300">"123"</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><component</strong> <span class="hl-attribute" style="color: #F5844C">id</span>=<span class="hl-value" style="color: #993300">"456"</span> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"image"</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"text"</span><strong class="hl-tag" style="color: #000096">></strong>images/cancel.gif<strong class="hl-tag" style="color: #000096"></data></strong> - <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"coordinates"</span><strong class="hl-tag" style="color: #000096">></strong>12,20,50,14<strong class="hl-tag" style="color: #000096"></data></strong> + <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"position"</span><strong class="hl-tag" style="color: 
#000096">></strong>12,20<strong class="hl-tag" style="color: #000096"></data></strong> <strong class="hl-tag" style="color: #000096"></component></strong> <strong class="hl-tag" style="color: #000096"><component</strong> <span class="hl-attribute" style="color: #F5844C">id</span>=<span class="hl-value" style="color: #993300">"789"</span> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"caption"</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"text"</span><strong class="hl-tag" style="color: #000096">></strong>Cancel<strong class="hl-tag" style="color: #000096"></data></strong> - <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"coordinates"</span><strong class="hl-tag" style="color: #000096">></strong>12,34,50,14<strong class="hl-tag" style="color: #000096"></data></strong> + <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"position"</span><strong class="hl-tag" style="color: #000096">></strong>60,40<strong class="hl-tag" style="color: #000096"></data></strong> <strong class="hl-tag" style="color: #000096"></component></strong> <strong class="hl-tag" style="color: #000096"><component</strong> <span class="hl-attribute" style="color: #F5844C">id</span>=<span class="hl-value" style="color: #993300">"792"</span> <span class="hl-attribute" style="color: #F5844C">type</span>=<span class="hl-value" style="color: #993300">"string"</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><data</strong> <span class="hl-attribute" style="color: 
#F5844C">type</span>=<span class="hl-value" style="color: #993300">"text"</span><strong class="hl-tag" style="color: #000096">></strong>Number of files: <strong class="hl-tag" style="color: #000096"></data></strong> @@ -281,7 +283,7 @@ these users, the information about what markup should be supported to enable worldwide use and effective localization of content is provided in this specification in two ways:</p><ul><li><p>abstractly in the data category descriptions: <a class="section-ref" href="#datacategory-description">Section 8: Description of Data Categories</a></p></li><li><p>concretely in the ITS schemas: <a class="section-ref" href="#its-schemas">Appendix D: Schemas for ITS</a></p></li></ul><div class="div4"> -<h5><a name="schema-dev-new" id="schema-dev-new"></a>1.3.1.1Schema developers starting a schema from the ground up</h5><p>This type of user will find proposals for attribute and element names to be +<h5><a name="schema-dev-new" id="schema-dev-new"></a>1.3.1.1 Schema developers starting a schema from the ground up</h5><p>This type of user will find proposals for attribute and element names to be included in their new schema (also called "host vocabulary"). Using the attribute and element names proposed in the ITS specification may be helpful because it leads to easier recognition of the concepts represented by both schema users and @@ -289,7 +291,7 @@ own set of attribute and element names. 
The specification sets out, first and foremost, to ensure that the required markup is available, and that the behavior of that markup meets established needs.</p></div><div class="div4"> -<h5><a name="schema-dev-existing" id="schema-dev-existing"></a>1.3.1.2Schema developers working with an existing schema</h5><p>This type of user will be working with schemas such as DocBook, DITA, or perhaps a +<h5><a name="schema-dev-existing" id="schema-dev-existing"></a>1.3.1.2 Schema developers working with an existing schema</h5><p>This type of user will be working with schemas such as DocBook, DITA, or perhaps a proprietary schema. The ITS Working Group has sought input from experts developing widely used formats such as the ones mentioned.</p><div class="note"><p class="prefix"><b>Note:</b></p><p>The question "How to use ITS with existing popular markup schemes?" is covered in more details (including examples) in a separate document: <a title="Best
 Practices for XML Internationalization" href="#xml-i18n-bp">[XML i18n BP]</a>.</p></div><p>Developers working on existing schemas should check whether their schemas support @@ -301,7 +303,7 @@ should, however, check that the behavior associated with the markup in their own schema is fully compatible with the expectations described in this specification.</p></div><div class="div4"> -<h5><a name="content-tool-vendor" id="content-tool-vendor"></a>1.3.1.3Vendors of content-related tools</h5><p>This type of user includes companies which provide tools for authoring, translation +<h5><a name="content-tool-vendor" id="content-tool-vendor"></a>1.3.1.3 Vendors of content-related tools</h5><p>This type of user includes companies which provide tools for authoring, translation or other flavors of content-related software solutions. It is important to ensure that such tools enable worldwide use and effective localization of content. For example, translation tools should prevent content marked up as not for translation @@ -309,7 +311,7 @@ the job of vendors easier by standardizing the format and processing expectations of certain relevant markup items, and allowing them to more effectively identify how content should be handled.</p></div><div class="div4"> -<h5><a name="content-producers" id="content-producers"></a>1.3.1.4Content producers</h5><p>This type of user comprises authors, translators and other types of content author. +<h5><a name="content-producers" id="content-producers"></a>1.3.1.4 Content producers</h5><p>This type of user comprises authors, translators and other types of content author. The markup proposed in this specification may be used by them to mark up specific bits of content. Aside: The burden of inserting markup can be removed from content producers by relating the ITS information to relevant bits of content in a global @@ -322,7 +324,7 @@ way, apart from the ITS 2.0 standard. 
One way would be to allow HTML in these fields if possible, or using an extra field which allows HTML input and save the plain text of this extra field in the plain text field.</p></div><div class="div4"> -<h5><a name="users_machine-translation" id="users_machine-translation"></a>1.3.1.5Machine Translation Systems</h5><p>This type of service is intended for a broad user community ranging from developers +<h5><a name="users_machine-translation" id="users_machine-translation"></a>1.3.1.5 Machine Translation Systems</h5><p>This type of service is intended for a broad user community ranging from developers and integrators through translation companies and agencies, freelance translators and post-editors to ordinary translation consumers and other types of MT employment. Data categories are envisaged for supporting and guiding the different automated @@ -333,7 +335,7 @@ third party users, for example, provenance information and quality scoring, and add relevant information for follow-on tasks, processes and services, such as MT post-editing, MT training and MT terminological enhancement.</p></div><div class="div4"> -<h5><a name="users_text_analytics" id="users_text_analytics"></a>1.3.1.6Text Analytics</h5><p>These types of users fulfil the role of providing services for automatic generation +<h5><a name="users_text_analytics" id="users_text_analytics"></a>1.3.1.6 Text Analytics</h5><p>These types of users fulfil the role of providing services for automatic generation of metadata for improving localization, data integration or knowledge management workflows. 
This class of users comprises of developers and integrators of services that automate language technology tasks such as domain classification, named entity @@ -343,7 +345,7 @@ translation systems, search result relevance in information retrieval systems, as well as management and integration of unstructured data in knowledge management systems.</p></div><div class="div4"> -<h5><a name="users_localization_workflow_managers" id="users_localization_workflow_managers"></a>1.3.1.7Localization Workflow Managers</h5><p>This type of users is concerend with localization workflows in which content goes +<h5><a name="users_localization_workflow_managers" id="users_localization_workflow_managers"></a>1.3.1.7 Localization Workflow Managers</h5><p>This type of users is concerend with localization workflows in which content goes through certain steps: preparation for localization, start of the localization process by e.g. a conversion into a bitext format like <a title="" href="#xliff">[XLIFF]</a>, the actual localization by human translators or machine translation and other adaptations of content, and finally the integration of the @@ -479,10 +481,11 @@ <strong class="hl-tag" style="color: #000096"><link</strong> <span class="hl-attribute" style="color: #F5844C">href</span>=<span class="hl-value" style="color: #993300">EX-translateRule-html5-1.xml</span> <span class="hl-attribute" style="color: #F5844C">rel</span>=<span class="hl-value" style="color: #993300">its-rules</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"></head></strong> <strong class="hl-tag" style="color: #000096"><body></strong> - <strong class="hl-tag" style="color: #000096"><p></strong>This sentence should be translated, but code names like the <strong class="hl-tag" style="color: #000096"><code></strong>span<strong class="hl-tag" style="color: #000096"></code></strong> element should not be translated.<strong class="hl-tag" style="color: 
#000096"></p></strong> + <strong class="hl-tag" style="color: #000096"><p></strong>This sentence should be translated, but code names like the <strong class="hl-tag" style="color: #000096"><code></strong>span<strong class="hl-tag" style="color: #000096"></code></strong> element should not be translated. + Of course there are always exceptions: certain code values should be translated, + e.g. to a value in your language like <strong class="hl-tag" style="color: #000096"><code</strong> <span class="hl-attribute" style="color: #F5844C">translate</span>=<span class="hl-value" style="color: #993300">yes</span><strong class="hl-tag" style="color: #000096">></strong>warning<strong class="hl-tag" style="color: #000096"></code></strong>.<strong class="hl-tag" style="color: #000096"></p></strong> <strong class="hl-tag" style="color: #000096"></body></strong> -<strong class="hl-tag" style="color: #000096"></html></strong> -</pre></div><p>[Source file: <a href="examples/html5/EX-translate-html5-global-1.html">examples/html5/EX-translate-html5-global-1.html</a>]</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-translate-html5-global-1-rules-file" id="EX-translate-html5-global-1-rules-file"></a>Example 9: ITS rules file linked from HTML</div><p>The rules file linked in <a href="#EX-translate-html5-global-1">Example 8</a>.</p><div class="exampleInner"><pre><strong class="hl-tag" style="color: #000096"><its:rules</strong> <span class="hl-attribute" style="color: #F5844C">version</span>=<span class="hl-value" style="color: #993300">"2.0"</span> <span class="hl-attribute" style="color: #F5844C">xmlns:its</span>=<span class="hl-value" style="color: #993300">"http://www.w3.org/2005/11/its"</span> +<strong class="hl-tag" style="color: #000096"></html></strong></pre></div><p>[Source file: <a href="examples/html5/EX-translate-html5-global-1.html">examples/html5/EX-translate-html5-global-1.html</a>]</p></div><div class="exampleOuter"><div class="exampleHeader"><a 
name="EX-translate-html5-global-1-rules-file" id="EX-translate-html5-global-1-rules-file"></a>Example 9: ITS rules file linked from HTML</div><p>The rules file linked in <a href="#EX-translate-html5-global-1">Example 8</a>.</p><div class="exampleInner"><pre><strong class="hl-tag" style="color: #000096"><its:rules</strong> <span class="hl-attribute" style="color: #F5844C">version</span>=<span class="hl-value" style="color: #993300">"2.0"</span> <span class="hl-attribute" style="color: #F5844C">xmlns:its</span>=<span class="hl-value" style="color: #993300">"http://www.w3.org/2005/11/its"</span> <span class="hl-attribute" style="color: #F5844C">xmlns:h</span>=<span class="hl-value" style="color: #993300">"http://www.w3.org/1999/xhtml"</span><strong class="hl-tag" style="color: #000096">></strong> <strong class="hl-tag" style="color: #000096"><its:translateRule</strong> <span class="hl-attribute" style="color: #F5844C">translate</span>=<span class="hl-value" style="color: #993300">"no"</span> <span class="hl-attribute" style="color: #F5844C">selector</span>=<span class="hl-value" style="color: #993300">"//h:code"</span><strong class="hl-tag" style="color: #000096">/></strong> <strong class="hl-tag" style="color: #000096"></its:rules></strong> @@ -666,12 +669,10 @@ selected nodes. For example, the <a href="#locNote-datacat">Localization Note</a> data category can add information to selected nodes (using a <code class="its-elem-markup">locNote</code> element), or point to existing information elsewhere in the document (using a - <code class="its-attr-markup">locNotePointer</code> attribute).</p><p>The functionality of adding information to the selected nodes is available for each - data category except <a href="#language-information">Language Information</a>. 
- Pointing to existing information is not possible for data categories that express - <em>a closed set of values</em>; that is: <a href="#trans-datacat">Translate</a>, <a href="#directionality">Directionality</a>, <a href="#LocaleFilter">Locale Filter</a> and <a href="#elements-within-text">Elements Within Text</a>.</p><p>The functionalities of adding information and pointing to existing information are + <code class="its-attr-markup">locNotePointer</code> attribute).</p><p>The <a href="#datacategories-overview">data category overview table</a>, in <a class="section-ref" href="#datacategories-defaults-etc">Section 8.1: Position, Defaults, Inheritance and Overriding of Data Categories</a>, + provides an overview of what data categories allow to point to existing information or to add information.</p><p>The functionalities of adding information and pointing to existing information are <em>mutually exclusive</em>. That is to say, attributes for pointing and adding - must not appear at the same rule element.</p></div></div><div class="div1"> + the same information must not appear at the same rule element.</p></div></div><div class="div1"> <h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="notation-terminology" id="notation-terminology"></a>3 Notation and Terminology</h2><p> <em>This section is normative.</em> </p><div class="div2"> @@ -689,7 +690,7 @@ and localization of XML schemas and documents.] The concept of a data category is independent of its implementation in an XML and HTML environment (e.g. 
using an element or attribute).</p><p>For each data category, ITS distinguishes between the following:</p><ul><li><p>the prose description, see <a class="section-ref" href="#datacategory-description">Section 8: Description of Data Categories</a></p></li><li><p>schema language independent formalization, see the "implementation" subsections in - <a class="section-ref" href="#datacategory-description">Section 8: Description of Data Categories</a></p></li><li><p>schema language specific implementations, see <a class="section-ref" href="#its-schemas">Appendix D: Schemas for ITS</a></p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="d0e1177" id="d0e1177"></a>Example 13: A data category and its implementation</div><p>The <a href="#trans-datacat">Translate</a> data category conveys information as + <a class="section-ref" href="#datacategory-description">Section 8: Description of Data Categories</a></p></li><li><p>schema language specific implementations, see <a class="section-ref" href="#its-schemas">Appendix D: Schemas for ITS</a></p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="d0e1156" id="d0e1156"></a>Example 13: A data category and its implementation</div><p>The <a href="#trans-datacat">Translate</a> data category conveys information as to whether a piece of content should be translated or not.</p><p>The simplest formalization of this prose description on a schema language independent level is a <code class="its-attr-markup">translate</code> attribute with two possible values: "yes" and "no". An implementation on a schema language specific @@ -880,12 +881,9 @@ </p><div class="note"><p class="prefix"><b>Note:</b></p><p>Additional definitions about processing of HTML are given in <a class="section-ref" href="#html5-markup">Section 6: Using ITS Markup in HTML</a>.</p></div><div class="div2"> <h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="its-version-attribute" id="its-version-attribute"></a>5.1 Indicating the Version of ITS</h3><p>The version of the ITS schema defined in this specification is "2.0". The version is indicated by the ITS <code class="its-attr-markup">version</code> attribute. This attribute is - mandatory for the <code class="its-elem-markup">rules</code> element, where it <a href="#rfc-keywords">MUST</a> be in no namespace. If there is no <code class="its-elem-markup">rules</code> element in an XML - document, a prefixed ITS <code class="its-attr-markup">version</code> attribute (e.g. <code>its:version</code>) - <a href="#rfc-keywords">MUST</a> be provided at the root element of the - document. If there is both a <code class="its-attr-markup">version</code> attribute at the root element and a - <code class="its-elem-markup">rules</code> element in a document, they <a href="#rfc-keywords">MUST NOT</a> - specify different versions.</p><p>External, linked rules can have different versions than internal rules.</p></div><div class="div2"> + mandatory for the <code class="its-elem-markup">rules</code> element, where it <a href="#rfc-keywords">MUST</a> be in no namespace.</p><p>If there is no <code class="its-elem-markup">rules</code> element in an XML document, a prefixed ITS <code class="its-attr-markup">version</code> attribute (e.g. <code>its:version</code>) + <a href="#rfc-keywords">MUST</a> be provided on the element where the ITS markup is used, or on one of its ancestors. + There <a href="#rfc-keywords">MUST NOT</a> be two different versions of ITS in the same document.</p><p>External, linked rules can have different versions than internal rules.</p></div><div class="div2"> <h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="datacategory-locations" id="datacategory-locations"></a>5.2 Locations of Data Categories</h3><p>ITS data categories can appear in two places:</p><ul><li><p><a href="#selection-global">Global rules</a>: the selection is realized within a <code class="its-elem-markup">rules</code> element. It contains <a href="#rule-elements">rule elements</a> for each data category. Each rule element has a <code class="its-attr-markup">selector</code> @@ -905,13 +903,9 @@ existing information in the document. For example, the <a href="#locNote-datacat">Localization Note</a> data category can be used for adding notes to selected nodes, or for pointing to existing notes in the document. For the former purpose, a <code class="its-elem-markup">locNote</code> element can be used. For the latter purpose, a - <code class="its-attr-markup">locNotePointer</code> attribute can be used.</p><p>Each data category allows users to add information to the selected nodes except for - <a href="#language-information">language information</a>. Pointing to existing - information is not possible for data categories that express <em>a closed set of - values</em>, that is: <a href="#trans-datacat">Translate</a>, <a href="#directionality">Directionality</a>, <a href="#LocaleFilter">Locale - Filter</a>, and <a href="#elements-within-text">Elements Within - Text</a>.</p><p>The functionalities of adding information and pointing to existing information are - <em>mutually exclusive</em>. That is: markup for pointing and adding <a href="#rfc-keywords">MUST NOT</a> appear in the same rule element. 
</p><p>Global rules can appear in the XML document they will be applied to, or in a separate + <code class="its-attr-markup">locNotePointer</code> attribute can be used.</p><p>The <a href="#datacategories-overview">data category overview table</a>, in <a class="section-ref" href="#datacategories-defaults-etc">Section 8.1: Position, Defaults, Inheritance and Overriding of Data Categories</a>, + provides an overview of what data categories allow to point to existing information or to add information.</p><p>The functionalities of adding information and pointing to existing information are + <em>mutually exclusive</em>. That is: markup for pointing and adding the same information <a href="#rfc-keywords">MUST NOT</a> appear in the same rule element.</p><p>Global rules can appear in the XML document they will be applied to, or in a separate XML document. The precedence of their processing depends on these variations. See also <a class="section-ref" href="#selection-precedence">Section 5.5: Precedence between Selections</a>.</p></div><div class="div3"> <h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="selection-local" id="selection-local"></a>5.2.2 Local Selection in an XML Document</h4><p>Local selection in XML documents is realized with <a href="#local-attributes">ITS @@ -951,9 +945,9 @@ actual query language. The query language is set by <code class="its-attr-markup">queryLanguage</code> attribute on <code class="its-elem-markup">rules</code> element. If <code class="its-attr-markup">queryLanguge</code> is not specified XPath 1.0 is used as a default query language.</p></div><div class="div3"> -<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="d0e2071" id="d0e2071"></a>5.3.2 XPath 1.0</h4><p>XPath 1.0 is identified by <code>xpath</code> value in <code class="its-attr-markup">queryLanguage</code> +<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2033" id="d0e2033"></a>5.3.2 XPath 1.0</h4><p>XPath 1.0 is identified by <code>xpath</code> value in <code class="its-attr-markup">queryLanguage</code> attribute.</p><div class="div4"> -<h5><a name="d0e2082" id="d0e2082"></a>5.3.2.1Absolute selector</h5><p>The absolute selector <a href="#rfc-keywords">MUST</a> be an XPath expression +<h5><a name="d0e2044" id="d0e2044"></a>5.3.2.1 Absolute selector</h5><p>The absolute selector <a href="#rfc-keywords">MUST</a> be an XPath expression which starts with "<code>/</code>". That is, it must be an <a href="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath"> AbsoluteLocationPath</a> or union of <a href="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath"> AbsoluteLocationPath</a>s as described in <a href="#xpath">XPath 1.0</a>. @@ -975,10 +969,10 @@ <strong class="hl-tag" style="color: #000096"></its:rules></strong> </pre></div><p>[Source file: <a href="examples/xml/EX-selection-global-2.xml">examples/xml/EX-selection-global-2.xml</a>]</p></div></div><div class="div4"> -<h5><a name="xpath-relative-selector" id="xpath-relative-selector"></a>5.3.2.2Relative selector</h5><p>The relative selector <a href="#rfc-keywords">MUST</a> use a <a href="http://www.w3.org/TR/xpath/#NT-RelativeLocationPath">RelativeLocationPath</a> or an <a href="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath">AbsoluteLocationPath</a> as described in <a href="#xpath">XPath 1.0</a>. 
+<h5><a name="xpath-relative-selector" id="xpath-relative-selector"></a>5.3.2.2 Relative selector</h5><p>The relative selector <a href="#rfc-keywords">MUST</a> use a <a href="http://www.w3.org/TR/xpath/#NT-RelativeLocationPath">RelativeLocationPath</a> or an <a href="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath">AbsoluteLocationPath</a> as described in <a href="#xpath">XPath 1.0</a>. The XPath expression is evaluated relative to the nodes selected by the selector attribute.</p><p id="pointer-attributes-list">The following attributes point to existing - information: <code class="its-attr-markup">allowedCharactersPointer</code>, <code class="its-attr-markup">disambigClassPointer</code>, + information: <code class="its-attr-markup">allowedCharactersPointer</code>, <code class="its-attr-markup">disambigClassRefPointer</code>, <code class="its-attr-markup">disambigIdentPointer</code>, <code class="its-attr-markup">disambigIdentRefPointer</code>, <code class="its-attr-markup">disambigSourcePointer</code>, <code class="its-attr-markup">domainPointer</code>, <code class="its-attr-markup">externalResourceRefPointer</code>, <code class="its-attr-markup">langPointer</code>, @@ -990,18 +984,18 @@ with the following changes:</p><ul><li><p>Nodes selected by the expression in the <code class="its-attr-markup">selector</code> attribute form the current node list.</p></li><li><p>Context node comes from the current node list.</p></li><li><p>The context position comes from the position of the current node in the current node list; the first position is 1.</p></li><li><p>The context size comes from the size of the current node list.</p></li></ul></div></div><div class="div3"> -<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="d0e2265" id="d0e2265"></a>5.3.3 CSS Selectors</h4><div class="note"><p class="prefix"><b>Note:</b></p><p id="css-selectors-feature-at-risk">As of writing the working group has no +<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2224" id="d0e2224"></a>5.3.3 CSS Selectors</h4><div class="note"><p class="prefix"><b>Note:</b></p><p id="css-selectors-feature-at-risk">As of writing the working group has no implememtation commitment for CSS selectors. If this doesn't change CSS selectors will be marked as feature at risk for the candidate recommendation draft.</p></div><p>CSS Selectors are identified by <code>css</code> value in <code class="its-attr-markup">queryLanguage</code> attribute.</p><div class="div4"> -<h5><a name="d0e2279" id="d0e2279"></a>5.3.3.1Absolute selector</h5><p>Absolute selector <a href="#rfc-keywords">MUST</a> be interpreted as selector +<h5><a name="d0e2238" id="d0e2238"></a>5.3.3.1 Absolute selector</h5><p>Absolute selector <a href="#rfc-keywords">MUST</a> be interpreted as selector as defined in <a href="#css3-selectors">Selectors Level 3</a>. Both simple selectors and groups of selectors can be used.</p></div><div class="div4"> -<h5><a name="d0e2290" id="d0e2290"></a>5.3.3.2Relative selector</h5><p>Relative selector <a href="#rfc-keywords">MUST</a> be interpreted as selector +<h5><a name="d0e2249" id="d0e2249"></a>5.3.3.2 Relative selector</h5><p>Relative selector <a href="#rfc-keywords">MUST</a> be interpreted as selector as defined in <a href="#css3-selectors">Selectors Level 3</a>. 
Selector is not evaluated against the complete document tree but only against subtrees rooted at nodes selected by selector in the <code class="its-attr-markup">selector</code> attribute.</p></div></div><div class="div3"> -<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2304" id="d0e2304"></a>5.3.4 Additional query languages</h4><p>ITS processors <a href="#rfc-keywords">MAY</a> support additional query +<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d0e2263" id="d0e2263"></a>5.3.4 Additional query languages</h4><p>ITS processors <a href="#rfc-keywords">MAY</a> support additional query languages. For each additional query language processor <a href="#rfc-keywords">MUST</a> define:</p><ul><li><p>identifier of query language used in <code class="its-attr-markup">queryLanguage</code>;</p></li><li><p>rules for evaluating absolute selector to collection of nodes;</p></li><li><p>rules for evaluating relative selector to collection of nodes.</p></li></ul><p>Future versions of this specification <a href="#rfc-keywords">MAY</a> define additional query languages. 
The following query language identifiers are reserved: <code>xpath</code>, <code>css</code>, <code>xpath2</code>, <code>xpath3</code>, @@ -1095,8 +1089,7 @@ </pre></div><p>[Source file: <a href="examples/xml/EX-link-external-rules-4.xml">examples/xml/EX-link-external-rules-4.xml</a>]</p></div><p>Applications processing global ITS markup <a href="#rfc-keywords">MUST</a> recognize the XLink <code class="its-attr-markup">href</code> attribute in the <code class="its-elem-markup">rules</code> element; they <a href="#rfc-keywords">MUST</a> load the corresponding referenced document and process its rules element before processing the content of the <code class="its-elem-markup">rules</code> element - where the original XLink <code class="its-attr-markup">href</code> attribute is.</p><p>External rules may also have links to other external rules. The linking mechanism is - recursive, the deepest rules being overridden by the top-most rules, if any.</p></div><div class="div2"> + where the original XLink <code class="its-attr-markup">href</code> attribute is.</p><p>External rules may also have links to other external rules (see <a href="#EX-link-external-rules-2">Example 20</a>). The linking mechanism is recursive in a depth-first approach, and subsequently after the processing the rules MUST be read top-down (see <a href="#EX-link-external-rules-3">Example 21</a>).</p></div><div class="div2"> <h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="selection-precedence" id="selection-precedence"></a>5.5 Precedence between Selections</h3><p>The following precedence order is defined for selections of ITS information in various positions (the first item in the list has the highest precedence):</p><ol class="depth1"><li><p>Selection via explicit (that is, not inherited) local ITS markup in documents (<a href="#local-attributes">ITS local attributes</a> on @@ -1192,77 +1185,77 @@ tuple with the XPath expressions (X,T). Since the text nodes have a certain order we now have a list of ordered tuples ((x0,t0), (x1,t1), ..., (xn,tn)).</p></li><li><p id="its2nif-algorithm-step4">STEP 4 (optional): Serialize as XML or as RDF. The list with the XPath-to-text mapping can also be kept in memory. Part of a - serialization example is given below.</p></li></ul><div class="exampleInner"><div class="exampleOuter"><pre>@prefix itsrdf: <strong class="hl-tag" style="color: #000096"><http:</strong>//www.w3.org/2005/11/its/rdf#> . -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(x0)> - itsrdf:xpath2nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_b0_e0> -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(x1)> - itsrdf:xpath2nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_b1_e1> + serialization example is given below.</p></li></ul><div class="exampleInner"><div class="exampleOuter"><pre>@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> . +<http://example.com/exampledoc.html#xpath(x0)> + itsrdf:xpath2nif <http://example.com/exampledoc.html#offset_b0_e0> +<http://example.com/exampledoc.html#xpath(x1)> + itsrdf:xpath2nif <http://example.com/exampledoc.html#offset_b1_e1> # ... 
-<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(xn)> - itsrdf:xpath2nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_bn_en> -<strong class="hl-tag" style="color: #000096"><mappings></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(x0)"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"b0"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"e0"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(x1)"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"b1"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"e1"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <em class="hl-comment" style="color: silver"><!-- ... 
--></em> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(xn)"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"bn"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"en"</span><strong class="hl-tag" style="color: #000096"> /></strong> -<strong class="hl-tag" style="color: #000096"></mappings></strong></pre></div></div><p>where</p><div class="exampleInner"><div class="exampleOuter"><pre>b0 = 0 +<http://example.com/exampledoc.html#xpath(xn)> + itsrdf:xpath2nif <http://example.com/exampledoc.html#offset_bn_en> +<mappings> + <mapping x="xpath(x0)" b="b0" e="e0" /> + <mapping x="xpath(x1)" b="b1" e="e1" /> + <!-- ... --> + <mapping x="xpath(xn)" b="bn" e="en" /> +</mappings></pre></div></div><p>where</p><div class="exampleInner"><div class="exampleOuter"><pre>b0 = 0 e0 = b0 + (Number of characters of t0) b1 = e0 +1 e1 = b1 + (Number of characters of t1) ... bn = e(n-1) +1 en = bn + (Number of characters of tn) -</pre></div></div><p>Example (continued)</p><div class="exampleInner"><div class="exampleOuter"><pre>@prefix itsrdf: <strong class="hl-tag" style="color: #000096"><http:</strong>//www.w3.org/2005/11/its/rdf#> . +</pre></div></div><p>Example (continued)</p><div class="exampleInner"><div class="exampleOuter"><pre>@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> . # "Welcome to " -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[1])> - itsrdf:nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_0_11> . +<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[1])> + itsrdf:nif <http://example.com/exampledoc.html#offset_0_11> . 
# "Dublin" -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/span[1]/text()[1])> - itsrdf:nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_11_17> . +<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/span[1]/text()[1])> + itsrdf:nif <http://example.com/exampledoc.html#offset_11_17> . # " in " -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[2])> - itsrdf:nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_17_21> . +<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[2])> + itsrdf:nif <http://example.com/exampledoc.html#offset_17_21> . # "Ireland" -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/b[1]/text()[1])> - itsrdf:nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_21_28> . +<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/b[1]/text()[1])> + itsrdf:nif <http://example.com/exampledoc.html#offset_21_28> . # "!" -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[3])> - itsrdf:nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_28_29> . +<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text()[3])> + itsrdf:nif <http://example.com/exampledoc.html#offset_28_29> . # "Welcome to Dublin Ireland!" -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text())> - itsrdf:nif <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_0_29> . 
-<strong class="hl-tag" style="color: #000096"><mappings></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(/html/body[1]/h2[1]/text()[1])"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"0"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"11"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(/html/body[1]/h2[1]/span[1]/text()[1])"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"11"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"17"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(/html/body[1]/h2[1]/text()[2])"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"17"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"21"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(/html/body[1]/h2[1]/b[1]/text()[1])"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"21"</span> <span class="hl-attribute" style="color: 
#F5844C">e</span>=<span class="hl-value" style="color: #993300">"28"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(/html/body[1]/h2[1]/text()[3])"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"28"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"29"</span><strong class="hl-tag" style="color: #000096"> /></strong> - <strong class="hl-tag" style="color: #000096"><mapping</strong> <span class="hl-attribute" style="color: #F5844C">x</span>=<span class="hl-value" style="color: #993300">"xpath(/html/body[1]/h2[1])"</span> <span class="hl-attribute" style="color: #F5844C">b</span>=<span class="hl-value" style="color: #993300">"0"</span> <span class="hl-attribute" style="color: #F5844C">e</span>=<span class="hl-value" style="color: #993300">"29"</span><strong class="hl-tag" style="color: #000096"> /></strong> -<strong class="hl-tag" style="color: #000096"></mappings></strong></pre></div></div><ul><li><p id="its2nif-algorithm-step5">STEP 5: Create a context URI and attach the +<http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/text())> + itsrdf:nif <http://example.com/exampledoc.html#offset_0_29> . 
+<mappings> + <mapping x="xpath(/html/body[1]/h2[1]/text()[1])" b="0" e="11" /> + <mapping x="xpath(/html/body[1]/h2[1]/span[1]/text()[1])" b="11" e="17" /> + <mapping x="xpath(/html/body[1]/h2[1]/text()[2])" b="17" e="21" /> + <mapping x="xpath(/html/body[1]/h2[1]/b[1]/text()[1])" b="21" e="28" /> + <mapping x="xpath(/html/body[1]/h2[1]/text()[3])" b="28" e="29" /> + <mapping x="xpath(/html/body[1]/h2[1])" b="0" e="29" /> +</mappings></pre></div></div><ul><li><p id="its2nif-algorithm-step5">STEP 5: Create a context URI and attach the whole concatenated text of the document as reference.</p></li><li><p id="its2nif-algorithm-step6">STEP 6: Now attach any ITS metadata items from the XML/HTML/DOM input to respective NIF URIs.</p></li><li><p id="its2nif-algorithm-step7">STEP 7: Omit all irrelevant URIs (those that - do not carry annotations, they will just bloat the data).</p></li></ul><div class="exampleInner"><div class="exampleOuter"><pre>@prefix itsrdf: <strong class="hl-tag" style="color: #000096"><http:</strong>//www.w3.org/2005/11/its/rdf#> . -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_0_29> + do not carry annotations, they will just bloat the data).</p></li></ul><div class="exampleInner"><div class="exampleOuter"><pre>@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> . +<http://example.com/exampledoc.html#offset_0_29> rdf:type str:Context ; rdf:type str:OffsetBasedString ; # concatenate the whole text str:isString "$(t0+t1+t2+...+tn)" ; - itsrdf:translate "yes"^^<strong class="hl-tag" style="color: #000096"><http:</strong>//www.w3.org/TR/its-2.0/its.xsd#yesOrNo> ; - str:occursIn <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html> . -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_11_17> + itsrdf:translate "yes"^^<http://www.w3.org/TR/its-2.0/its.xsd#yesOrNo> ; + str:occursIn <http://example.com/exampledoc.html> . 
+<http://example.com/exampledoc.html#offset_11_17> rdf:type str:String ; rdf:type str:OffsetBasedString ; - itsrdf:translate "no"^^<strong class="hl-tag" style="color: #000096"><http:</strong>//www.w3.org/TR/its-2.0/its.xsd#yesOrNo> ; - itsrdf:disambigIdentRef <strong class="hl-tag" style="color: #000096"><http:</strong>//dbpedia.org/resource/Dublin> ; - str:referenceContext <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_0_29> . -<strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_21_28> + itsrdf:translate "no"^^<http://www.w3.org/TR/its-2.0/its.xsd#yesOrNo> ; + itsrdf:disambigIdentRef <http://dbpedia.org/resource/Dublin> ; + str:referenceContext <http://example.com/exampledoc.html#offset_0_29> . +<http://example.com/exampledoc.html#offset_21_28> rdf:type str:String ; rdf:type str:OffsetBasedString ; - itsrdf:translate "no"^^<strong class="hl-tag" style="color: #000096"><http:</strong>//www.w3.org/TR/its-2.0/its.xsd#yesOrNo> ; - str:referenceContext <strong class="hl-tag" style="color: #000096"><http:</strong>//example.com/exampledoc.html#offset_0_29> . + itsrdf:translate "no"^^<http://www.w3.org/TR/its-2.0/its.xsd#yesOrNo> ; + str:referenceContext <http://example.com/exampledoc.html#offset_0_29> . </pre></div></div><p>A complete sample output in RDF/XML format after step 7, given the input document <a href="#EX-HTML-whitespace-normalization">Example 25</a>, is available at <a href="examples/nif/EX-nif-conversion-output.xml">examples/nif/EX-nif-conversion-output.xml</a>.</p><div class="note"><p class="prefix"><b>Note:</b></p><p>The conversion to NIF is the basis for natural language processing (NLP) applications, creating for example named entity annotations. A non-normative algorithm - to integrate these annotations into the original input document is given in <a class="section-ref" href="#nif-backconversion">Appendix G: Conversion NIF2ITS</a>. 
The algorithm in that appendix is + to integrate these annotations into the original input document is given in <a class="section-ref" href="#nif-backconversion">Appendix F: Conversion NIF2ITS</a>. The algorithm in that appendix is non-normative since many choices depend on the actual NLP application.</p></div></div><div class="div2"> <h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="its-tool-annotation" id="its-tool-annotation"></a>5.8 ITS Tools Annotation</h3><p>In some cases, it may be important for instances of data categories to be associated with information about the processor that generated them. For example, the score of the @@ -1278,7 +1271,7 @@ individual data categories in a document, independently from data category annotations themselves.</p><p>The attribute <code class="its-attr-markup">annotatorsRef</code> provides a way to associate all the annotations of a [1271 lines skipped]
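Editorial sketch (not part of the commit): the offset bookkeeping described in steps 3 and 4 of the ITS-to-NIF conversion can be illustrated with a few lines of Python. The function name `nif_offsets` and the variable names are hypothetical and chosen only for this illustration. The sketch assumes that each begin offset equals the previous end offset, which is what the worked example in the diff shows (offset_0_11, offset_11_17, ...), even though the generic formula block writes `b1 = e0 +1`. It is a minimal illustration under that assumption, not the working group's implementation.

```python
def nif_offsets(texts):
    """Return (begin, end) character offsets for consecutive text nodes.

    Assumption: the next span begins where the previous one ended,
    matching the worked example (offset_0_11, offset_11_17, ...).
    """
    spans = []
    begin = 0
    for text in texts:
        end = begin + len(text)  # end = begin + number of characters
        spans.append((begin, end))
        begin = end
    return spans


# Text nodes of the <h2> element from the worked example, in document order.
text_nodes = ["Welcome to ", "Dublin", " in ", "Ireland", "!"]
spans = nif_offsets(text_nodes)

# Build the offset-based fragment URIs used in the example output.
base = "http://example.com/exampledoc.html"
uris = [f"{base}#offset_{b}_{e}" for b, e in spans]
```

With the five text nodes above, the computed spans line up with the `b` and `e` attributes of the `mapping` elements in the example, and the last end offset gives the length of the context string.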
Received on Thursday, 21 March 2013 13:54:06 UTC