- From: Yves Savourel via cvs-syncmail <cvsmail@w3.org>
- Date: Tue, 16 Oct 2012 13:17:09 +0000
- To: public-multilingualweb-lt-commits@w3.org
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20 In directory hutz:/tmp/cvs-serv7132 Modified Files: its20.html its20.odd Log Message: Implemented Disambiguation updates Index: its20.odd =================================================================== RCS file: /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd,v retrieving revision 1.185 retrieving revision 1.186 diff -u -d -r1.185 -r1.186 --- its20.odd 15 Oct 2012 13:05:33 -0000 1.185 +++ its20.odd 16 Oct 2012 13:17:07 -0000 1.186 @@ -3146,67 +3146,41 @@ <note type="ed">This data category is not completely stable yet.</note> <div xml:id="Disambiguation-definition"> <head>Definition</head> - <p>The <ref target="#Disambiguation">Disambiguation</ref>ref> data category - is used to indicate occurrences of specific concepts that may require - special handling in the localization process.</p> + <p>The <ref target="#Disambiguation">Disambiguation</ref> data category is used to + indicate occurrences of specific concepts that may require special + handling in the localization of the document.</p> <p>This data category can be used for several purposes, including, but not limited to:</p> <list type="unordered"> - <item>Informing translation systems that a fragment of text may be - subject to specific rules (e..g., concerning the translation of - proper names) or that it is has an official translation</item> - <item>Informing translation systems concerning the specific meaning of - phrases.</item> - <item>Informing content management and translation systems about the - type of underlying entity in order to enable processing based on a - specific type of the target, for example, personal names, product or - geographic names, chemical compounds, protein names, and so - forth.</item> - </list> - <p>We introduce the following concepts:</p> - <list type="unordered"> - <item>Entity Type Source: a domain of valid values, an identifier - collection for entity types. 
Unless specified, it will be derived by - default de-referencing mechanisms for the URI.</item> - <item>Entity Type: the type of the entity, being one of values within - the entity type source identifier collection.</item> - <item>Disambiguation type: the level of disambiguation (lexical concept, - ontology concept, entity). The disambiguation can happen at multiple - levels. For instance, the level of lexical concepts disambiguates - individual word surface forms, the level of ontology concepts - disambiguates into deeper semantics, and the entity disambiguation - works on the level of concrete instances. For instance, the word - <quote>City</quote> in <quote>I am going to the City</quote> may - be disambiguated in one of the WordNet synsets that can be - represented by <quote>city</quote>, an RDF ontology concept of a - City that could represent a subclass of a PopulatedPlace, or the - center area of a particular city, e.g. London City.</item> - <item>Disambiguation Source: the identifier collection source used for - locating the correct underlying identifier. It can be anything that - can representing a collection of identifiers for words, concepts or - entities, for instance, a knowledge base, an ontology or semantic - network. 
Unless specified, it will be derived by default - de-referencing mechanisms for the URI.</item> - <item>Disambiguation Identifier: an identifier, unique within the - current disambiguation identifier collection, specifying the actual - identifier (meaning, concept or entity) behind the selected - content.</item> + <item>Informing translation systems that this fragment of text may not be literally translated, but subject to specific proper name translation rules or official translations, as well as a very specific meaning of the phrases.</item> + <item>Informing content management and translation systems about the type of the underlying entity in order to enable processing based on a specific type of the target, for example, when handling personal names, product names or geographic names, chemical compounds, protein names and similar.</item> </list> - <p>Two types of Disambiguation data categories are needed to identify:</p> + <p>Disambiguation is achieved by associating a selected fragment of text with an external web resource that can be referenced + by a translation or linguistic review agent in order to access the correct meaning or lexical use of the text and thereby + informing its translation.</p> + <p>A fragment of text can be disambiguated at different granularities, i.e. as a lexical concept, as an ontology concept, + or as a named entity.</p> + <p>As a lexical concept, the external reference can provide synonyms and example usage, e.g. using + service such as Wordnet.</p> + <p>As an ontology concept, the external reference can provide a formal conceptual definition within + a framework of related concepts.</p> + <p>As a named entity, the external reference can provide a description of the real world entity the text intends + to convey. 
For instance, the word 'City' in 'I am going to the City' may be disambiguated in one of the WordNet + synsets that can be represented by 'city', an ontology concept of a City that could represent a subclass of a + “PopulatedPlace” in the conceptual granularity level, or the central area of a particular city, e.g. City of London, + as interpreted in the entity granularity level. Linked data network, such as DBpedia, increasing interlink ontological + and named entity definitions for the same things as authored in different languages, offering a mechanism to + locate translations from the source language description.</p> + + <p>Two types of disambiguation are needed to identify:</p> <list type="unordered"> - <item>Entity type, which describes the type of the underlying entity - within a particular domain of types, as specified by the type source - identifier collection.</item> - <item>Disambiguation, which describes the actual underlying identifier - or meaning that the mention refers to, either in a knowledge base, - ontology or in a semantic network.</item> + <item>Disambiguation type class, which describes the type class of the underlying concept or entity of the fragment.</item> + <item>Disambiguation, which describes the actual underlying external resource that conveys the intended meaning of the fragment.</item> </list> - <p>Text analysis engines, such as named entity recognizers, named entity, - concept and word sense disambiguators can offer an easy way to create - this information. Content management tools can present and visualize - this information or use it to index their content. Machine translations - systems may use it for training and translation when dealing with proper - names and edge cases.</p> + <p>Text analysis engines, such as named entity recognizers, named entity, concept and word sense disambiguation + components can offer an easy way to create this information. 
Content management tools can present and visualize + this information or use it to index their content. Machine translations systems may use it for training and translation + when dealing with proper names and edge cases.</p> </div> <div xml:id="Disambiguation-implementation"> <head>Implementation</head> @@ -3214,60 +3188,52 @@ be expressed with global rules, or locally on an individual element. The information applies to the textual content of the element. There is no inheritance. The entity type follows inheritance rules.</p> + <note type="ed">The two sentences above seem contradictory.</note> <p xml:id="disambiguation-global">GLOBAL: The <gi>disambiguationRule</gi> element contains the following:</p> <list type="unordered"> <item>A required <att>selector</att> attribute. It contains an <ref target="#selectors">absolute selector</ref> which selects the nodes to which this rule applies.</item> - <item>An optional <att>entityTypeSourceRef</att> attribute that contains - a URI specifying the concrete identifier data source (knowledge - base, semantic network), used to determine the entity type.</item> - <item>An optional <att>entityTypeSourcePointer</att> attribute that - contains a relative XPath expression pointing to a node that - represents the identifier data source (knowledge base, semantic - network), used to determine the entity type.</item> - <item>An optional <att>entityTypeSourceRefPointer</att> attribute that - contains a relative XPath expression pointing to a node that holds - the URI that represents the identifier data source (knowledge base, - semantic network), used to determine the entity type.</item> - <item>An optional <att>entityTypeRef</att> attribute that contains a - URI, specifying the entity type behind the selector.</item> - <item>An optional <att>entityTypePointer</att> attribute that contains a - relative XPath expression pointing to a node specifying the entity - type behind the selector.</item> - <item>An optional 
<att>entityTypeRefPointer</att> attribute that - contains a relative XPath expression pointing to a node that holds - the URI that specifies the entity type behind the selector.</item> - <item>An optional <att>disambigType</att> attribute that contains a - string, specifying the specific semantics of the disambiguation. It - can be one of "lexicalConcept", "ontologyConcept", or - "entity".</item> - <item>An optional <att>disambigSourceRef</att> attribute. It contains a - URI representing the disambiguation identifier collection - source.</item> - <item>An optional <att>disambigSourcePointer</att> attribute. It - contains a relative XPath expression pointing to a node that - represents the disambiguation identifier collection source.</item> - <item>An optional <att>disambigSourceRefPointer</att> attribute. It - contains a relative XPath expression pointing to a node that holds - the URI that represents the disambiguation identifier collection - source.</item> - <item>An optional <att>disambigIdentRef</att> attribute. It contains a - URI that represents a unique identifier within the identifier - collection.</item> - <item>An optional <att>disambigIdentPointer</att> attribute. It contains - a relative XPath expression pointing to a node that represents a - unique identifier within the identifier collection.</item> - <item>An optional <att>disambigIdentRefPointer</att> attribute. 
It - contains a relative XPath expression pointing to a node that - represents a unique identifier within the identifier - collection.</item> + <item>None of exactly one of the following: + <list> + <item>A <att>disambigClassRef</att> attribute that contains a URI, specifying the type class of the concept + or entity behind the selector.</item> + <item>A <att>disambigClassPointer</att> attribute that contains a <ref target="#selectors">relative selector</ref> + pointing to a node specifying the entity type class behind the selector.</item> + <item>A <att>disambigClassRefPointer</att> attribute that contains a <ref target="#selectors">relative selector</ref> + pointing to a node that holds a URI that specifies the entity type class behind the selector.</item> + </list> + </item> + <item>An optional <att>disambigGranularity</att> attribute that contains a string, specifying the granularity + level of the disambiguation. The value can be one of the following identifiers: + <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</item> + <item>An optional <att>disambigSource</att> attribute. It contains a string representing the disambiguation + identifier collection source.</item> + <item>None of exactly one of the following: + <list> + <item>A <att>disambigIdent</att> attribute. It contains a string that represents the disambiguation + identifier for the disambiguation target that is valid within the specified Disambiguation Source.</item> + <item>A <att>disambigIdentRef</att> attribute. It contains an URI that represents a unique identifier + for the disambiguation target.</item> + <item>A <att>disambigIdentPointer</att> attribute. It contains a <ref target="#selectors">relative selector</ref> + pointing to a node that represents a unique identifier for the disambiguation target.</item> + <item>a <att>disambigIdentRefPointer</att> attribute. 
It contains a <ref target="#selectors">relative selector</ref> + pointing to a node that holds a URI that represents a unique identifier for the disambiguation target.</item> + </list> + </item> + </list> + <p>When using a disambiguation rule, the user <ref target="#rfc2119">MUST</ref> use one of the use cases for disambiguation: + specifying the target type, or specifying the target identity. + For the latter, the user <ref target="#rfc2119">MUST</ref> use only one of the two addressing modes:</p> + <list> + <item>Using <att>disambigSource</att> and <att>disambigIdent</att> to specify the collection and the identifier itself.</item> + <item>Using one of <att>disambigIdentRef</att>, <att>disambigIdentPointer</att> or <att>disambigIdentRefPointer</att> using + a URI for the disambiguation target.</item> </list> <exemplum xml:id="EX-disambiguation-global-1"> - <head>Usage of <att>entityTypeSourceRef</att>, <att>enttiyTypeRef</att>, - <att>disambigSourceRef</att>, <att>disambigIdentRef</att> for - both entity and word sense disambiguation.</head> + <head>Usage of <att>entityTypeSourceRef</att>, <att>enttiyTypeRef</att>, <att>disambigSourceRef</att>, + <att>disambigIdentRef</att> for both entity and word sense disambiguation.</head> <egXML xmlns="http://www.tei-c.org/ns/Examples" target="examples/xml/EX-disambiguation-global-1.xml"/> </exemplum> @@ -3276,49 +3242,43 @@ available for the <ref target="#Disambiguation">Disambiguation</ref> data category:</p> <list type="unordered"> - <item>An optional <att>entityTypeSourceRef</att> attribute that contains - an URI specifying the concrete identifier data source (knowledge - base, semantic network), used to determine the entity type.</item> - <item>An optional <att>entityTypeRef</att> attribute that contains a URI - specifying the entity type behind the selector.</item> - <item>An optional <att>disambigType</att> attribute that contains a - string, specifying the specific semantics of the disambiguation. 
It - can be one of "lexicalConcept", "ontologyConcept", or - "entity".</item> - <item>An optional <att>disambigSourceRef</att> attribute. It contains a - URI representing the disambiguation identifier collection - source.</item> - <item>An optional <att>disambigIdentRef</att> attribute. It contains a - URI that represents a unique identifier within the identifier - collection.</item> + <item>An optional <att>disambigClassRef</att> attribute that contains a URI, specifying the type class + of the concept or entity behind the selector.</item> + <item>An optional <att>disambigGranularity</att> attribute that contains a string, specifying the + granularity level of the disambiguation. The value can be one of the following identifiers: + <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code></item> + <item>An optional <att>disambigSource</att> attribute. It contains a string representing the + disambiguation identifier collection source.</item> + <item>An optional <att>disambigIdent</att> attribute. It contains a string, representing the + disambiguation identifier for the disambiguation target that is valid within the specified Disambiguation Source.</item> + <item>An optional <att>disambigIdentRef</att> attribute. 
It contains a URI that represents a unique + identifier for the disambiguation target.</item> + </list> + <p>The user <ref target="#rfc2119">MUST</ref> use only one of the two addressing modes for disambiguation:</p> + <list> + <item>Using <att>disambigSource</att> and <att>disambigIdent</att> to specify the collection + and the identifier itself.</item> + <item>Using <att>disambigIdentRef</att> using a URI for the disambiguation target</item> </list> <exemplum xml:id="EX-disambiguation-html5-local-1"> - <head>Local mixed usage of <att>entityTypeSourceRef</att>, - <att>enttiyTypeRef</att>, <att>disambigSourceRef</att>, - <att>disambigIdentRef</att> in HTML.</head> + <head>Local mixed usage of <att>entityTypeSourceRef</att>, <att>enttiyTypeRef</att>, + <att>disambigSourceRef</att>, <att>disambigIdentRef</att> in HTML.</head> <egXML xmlns="http://www.tei-c.org/ns/Examples" type="html5" target="examples/html5/EX-disambiguation-html5-local-1.html"/> </exemplum> <note> - <p>While the <att>entityTypeSourceRef</att> attribute allows for an - arbitrary domain of entity types, the implementors are encouraged to - use an existing repository of entity types as long as they satisfy - their requirements. 
For example, the Named Entity Recognition and - Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p> - <p>The distinction between disambiguating word sense and entities is - mainly in the different semantics: whereas word sense disambiguation - targets literal words and their senses on the lexical level, entity - disambiguation targets real-world concepts that are behind the - selected phrases on the conceptual level.</p> - <p>When serializing the ITS markup in HTML5, the preferred way is to - serialize in RDFa Lite or Microdata due to the existing search and - crawling infrastructure that is able to consume this kind of - data.</p> + <p>For referring to <att>disambigClassRef</att> values, implementors are encouraged to use an existing + repository of entity types as long as they satisfy their requirements. For example, + the Named Entity Recognition and Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p> + <p>Furthermore, valid target types depend on the disambiguation granularity: types of entities are distinct + from types of lexical concepts or ontology concepts. 
While this distinction exists, the specification does not prescribe + a way of automatically inferring a disambiguation level from a target type.</p> + <p>When serializing the ITS mark-up in HTML5, the preferred way is to serialize in RDFa Lite or Microdata due + to the existing search and crawling infrastructure that is able to consume this kind of data.</p> </note> <exemplum xml:id="EX-disambiguation-html5-rdfa"> - <head>Local mixed usage of <att>entityTypeSourceRef</att>, - <att>entityTypeRef</att>, <att>disambigSourceRef</att>, - <att>disambigIdentRef</att> in HTML+RDFa Lite</head> + <head>Local mixed usage of <att>entityTypeSourceRef</att>, <att>enttiyTypeRef</att>, <att>disambigSourceRef</att>, + <att>disambigIdentRef</att> in HTML+RDFa Lite.</head> <p>See <ptr target="#EX-disambiguation-html5-rdfa-companion-document" type="exref"/> for the companion document with the mapping data.</p> @@ -3326,11 +3286,8 @@ target="examples/html5/EX-disambiguation-html5-rdfa.html"/> </exemplum> <exemplum xml:id="EX-disambiguation-html5-rdfa-companion-document"> - <head>Local mixed usage of <att>entityTypeSourceRef</att>, - <att>entityTypeRef</att>, <att>disambigSourceRef</att>, - <att>disambigIdentRef</att> in HTML+RDFa Lite</head> - <p>Companion document, having the mapping data for <ptr - target="#EX-disambiguation-html5-rdfa" type="exref"/>.</p> + <head>Companion document, having the mapping data for <ptr + target="#EX-disambiguation-html5-rdfa" type="exref"/>.</head> <egXML xmlns="http://www.tei-c.org/ns/Examples" target="examples/html5/EX-disambiguation-html5-rdfa.xml"/> </exemplum> Index: its20.html =================================================================== RCS file: /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.html,v retrieving revision 1.188 retrieving revision 1.189 diff -u -d -r1.188 -r1.189 --- its20.html 15 Oct 2012 13:05:33 -0000 1.188 +++ its20.html 16 Oct 2012 13:17:07 -0000 1.189 @@ -2125,84 +2125,45 @@ 'auto' and 'medicine', but 
not 'law', since the extra training resources does not justify the improvement in the output.</p></div></div></div><div class="div2"> <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation" id="Disambiguation" shape="rect"/>6.10 Disambiguation</h3><span class="editor-note">[Ed. note: This data category is not completely stable yet.]</span><div class="div3"> -<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation-definition" id="Disambiguation-definition" shape="rect"/>6.10.1 Definition</h4><p>The <a href="#Disambiguation" shape="rect">Disambiguation</a>ref> data category - is used to indicate occurrences of specific concepts that may require - special handling in the localization process.</p><p>This data category can be used for several purposes, including, but not - limited to:</p><ul><li><p>Informing translation systems that a fragment of text may be - subject to specific rules (e..g., concerning the translation of - proper names) or that it is has an official translation</p></li><li><p>Informing translation systems concerning the specific meaning of - phrases.</p></li><li><p>Informing content management and translation systems about the - type of underlying entity in order to enable processing based on a - specific type of the target, for example, personal names, product or - geographic names, chemical compounds, protein names, and so - forth.</p></li></ul><p>We introduce the following concepts:</p><ul><li><p>Entity Type Source: a domain of valid values, an identifier - collection for entity types. 
Unless specified, it will be derived by - default de-referencing mechanisms for the URI.</p></li><li><p>Entity Type: the type of the entity, being one of values within - the entity type source identifier collection.</p></li><li><p>Disambiguation type: the level of disambiguation (lexical concept, - ontology concept, entity). The disambiguation can happen at multiple - levels. For instance, the level of lexical concepts disambiguates - individual word surface forms, the level of ontology concepts - disambiguates into deeper semantics, and the entity disambiguation - works on the level of concrete instances. For instance, the word"<span class="quote">City</span>" in "<span class="quote">I am going to the City</span>" may - be disambiguated in one of the WordNet synsets that can be - represented by "<span class="quote">city</span>", an RDF ontology concept of a - City that could represent a subclass of a PopulatedPlace, or the - center area of a particular city, e.g. London City.</p></li><li><p>Disambiguation Source: the identifier collection source used for - locating the correct underlying identifier. It can be anything that - can representing a collection of identifiers for words, concepts or - entities, for instance, a knowledge base, an ontology or semantic - network. 
Unless specified, it will be derived by default - de-referencing mechanisms for the URI.</p></li><li><p>Disambiguation Identifier: an identifier, unique within the - current disambiguation identifier collection, specifying the actual - identifier (meaning, concept or entity) behind the selected - content.</p></li></ul><p>Two types of Disambiguation data categories are needed to identify:</p><ul><li><p>Entity type, which describes the type of the underlying entity - within a particular domain of types, as specified by the type source - identifier collection.</p></li><li><p>Disambiguation, which describes the actual underlying identifier - or meaning that the mention refers to, either in a knowledge base, - ontology or in a semantic network.</p></li></ul><p>Text analysis engines, such as named entity recognizers, named entity, - concept and word sense disambiguators can offer an easy way to create - this information. Content management tools can present and visualize - this information or use it to index their content. Machine translations - systems may use it for training and translation when dealing with proper - names and edge cases.</p></div><div class="div3"> +<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a name="Disambiguation-definition" id="Disambiguation-definition" shape="rect"/>6.10.1 Definition</h4><p>The <a href="#Disambiguation" shape="rect">Disambiguation</a> data category is used to + indicate occurrences of specific concepts that may require special + handling in the localization of the document.</p><p>This data category can be used for several purposes, including, but not + limited to:</p><ul><li><p>Informing translation systems that this fragment of text may not be literally translated, but subject to specific proper name translation rules or official translations, as well as a very specific meaning of the phrases.</p></li><li><p>Informing content management and translation systems about the type of the underlying entity in order to enable processing based on a specific type of the target, for example, when handling personal names, product names or geographic names, chemical compounds, protein names and similar.</p></li></ul><p>Disambiguation is achieved by associating a selected fragment of text with an external web resource that can be referenced + by a translation or linguistic review agent in order to access the correct meaning or lexical use of the text and thereby + informing its translation.</p><p>A fragment of text can be disambiguated at different granularities, i.e. as a lexical concept, as an ontology concept, + or as a named entity.</p><p>As a lexical concept, the external reference can provide synonyms and example usage, e.g. using + service such as Wordnet.</p><p>As an ontology concept, the external reference can provide a formal conceptual definition within + a framework of related concepts.</p><p>As a named entity, the external reference can provide a description of the real world entity the text intends + to convey. 
For instance, the word 'City' in 'I am going to the City' may be disambiguated in one of the WordNet + synsets that can be represented by 'city', an ontology concept of a City that could represent a subclass of a + “PopulatedPlace” in the conceptual granularity level, or the central area of a particular city, e.g. City of London, + as interpreted in the entity granularity level. Linked data network, such as DBpedia, increasing interlink ontological + and named entity definitions for the same things as authored in different languages, offering a mechanism to + locate translations from the source language description.</p><p>Two types of disambiguation are needed to identify:</p><ul><li><p>Disambiguation type class, which describes the type class of the underlying concept or entity of the fragment.</p></li><li><p>Disambiguation, which describes the actual underlying external resource that conveys the intended meaning of the fragment.</p></li></ul><p>Text analysis engines, such as named entity recognizers, named entity, concept and word sense disambiguation + components can offer an easy way to create this information. Content management tools can present and visualize + this information or use it to index their content. Machine translations systems may use it for training and translation + when dealing with proper names and edge cases.</p></div><div class="div3"> <h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation-implementation" id="Disambiguation-implementation" shape="rect"/>6.10.2 Implementation</h4><p>The <a href="#Disambiguation" shape="rect">Disambiguation</a> data category can be expressed with global rules, or locally on an individual element. The information applies to the textual content of the element. There is no - inheritance. 
The entity type follows inheritance rules.</p><p id="disambiguation-global">GLOBAL: The <code>disambiguationRule</code> + inheritance. The entity type follows inheritance rules.</p><span class="editor-note">[Ed. note: The two sentences above seem contradictory.]</span><p id="disambiguation-global">GLOBAL: The <code>disambiguationRule</code> element contains the following:</p><ul><li><p>A required <code>selector</code> attribute. It contains an <a href="#selectors" shape="rect">absolute selector</a> which selects the - nodes to which this rule applies.</p></li><li><p>An optional <code>entityTypeSourceRef</code> attribute that contains - a URI specifying the concrete identifier data source (knowledge - base, semantic network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeSourcePointer</code> attribute that - contains a relative XPath expression pointing to a node that - represents the identifier data source (knowledge base, semantic - network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeSourceRefPointer</code> attribute that - contains a relative XPath expression pointing to a node that holds - the URI that represents the identifier data source (knowledge base, - semantic network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeRef</code> attribute that contains a - URI, specifying the entity type behind the selector.</p></li><li><p>An optional <code>entityTypePointer</code> attribute that contains a - relative XPath expression pointing to a node specifying the entity - type behind the selector.</p></li><li><p>An optional <code>entityTypeRefPointer</code> attribute that - contains a relative XPath expression pointing to a node that holds - the URI that specifies the entity type behind the selector.</p></li><li><p>An optional <code>disambigType</code> attribute that contains a - string, specifying the specific semantics of the disambiguation. 
It - can be one of "lexicalConcept", "ontologyConcept", or - "entity".</p></li><li><p>An optional <code>disambigSourceRef</code> attribute. It contains a - URI representing the disambiguation identifier collection - source.</p></li><li><p>An optional <code>disambigSourcePointer</code> attribute. It - contains a relative XPath expression pointing to a node that - represents the disambiguation identifier collection source.</p></li><li><p>An optional <code>disambigSourceRefPointer</code> attribute. It - contains a relative XPath expression pointing to a node that holds - the URI that represents the disambiguation identifier collection - source.</p></li><li><p>An optional <code>disambigIdentRef</code> attribute. It contains a - URI that represents a unique identifier within the identifier - collection.</p></li><li><p>An optional <code>disambigIdentPointer</code> attribute. It contains - a relative XPath expression pointing to a node that represents a - unique identifier within the identifier collection.</p></li><li><p>An optional <code>disambigIdentRefPointer</code> attribute. 
It - contains a relative XPath expression pointing to a node that - represents a unique identifier within the identifier - collection.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-global-1" id="EX-disambiguation-global-1" shape="rect"/>Example 52: Usage of <code>entityTypeSourceRef</code>, <code>enttiyTypeRef</code>, - <code>disambigSourceRef</code>, <code>disambigIdentRef</code> for - both entity and word sense disambiguation.</div><div class="exampleInner"><pre xml:space="preserve"> + nodes to which this rule applies.</p></li><li><p>None of exactly one of the following: + <ul><li><p>A <code>disambigClassRef</code> attribute that contains a URI, specifying the type class of the concept + or entity behind the selector.</p></li><li><p>A <code>disambigClassPointer</code> attribute that contains a <a href="#selectors" shape="rect">relative selector</a> + pointing to a node specifying the entity type class behind the selector.</p></li><li><p>A <code>disambigClassRefPointer</code> attribute that contains a <a href="#selectors" shape="rect">relative selector</a> + pointing to a node that holds a URI that specifies the entity type class behind the selector.</p></li></ul></p></li><li><p>An optional <code>disambigGranularity</code> attribute that contains a string, specifying the granularity + level of the disambiguation. The value can be one of the following identifiers: <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</p></li><li><p>An optional <code>disambigSource</code> attribute. It contains a string representing the disambiguation + identifier collection source.</p></li><li><p>None of exactly one of the following: + <ul><li><p>A <code>disambigIdent</code> attribute. It contains a string that represents the disambiguation + identifier for the disambiguation target that is valid within the specified Disambiguation Source.</p></li><li><p>A <code>disambigIdentRef</code> attribute. 
It contains a URI that represents a unique identifier + for the disambiguation target.</p></li><li><p>A <code>disambigIdentPointer</code> attribute. It contains a <a href="#selectors" shape="rect">relative selector</a> + pointing to a node that represents a unique identifier for the disambiguation target.</p></li><li><p>A <code>disambigIdentRefPointer</code> attribute. It contains a <a href="#selectors" shape="rect">relative selector</a> + pointing to a node that holds a URI that represents a unique identifier for the disambiguation target.</p></li></ul></p></li></ul><p>When using a disambiguation rule, the user <a href="#rfc2119" shape="rect">MUST</a> use one of the use cases for disambiguation: + specifying the target type, or specifying the target identity. + For the latter, the user <a href="#rfc2119" shape="rect">MUST</a> use only one of the two addressing modes:</p><ul><li><p>Using <code>disambigSource</code> and <code>disambigIdent</code> to specify the collection and the identifier itself.</p></li><li><p>Using one of <code>disambigIdentRef</code>, <code>disambigIdentPointer</code> or <code>disambigIdentRefPointer</code> to give + a URI for the disambiguation target.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-global-1" id="EX-disambiguation-global-1" shape="rect"/>Example 52: Usage of <code>entityTypeSourceRef</code>, <code>entityTypeRef</code>, <code>disambigSourceRef</code>, + <code>disambigIdentRef</code> for both entity and word sense disambiguation.</div><div class="exampleInner"><pre xml:space="preserve"> <text xmlns:its="http://www.w3.org/2005/11/its" > <its:rules version="2.0"> @@ -2225,19 +2186,14 @@ </body> </text></pre></div><p>[Source file: <a href="examples/xml/EX-disambiguation-global-1.xml" shape="rect">examples/xml/EX-disambiguation-global-1.xml</a>]</p></div><p id="disambiguation-local">LOCAL: The following local markup is available for the <a href="#Disambiguation" 
shape="rect">Disambiguation</a> - data category:</p><ul><li><p>An optional <code>entityTypeSourceRef</code> attribute that contains - an URI specifying the concrete identifier data source (knowledge - base, semantic network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeRef</code> attribute that contains a URI - specifying the entity type behind the selector.</p></li><li><p>An optional <code>disambigType</code> attribute that contains a - string, specifying the specific semantics of the disambiguation. It - can be one of "lexicalConcept", "ontologyConcept", or - "entity".</p></li><li><p>An optional <code>disambigSourceRef</code> attribute. It contains a - URI representing the disambiguation identifier collection - source.</p></li><li><p>An optional <code>disambigIdentRef</code> attribute. It contains a - URI that represents a unique identifier within the identifier - collection.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-local-1" id="EX-disambiguation-html5-local-1" shape="rect"/>Example 53: Local mixed usage of <code>entityTypeSourceRef</code>, - <code>enttiyTypeRef</code>, <code>disambigSourceRef</code>, - <code>disambigIdentRef</code> in HTML.</div><div class="exampleInner"><pre xml:space="preserve"><!DOCTYPE html>
 + data category:</p><ul><li><p>An optional <code>disambigClassRef</code> attribute that contains a URI, specifying the type class + of the concept or entity behind the selector.</p></li><li><p>An optional <code>disambigGranularity</code> attribute that contains a string, specifying the + granularity level of the disambiguation. The value can be one of the following identifiers: <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</p></li><li><p>An optional <code>disambigSource</code> attribute. It contains a string representing the + disambiguation identifier collection source.</p></li><li><p>An optional <code>disambigIdent</code> attribute. It contains a string representing the + disambiguation identifier for the disambiguation target that is valid within the specified Disambiguation Source.</p></li><li><p>An optional <code>disambigIdentRef</code> attribute. It contains a URI that represents a unique + identifier for the disambiguation target.</p></li></ul><p>The user <a href="#rfc2119" shape="rect">MUST</a> use only one of the two addressing modes for disambiguation:</p><ul><li><p>Using <code>disambigSource</code> and <code>disambigIdent</code> to specify the collection + and the identifier itself.</p></li><li><p>Using <code>disambigIdentRef</code> to give a URI for the disambiguation target.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-local-1" id="EX-disambiguation-html5-local-1" shape="rect"/>Example 53: Local mixed usage of <code>entityTypeSourceRef</code>, <code>entityTypeRef</code>, + <code>disambigSourceRef</code>, <code>disambigIdentRef</code> in HTML.</div><div class="exampleInner"><pre xml:space="preserve"><!DOCTYPE html>
 <html lang=en> <head> <meta charset=utf-8> @@ -2255,20 +2211,13 @@ its-disambig-source-ref=http://www.w3.org/2006/03/wn/wn20/rdf/wordnet-synset.rdf its-disambig-type=lexicalConcept>capital</span> of Ireland.</p> </body> - </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-local-1.html" shape="rect">examples/html5/EX-disambiguation-html5-local-1.html</a>]</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p>While the <code>entityTypeSourceRef</code> attribute allows for an - arbitrary domain of entity types, the implementors are encouraged to - use an existing repository of entity types as long as they satisfy - their requirements. For example, the Named Entity Recognition and - Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p><p>The distinction between disambiguating word sense and entities is - mainly in the different semantics: whereas word sense disambiguation - targets literal words and their senses on the lexical level, entity - disambiguation targets real-world concepts that are behind the - selected phrases on the conceptual level.</p><p>When serializing the ITS markup in HTML5, the preferred way is to - serialize in RDFa Lite or Microdata due to the existing search and - crawling infrastructure that is able to consume this kind of - data.</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa" id="EX-disambiguation-html5-rdfa" shape="rect"/>Example 54: Local mixed usage of <code>entityTypeSourceRef</code>, - <code>entityTypeRef</code>, <code>disambigSourceRef</code>, - <code>disambigIdentRef</code> in HTML+RDFa Lite</div><p>See <a href="#EX-disambiguation-html5-rdfa-companion-document" shape="rect">Example 55</a> for the companion document with the mapping + </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-local-1.html" shape="rect">examples/html5/EX-disambiguation-html5-local-1.html</a>]</p></div><div 
class="note"><p class="prefix"><b>Note:</b></p><p>For <code>disambigClassRef</code> values, implementors are encouraged to use an existing + repository of entity types as long as it satisfies their requirements. For example, + the Named Entity Recognition and Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p><p>Furthermore, valid target types depend on the disambiguation granularity: types of entities are distinct + from types of lexical concepts or ontology concepts. While this distinction exists, the specification does not prescribe + a way of automatically inferring a disambiguation level from a target type.</p><p>When serializing the ITS markup in HTML5, the preferred way is to serialize in RDFa Lite or Microdata due + to the existing search and crawling infrastructure that is able to consume this kind of data.</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa" id="EX-disambiguation-html5-rdfa" shape="rect"/>Example 54: Local mixed usage of <code>entityTypeSourceRef</code>, <code>entityTypeRef</code>, <code>disambigSourceRef</code>, + <code>disambigIdentRef</code> in HTML+RDFa Lite.</div><p>See <a href="#EX-disambiguation-html5-rdfa-companion-document" shape="rect">Example 55</a> for the companion document with the mapping data.</p><div class="exampleInner"><pre xml:space="preserve"><!DOCTYPE html>
 <html lang=en> <head> @@ -2279,9 +2228,7 @@ <p> <span property=name resource=http://dbpedia.org/resource/Dublin typeof=http://nerd.eurecom.fr/ontology#Place>Dublin</span> is the capital of Ireland.</p> </body> - </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-rdfa.html" shape="rect">examples/html5/EX-disambiguation-html5-rdfa.html</a>]</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa-companion-document" id="EX-disambiguation-html5-rdfa-companion-document" shape="rect"/>Example 55: Local mixed usage of <code>entityTypeSourceRef</code>, - <code>entityTypeRef</code>, <code>disambigSourceRef</code>, - <code>disambigIdentRef</code> in HTML+RDFa Lite</div><p>Companion document, having the mapping data for <a href="#EX-disambiguation-html5-rdfa" shape="rect">Example 54</a>.</p><div class="exampleInner"><pre xml:space="preserve"> + </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-rdfa.html" shape="rect">examples/html5/EX-disambiguation-html5-rdfa.html</a>]</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa-companion-document" id="EX-disambiguation-html5-rdfa-companion-document" shape="rect"/>Example 55: Companion document, having the mapping data for <a href="#EX-disambiguation-html5-rdfa" shape="rect">Example 54</a>.</div><div class="exampleInner"><pre xml:space="preserve"> <its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0"> <its:disambiguationRule selector="//*[@typeof]" entityTypeRefPointer="@typeof"/> @@ -3765,7 +3712,7 @@ <em>This section is informative.</em> </p><p>Several constraints of ITS markup cannot be validated with ITS schemas. The following <a title="Rule-based validation
							-- Schematron" href="#schematron" shape="rect">[Schematron]</a> document allows for - validating some of these constraints.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9185" id="d3e9185" shape="rect"/>Example 97: Testing constraints in ITS markup</div><div class="exampleInner"><pre xml:space="preserve"> + validating some of these constraints.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9219" id="d3e9219" shape="rect"/>Example 97: Testing constraints in ITS markup</div><div class="exampleInner"><pre xml:space="preserve"> <sch:schema xmlns:sch="http://www.ascc.net/xml/schematron" > <!-- Schematron document to test constraints for global and local ITS markup. @@ -3833,7 +3780,7 @@ </p><p>The following <a title="Namespace-based Validation
							Dispatching Language (NVDL)" href="#nvdl" shape="rect">[NVDL]</a> document allows validation of ITS markup which has been added to a host vocabulary. Only ITS elements and attributes are checked. Elements and attributes of host language are ignored - during validation against this NVDL document/schema.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9207" id="d3e9207" shape="rect"/>Example 98: NVDL schema for ITS</div><div class="exampleInner"><pre xml:space="preserve"> + during validation against this NVDL document/schema.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9241" id="d3e9241" shape="rect"/>Example 98: NVDL schema for ITS</div><div class="exampleInner"><pre xml:space="preserve"> <nvdl:rules xmlns:nvdl="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" > <nvdl:namespace ns="http://www.w3.org/2005/11/its">
Received on Tuesday, 16 October 2012 13:17:15 UTC
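
Illustration of the local addressing-mode constraint added in this revision: the new text says a user MUST use only one of the two modes, either disambigSource plus disambigIdent, or disambigIdentRef. The Python sketch below is one possible reading of that rule and is not part of the commit. It assumes the local attributes appear on XML elements in the ITS namespace (its:disambigSource and so on); the function name check_local_disambiguation, the sample document, and the values "ExampleLexicon" and "capital-1" are hypothetical.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
# Granularity identifiers listed in the revised text.
GRANULARITIES = {"lexicalConcept", "ontologyConcept", "entity"}

# Hypothetical sample: the first span uses one addressing mode, the second mixes both.
SAMPLE = """<text xmlns:its="http://www.w3.org/2005/11/its">
  <span its:disambigIdentRef="http://dbpedia.org/resource/Dublin"
        its:disambigGranularity="entity">Dublin</span>
  <span its:disambigSource="ExampleLexicon" its:disambigIdent="capital-1"
        its:disambigIdentRef="http://example.org/capital">capital</span>
</text>"""


def check_local_disambiguation(elem):
    """Return a list of problems found in the local disambiguation attributes of elem."""
    def get(name):
        # ElementTree stores namespaced attributes as "{namespace}localname".
        return elem.get(f"{{{ITS_NS}}}{name}")

    problems = []
    source = get("disambigSource")
    ident = get("disambigIdent")
    ident_ref = get("disambigIdentRef")

    # Assumed reading: mode 1 is source/ident, mode 2 is identRef; they must not be mixed.
    if ident_ref is not None and (source is not None or ident is not None):
        problems.append("both addressing modes are used")

    granularity = get("disambigGranularity")
    if granularity is not None and granularity not in GRANULARITIES:
        problems.append(f"unknown disambigGranularity: {granularity}")
    return problems


if __name__ == "__main__":
    for span in ET.fromstring(SAMPLE):
        print(span.text, check_local_disambiguation(span))
```

The check reports problems as strings rather than raising, so a caller can collect every issue in a document in one pass. The global rule attributes (the Pointer and RefPointer variants) are not covered by this sketch.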