- From: Yves Savourel via cvs-syncmail <cvsmail@w3.org>
- Date: Tue, 16 Oct 2012 13:17:09 +0000
- To: public-multilingualweb-lt-commits@w3.org
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20
In directory hutz:/tmp/cvs-serv7132
Modified Files:
its20.html its20.odd
Log Message:
Implemented Disambiguation updates
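For orientation, the attribute set introduced by this revision can be sketched as a pair of global rules. This is a minimal illustration, not one of the specification's own example files: the selectors (`//city`, `//term`), the `disambigSource` and `disambigIdent` values, and the choice of NERD and DBpedia URIs are assumptions made up for the sketch. Each rule uses exactly one of the two addressing modes described in the diff that follows.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <!-- Mode 1: URI addressing via disambigIdentRef (illustrative URIs) -->
  <its:disambiguationRule selector="//city"
      disambigGranularity="entity"
      disambigClassRef="http://nerd.eurecom.fr/ontology#Location"
      disambigIdentRef="http://dbpedia.org/resource/City_of_London"/>
  <!-- Mode 2: collection plus identifier; both values are hypothetical -->
  <its:disambiguationRule selector="//term"
      disambigGranularity="lexicalConcept"
      disambigSource="example-lexicon"
      disambigIdent="example-sense-id"/>
</its:rules>
```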
Index: its20.odd
===================================================================
RCS file: /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd,v
retrieving revision 1.185
retrieving revision 1.186
diff -u -d -r1.185 -r1.186
--- its20.odd 15 Oct 2012 13:05:33 -0000 1.185
+++ its20.odd 16 Oct 2012 13:17:07 -0000 1.186
@@ -3146,67 +3146,41 @@
<note type="ed">This data category is not completely stable yet.</note>
<div xml:id="Disambiguation-definition">
<head>Definition</head>
- <p>The <ref target="#Disambiguation">Disambiguation</ref>ref> data category
- is used to indicate occurrences of specific concepts that may require
- special handling in the localization process.</p>
+ <p>The <ref target="#Disambiguation">Disambiguation</ref> data category is used to
+ indicate occurrences of specific concepts that may require special
+ handling in the localization of the document.</p>
<p>This data category can be used for several purposes, including, but not
limited to:</p>
<list type="unordered">
- <item>Informing translation systems that a fragment of text may be
- subject to specific rules (e..g., concerning the translation of
- proper names) or that it is has an official translation</item>
- <item>Informing translation systems concerning the specific meaning of
- phrases.</item>
- <item>Informing content management and translation systems about the
- type of underlying entity in order to enable processing based on a
- specific type of the target, for example, personal names, product or
- geographic names, chemical compounds, protein names, and so
- forth.</item>
- </list>
- <p>We introduce the following concepts:</p>
- <list type="unordered">
- <item>Entity Type Source: a domain of valid values, an identifier
- collection for entity types. Unless specified, it will be derived by
- default de-referencing mechanisms for the URI.</item>
- <item>Entity Type: the type of the entity, being one of values within
- the entity type source identifier collection.</item>
- <item>Disambiguation type: the level of disambiguation (lexical concept,
- ontology concept, entity). The disambiguation can happen at multiple
- levels. For instance, the level of lexical concepts disambiguates
- individual word surface forms, the level of ontology concepts
- disambiguates into deeper semantics, and the entity disambiguation
- works on the level of concrete instances. For instance, the word
- <quote>City</quote> in <quote>I am going to the City</quote> may
- be disambiguated in one of the WordNet synsets that can be
- represented by <quote>city</quote>, an RDF ontology concept of a
- City that could represent a subclass of a PopulatedPlace, or the
- center area of a particular city, e.g. London City.</item>
- <item>Disambiguation Source: the identifier collection source used for
- locating the correct underlying identifier. It can be anything that
- can representing a collection of identifiers for words, concepts or
- entities, for instance, a knowledge base, an ontology or semantic
- network. Unless specified, it will be derived by default
- de-referencing mechanisms for the URI.</item>
- <item>Disambiguation Identifier: an identifier, unique within the
- current disambiguation identifier collection, specifying the actual
- identifier (meaning, concept or entity) behind the selected
- content.</item>
+ <item>Informing translation systems that a fragment of text should not be translated literally, but is subject to specific proper name translation rules or official translations, or carries a very specific meaning.</item>
+ <item>Informing content management and translation systems about the type of the underlying entity in order to enable processing based on a specific type of the target, for example, when handling personal names, product names or geographic names, chemical compounds, protein names and similar.</item>
</list>
- <p>Two types of Disambiguation data categories are needed to identify:</p>
+ <p>Disambiguation is achieved by associating a selected fragment of text with an external web resource that can be referenced
+ by a translation or linguistic review agent in order to access the correct meaning or lexical use of the text and thereby
+ inform its translation.</p>
+ <p>A fragment of text can be disambiguated at different granularities, i.e. as a lexical concept, as an ontology concept,
+ or as a named entity.</p>
+ <p>As a lexical concept, the external reference can provide synonyms and example usage, e.g. using
+ a service such as WordNet.</p>
+ <p>As an ontology concept, the external reference can provide a formal conceptual definition within
+ a framework of related concepts.</p>
+ <p>As a named entity, the external reference can provide a description of the real world entity the text intends
+ to convey. For instance, the word 'City' in 'I am going to the City' may be disambiguated in one of the WordNet
+ synsets that can be represented by 'city', an ontology concept of a City that could represent a subclass of a
+ “PopulatedPlace” in the conceptual granularity level, or the central area of a particular city, e.g. City of London,
+ as interpreted in the entity granularity level. Linked data networks, such as DBpedia, increasingly interlink ontological
+ and named entity definitions for the same things as authored in different languages, offering a mechanism to
+ locate translations from the source language description.</p>
+
+ <p>Two types of disambiguation are needed to identify:</p>
<list type="unordered">
- <item>Entity type, which describes the type of the underlying entity
- within a particular domain of types, as specified by the type source
- identifier collection.</item>
- <item>Disambiguation, which describes the actual underlying identifier
- or meaning that the mention refers to, either in a knowledge base,
- ontology or in a semantic network.</item>
+ <item>Disambiguation type class, which describes the type class of the underlying concept or entity of the fragment.</item>
+ <item>Disambiguation, which describes the actual underlying external resource that conveys the intended meaning of the fragment.</item>
</list>
- <p>Text analysis engines, such as named entity recognizers, named entity,
- concept and word sense disambiguators can offer an easy way to create
- this information. Content management tools can present and visualize
- this information or use it to index their content. Machine translations
- systems may use it for training and translation when dealing with proper
- names and edge cases.</p>
+ <p>Text analysis engines, such as named entity recognizers, named entity, concept and word sense disambiguation
+ components can offer an easy way to create this information. Content management tools can present and visualize
+ this information or use it to index their content. Machine translation systems may use it for training and translation
+ when dealing with proper names and edge cases.</p>
</div>
<div xml:id="Disambiguation-implementation">
<head>Implementation</head>
@@ -3214,60 +3188,52 @@
be expressed with global rules, or locally on an individual element. The
information applies to the textual content of the element. There is no
inheritance. The entity type follows inheritance rules.</p>
+ <note type="ed">The two sentences above seem contradictory.</note>
<p xml:id="disambiguation-global">GLOBAL: The <gi>disambiguationRule</gi>
element contains the following:</p>
<list type="unordered">
<item>A required <att>selector</att> attribute. It contains an <ref
target="#selectors">absolute selector</ref> which selects the
nodes to which this rule applies.</item>
- <item>An optional <att>entityTypeSourceRef</att> attribute that contains
- a URI specifying the concrete identifier data source (knowledge
- base, semantic network), used to determine the entity type.</item>
- <item>An optional <att>entityTypeSourcePointer</att> attribute that
- contains a relative XPath expression pointing to a node that
- represents the identifier data source (knowledge base, semantic
- network), used to determine the entity type.</item>
- <item>An optional <att>entityTypeSourceRefPointer</att> attribute that
- contains a relative XPath expression pointing to a node that holds
- the URI that represents the identifier data source (knowledge base,
- semantic network), used to determine the entity type.</item>
- <item>An optional <att>entityTypeRef</att> attribute that contains a
- URI, specifying the entity type behind the selector.</item>
- <item>An optional <att>entityTypePointer</att> attribute that contains a
- relative XPath expression pointing to a node specifying the entity
- type behind the selector.</item>
- <item>An optional <att>entityTypeRefPointer</att> attribute that
- contains a relative XPath expression pointing to a node that holds
- the URI that specifies the entity type behind the selector.</item>
- <item>An optional <att>disambigType</att> attribute that contains a
- string, specifying the specific semantics of the disambiguation. It
- can be one of "lexicalConcept", "ontologyConcept", or
- "entity".</item>
- <item>An optional <att>disambigSourceRef</att> attribute. It contains a
- URI representing the disambiguation identifier collection
- source.</item>
- <item>An optional <att>disambigSourcePointer</att> attribute. It
- contains a relative XPath expression pointing to a node that
- represents the disambiguation identifier collection source.</item>
- <item>An optional <att>disambigSourceRefPointer</att> attribute. It
- contains a relative XPath expression pointing to a node that holds
- the URI that represents the disambiguation identifier collection
- source.</item>
- <item>An optional <att>disambigIdentRef</att> attribute. It contains a
- URI that represents a unique identifier within the identifier
- collection.</item>
- <item>An optional <att>disambigIdentPointer</att> attribute. It contains
- a relative XPath expression pointing to a node that represents a
- unique identifier within the identifier collection.</item>
- <item>An optional <att>disambigIdentRefPointer</att> attribute. It
- contains a relative XPath expression pointing to a node that
- represents a unique identifier within the identifier
- collection.</item>
+ <item>None or exactly one of the following:
+ <list>
+ <item>A <att>disambigClassRef</att> attribute that contains a URI, specifying the type class of the concept
+ or entity behind the selector.</item>
+ <item>A <att>disambigClassPointer</att> attribute that contains a <ref target="#selectors">relative selector</ref>
+ pointing to a node specifying the entity type class behind the selector.</item>
+ <item>A <att>disambigClassRefPointer</att> attribute that contains a <ref target="#selectors">relative selector</ref>
+ pointing to a node that holds a URI that specifies the entity type class behind the selector.</item>
+ </list>
+ </item>
+ <item>An optional <att>disambigGranularity</att> attribute that contains a string, specifying the granularity
+ level of the disambiguation. The value can be one of the following identifiers:
+ <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</item>
+ <item>An optional <att>disambigSource</att> attribute. It contains a string representing the disambiguation
+ identifier collection source.</item>
+ <item>None or exactly one of the following:
+ <list>
+ <item>A <att>disambigIdent</att> attribute. It contains a string that represents the disambiguation
+ identifier for the disambiguation target that is valid within the specified Disambiguation Source.</item>
+ <item>A <att>disambigIdentRef</att> attribute. It contains a URI that represents a unique identifier
+ for the disambiguation target.</item>
+ <item>A <att>disambigIdentPointer</att> attribute. It contains a <ref target="#selectors">relative selector</ref>
+ pointing to a node that represents a unique identifier for the disambiguation target.</item>
+ <item>A <att>disambigIdentRefPointer</att> attribute. It contains a <ref target="#selectors">relative selector</ref>
+ pointing to a node that holds a URI that represents a unique identifier for the disambiguation target.</item>
+ </list>
+ </item>
+ </list>
+ <p>When using a disambiguation rule, the user <ref target="#rfc2119">MUST</ref> use one of the use cases for disambiguation:
+ specifying the target type, or specifying the target identity.
+ For the latter, the user <ref target="#rfc2119">MUST</ref> use only one of the two addressing modes:</p>
+ <list>
+ <item>Using <att>disambigSource</att> and <att>disambigIdent</att> to specify the collection and the identifier itself.</item>
+ <item>Using one of <att>disambigIdentRef</att>, <att>disambigIdentPointer</att> or <att>disambigIdentRefPointer</att> to specify
+ a URI for the disambiguation target.</item>
</list>
<exemplum xml:id="EX-disambiguation-global-1">
- <head>Usage of <att>entityTypeSourceRef</att>, <att>enttiyTypeRef</att>,
- <att>disambigSourceRef</att>, <att>disambigIdentRef</att> for
- both entity and word sense disambiguation.</head>
+ <head>Usage of <att>entityTypeSourceRef</att>, <att>entityTypeRef</att>, <att>disambigSourceRef</att>,
+ <att>disambigIdentRef</att> for both entity and word sense disambiguation.</head>
<egXML xmlns="http://www.tei-c.org/ns/Examples"
target="examples/xml/EX-disambiguation-global-1.xml"/>
</exemplum>
@@ -3276,49 +3242,43 @@
available for the <ref target="#Disambiguation">Disambiguation</ref>
data category:</p>
<list type="unordered">
- <item>An optional <att>entityTypeSourceRef</att> attribute that contains
- an URI specifying the concrete identifier data source (knowledge
- base, semantic network), used to determine the entity type.</item>
- <item>An optional <att>entityTypeRef</att> attribute that contains a URI
- specifying the entity type behind the selector.</item>
- <item>An optional <att>disambigType</att> attribute that contains a
- string, specifying the specific semantics of the disambiguation. It
- can be one of "lexicalConcept", "ontologyConcept", or
- "entity".</item>
- <item>An optional <att>disambigSourceRef</att> attribute. It contains a
- URI representing the disambiguation identifier collection
- source.</item>
- <item>An optional <att>disambigIdentRef</att> attribute. It contains a
- URI that represents a unique identifier within the identifier
- collection.</item>
+ <item>An optional <att>disambigClassRef</att> attribute that contains a URI, specifying the type class
+ of the concept or entity behind the selector.</item>
+ <item>An optional <att>disambigGranularity</att> attribute that contains a string, specifying the
+ granularity level of the disambiguation. The value can be one of the following identifiers:
+ <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</item>
+ <item>An optional <att>disambigSource</att> attribute. It contains a string representing the
+ disambiguation identifier collection source.</item>
+ <item>An optional <att>disambigIdent</att> attribute. It contains a string, representing the
+ disambiguation identifier for the disambiguation target that is valid within the specified Disambiguation Source.</item>
+ <item>An optional <att>disambigIdentRef</att> attribute. It contains a URI that represents a unique
+ identifier for the disambiguation target.</item>
+ </list>
+ <p>The user <ref target="#rfc2119">MUST</ref> use only one of the two addressing modes for disambiguation:</p>
+ <list>
+ <item>Using <att>disambigSource</att> and <att>disambigIdent</att> to specify the collection
+ and the identifier itself.</item>
+ <item>Using <att>disambigIdentRef</att> to specify a URI for the disambiguation target.</item>
</list>
<exemplum xml:id="EX-disambiguation-html5-local-1">
- <head>Local mixed usage of <att>entityTypeSourceRef</att>,
- <att>enttiyTypeRef</att>, <att>disambigSourceRef</att>,
- <att>disambigIdentRef</att> in HTML.</head>
+ <head>Local mixed usage of <att>entityTypeSourceRef</att>, <att>entityTypeRef</att>,
+ <att>disambigSourceRef</att>, <att>disambigIdentRef</att> in HTML.</head>
<egXML xmlns="http://www.tei-c.org/ns/Examples" type="html5"
target="examples/html5/EX-disambiguation-html5-local-1.html"/>
</exemplum>
<note>
- <p>While the <att>entityTypeSourceRef</att> attribute allows for an
- arbitrary domain of entity types, the implementors are encouraged to
- use an existing repository of entity types as long as they satisfy
- their requirements. For example, the Named Entity Recognition and
- Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p>
- <p>The distinction between disambiguating word sense and entities is
- mainly in the different semantics: whereas word sense disambiguation
- targets literal words and their senses on the lexical level, entity
- disambiguation targets real-world concepts that are behind the
- selected phrases on the conceptual level.</p>
- <p>When serializing the ITS markup in HTML5, the preferred way is to
- serialize in RDFa Lite or Microdata due to the existing search and
- crawling infrastructure that is able to consume this kind of
- data.</p>
+ <p>For referring to <att>disambigClassRef</att> values, implementors are encouraged to use an existing
+ repository of entity types as long as they satisfy their requirements. For example,
+ the Named Entity Recognition and Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p>
+ <p>Furthermore, valid target types depend on the disambiguation granularity: types of entities are distinct
+ from types of lexical concepts or ontology concepts. While this distinction exists, the specification does not prescribe
+ a way of automatically inferring a disambiguation level from a target type.</p>
+ <p>When serializing the ITS mark-up in HTML5, the preferred way is to serialize in RDFa Lite or Microdata due
+ to the existing search and crawling infrastructure that is able to consume this kind of data.</p>
</note>
<exemplum xml:id="EX-disambiguation-html5-rdfa">
- <head>Local mixed usage of <att>entityTypeSourceRef</att>,
- <att>entityTypeRef</att>, <att>disambigSourceRef</att>,
- <att>disambigIdentRef</att> in HTML+RDFa Lite</head>
+ <head>Local mixed usage of <att>entityTypeSourceRef</att>, <att>entityTypeRef</att>, <att>disambigSourceRef</att>,
+ <att>disambigIdentRef</att> in HTML+RDFa Lite.</head>
<p>See <ptr target="#EX-disambiguation-html5-rdfa-companion-document"
type="exref"/> for the companion document with the mapping
data.</p>
@@ -3326,11 +3286,8 @@
target="examples/html5/EX-disambiguation-html5-rdfa.html"/>
</exemplum>
<exemplum xml:id="EX-disambiguation-html5-rdfa-companion-document">
- <head>Local mixed usage of <att>entityTypeSourceRef</att>,
- <att>entityTypeRef</att>, <att>disambigSourceRef</att>,
- <att>disambigIdentRef</att> in HTML+RDFa Lite</head>
- <p>Companion document, having the mapping data for <ptr
- target="#EX-disambiguation-html5-rdfa" type="exref"/>.</p>
+ <head>Companion document, having the mapping data for <ptr
+ target="#EX-disambiguation-html5-rdfa" type="exref"/>.</head>
<egXML xmlns="http://www.tei-c.org/ns/Examples"
target="examples/html5/EX-disambiguation-html5-rdfa.xml"/>
</exemplum>
Index: its20.html
===================================================================
RCS file: /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.html,v
retrieving revision 1.188
retrieving revision 1.189
diff -u -d -r1.188 -r1.189
--- its20.html 15 Oct 2012 13:05:33 -0000 1.188
+++ its20.html 16 Oct 2012 13:17:07 -0000 1.189
@@ -2125,84 +2125,45 @@
'auto' and 'medicine', but not 'law', since the extra training
resources does not justify the improvement in the output.</p></div></div></div><div class="div2">
<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation" id="Disambiguation" shape="rect"/>6.10 Disambiguation</h3><span class="editor-note">[Ed. note: This data category is not completely stable yet.]</span><div class="div3">
-<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation-definition" id="Disambiguation-definition" shape="rect"/>6.10.1 Definition</h4><p>The <a href="#Disambiguation" shape="rect">Disambiguation</a>ref> data category
- is used to indicate occurrences of specific concepts that may require
- special handling in the localization process.</p><p>This data category can be used for several purposes, including, but not
- limited to:</p><ul><li><p>Informing translation systems that a fragment of text may be
- subject to specific rules (e..g., concerning the translation of
- proper names) or that it is has an official translation</p></li><li><p>Informing translation systems concerning the specific meaning of
- phrases.</p></li><li><p>Informing content management and translation systems about the
- type of underlying entity in order to enable processing based on a
- specific type of the target, for example, personal names, product or
- geographic names, chemical compounds, protein names, and so
- forth.</p></li></ul><p>We introduce the following concepts:</p><ul><li><p>Entity Type Source: a domain of valid values, an identifier
- collection for entity types. Unless specified, it will be derived by
- default de-referencing mechanisms for the URI.</p></li><li><p>Entity Type: the type of the entity, being one of values within
- the entity type source identifier collection.</p></li><li><p>Disambiguation type: the level of disambiguation (lexical concept,
- ontology concept, entity). The disambiguation can happen at multiple
- levels. For instance, the level of lexical concepts disambiguates
- individual word surface forms, the level of ontology concepts
- disambiguates into deeper semantics, and the entity disambiguation
- works on the level of concrete instances. For instance, the word"<span class="quote">City</span>" in "<span class="quote">I am going to the City</span>" may
- be disambiguated in one of the WordNet synsets that can be
- represented by "<span class="quote">city</span>", an RDF ontology concept of a
- City that could represent a subclass of a PopulatedPlace, or the
- center area of a particular city, e.g. London City.</p></li><li><p>Disambiguation Source: the identifier collection source used for
- locating the correct underlying identifier. It can be anything that
- can representing a collection of identifiers for words, concepts or
- entities, for instance, a knowledge base, an ontology or semantic
- network. Unless specified, it will be derived by default
- de-referencing mechanisms for the URI.</p></li><li><p>Disambiguation Identifier: an identifier, unique within the
- current disambiguation identifier collection, specifying the actual
- identifier (meaning, concept or entity) behind the selected
- content.</p></li></ul><p>Two types of Disambiguation data categories are needed to identify:</p><ul><li><p>Entity type, which describes the type of the underlying entity
- within a particular domain of types, as specified by the type source
- identifier collection.</p></li><li><p>Disambiguation, which describes the actual underlying identifier
- or meaning that the mention refers to, either in a knowledge base,
- ontology or in a semantic network.</p></li></ul><p>Text analysis engines, such as named entity recognizers, named entity,
- concept and word sense disambiguators can offer an easy way to create
- this information. Content management tools can present and visualize
- this information or use it to index their content. Machine translations
- systems may use it for training and translation when dealing with proper
- names and edge cases.</p></div><div class="div3">
+<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation-definition" id="Disambiguation-definition" shape="rect"/>6.10.1 Definition</h4><p>The <a href="#Disambiguation" shape="rect">Disambiguation</a> data category is used to
+ indicate occurrences of specific concepts that may require special
+ handling in the localization of the document.</p><p>This data category can be used for several purposes, including, but not
+ limited to:</p><ul><li><p>Informing translation systems that a fragment of text should not be translated literally, but is subject to specific proper name translation rules or official translations, or carries a very specific meaning.</p></li><li><p>Informing content management and translation systems about the type of the underlying entity in order to enable processing based on a specific type of the target, for example, when handling personal names, product names or geographic names, chemical compounds, protein names and similar.</p></li></ul><p>Disambiguation is achieved by associating a selected fragment of text with an external web resource that can be referenced
+ by a translation or linguistic review agent in order to access the correct meaning or lexical use of the text and thereby
+ inform its translation.</p><p>A fragment of text can be disambiguated at different granularities, i.e. as a lexical concept, as an ontology concept,
+ or as a named entity.</p><p>As a lexical concept, the external reference can provide synonyms and example usage, e.g. using
+ a service such as WordNet.</p><p>As an ontology concept, the external reference can provide a formal conceptual definition within
+ a framework of related concepts.</p><p>As a named entity, the external reference can provide a description of the real world entity the text intends
+ to convey. For instance, the word 'City' in 'I am going to the City' may be disambiguated in one of the WordNet
+ synsets that can be represented by 'city', an ontology concept of a City that could represent a subclass of a
+ “PopulatedPlace” in the conceptual granularity level, or the central area of a particular city, e.g. City of London,
+ as interpreted in the entity granularity level. Linked data networks, such as DBpedia, increasingly interlink ontological
+ and named entity definitions for the same things as authored in different languages, offering a mechanism to
+ locate translations from the source language description.</p><p>Two types of disambiguation are needed to identify:</p><ul><li><p>Disambiguation type class, which describes the type class of the underlying concept or entity of the fragment.</p></li><li><p>Disambiguation, which describes the actual underlying external resource that conveys the intended meaning of the fragment.</p></li></ul><p>Text analysis engines, such as named entity recognizers, named entity, concept and word sense disambiguation
+ components can offer an easy way to create this information. Content management tools can present and visualize
+ this information or use it to index their content. Machine translation systems may use it for training and translation
+ when dealing with proper names and edge cases.</p></div><div class="div3">
<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="Disambiguation-implementation" id="Disambiguation-implementation" shape="rect"/>6.10.2 Implementation</h4><p>The <a href="#Disambiguation" shape="rect">Disambiguation</a> data category can
be expressed with global rules, or locally on an individual element. The
information applies to the textual content of the element. There is no
- inheritance. The entity type follows inheritance rules.</p><p id="disambiguation-global">GLOBAL: The <code>disambiguationRule</code>
+ inheritance. The entity type follows inheritance rules.</p><span class="editor-note">[Ed. note: The two sentences above seem contradictory.]</span><p id="disambiguation-global">GLOBAL: The <code>disambiguationRule</code>
element contains the following:</p><ul><li><p>A required <code>selector</code> attribute. It contains an <a href="#selectors" shape="rect">absolute selector</a> which selects the
- nodes to which this rule applies.</p></li><li><p>An optional <code>entityTypeSourceRef</code> attribute that contains
- a URI specifying the concrete identifier data source (knowledge
- base, semantic network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeSourcePointer</code> attribute that
- contains a relative XPath expression pointing to a node that
- represents the identifier data source (knowledge base, semantic
- network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeSourceRefPointer</code> attribute that
- contains a relative XPath expression pointing to a node that holds
- the URI that represents the identifier data source (knowledge base,
- semantic network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeRef</code> attribute that contains a
- URI, specifying the entity type behind the selector.</p></li><li><p>An optional <code>entityTypePointer</code> attribute that contains a
- relative XPath expression pointing to a node specifying the entity
- type behind the selector.</p></li><li><p>An optional <code>entityTypeRefPointer</code> attribute that
- contains a relative XPath expression pointing to a node that holds
- the URI that specifies the entity type behind the selector.</p></li><li><p>An optional <code>disambigType</code> attribute that contains a
- string, specifying the specific semantics of the disambiguation. It
- can be one of "lexicalConcept", "ontologyConcept", or
- "entity".</p></li><li><p>An optional <code>disambigSourceRef</code> attribute. It contains a
- URI representing the disambiguation identifier collection
- source.</p></li><li><p>An optional <code>disambigSourcePointer</code> attribute. It
- contains a relative XPath expression pointing to a node that
- represents the disambiguation identifier collection source.</p></li><li><p>An optional <code>disambigSourceRefPointer</code> attribute. It
- contains a relative XPath expression pointing to a node that holds
- the URI that represents the disambiguation identifier collection
- source.</p></li><li><p>An optional <code>disambigIdentRef</code> attribute. It contains a
- URI that represents a unique identifier within the identifier
- collection.</p></li><li><p>An optional <code>disambigIdentPointer</code> attribute. It contains
- a relative XPath expression pointing to a node that represents a
- unique identifier within the identifier collection.</p></li><li><p>An optional <code>disambigIdentRefPointer</code> attribute. It
- contains a relative XPath expression pointing to a node that
- represents a unique identifier within the identifier
- collection.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-global-1" id="EX-disambiguation-global-1" shape="rect"/>Example 52: Usage of <code>entityTypeSourceRef</code>, <code>enttiyTypeRef</code>,
- <code>disambigSourceRef</code>, <code>disambigIdentRef</code> for
- both entity and word sense disambiguation.</div><div class="exampleInner"><pre xml:space="preserve">
+ nodes to which this rule applies.</p></li><li><p>None or exactly one of the following:
+ <ul><li><p>A <code>disambigClassRef</code> attribute that contains a URI, specifying the type class of the concept
+ or entity behind the selector.</p></li><li><p>A <code>disambigClassPointer</code> attribute that contains a <a href="#selectors" shape="rect">relative selector</a>
+ pointing to a node specifying the entity type class behind the selector.</p></li><li><p>A <code>disambigClassRefPointer</code> attribute that contains a <a href="#selectors" shape="rect">relative selector</a>
+ pointing to a node that holds a URI that specifies the entity type class behind the selector.</p></li></ul></p></li><li><p>An optional <code>disambigGranularity</code> attribute that contains a string, specifying the granularity
+ level of the disambiguation. The value can be one of the following identifiers: <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</p></li><li><p>An optional <code>disambigSource</code> attribute. It contains a string representing the disambiguation
+ identifier collection source.</p></li><li><p>None or exactly one of the following:
+ <ul><li><p>A <code>disambigIdent</code> attribute. It contains a string that represents the disambiguation
+ identifier for the disambiguation target that is valid within the specified Disambiguation Source.</p></li><li><p>A <code>disambigIdentRef</code> attribute. It contains a URI that represents a unique identifier
+ for the disambiguation target.</p></li><li><p>A <code>disambigIdentPointer</code> attribute. It contains a <a href="#selectors" shape="rect">relative selector</a>
+ pointing to a node that represents a unique identifier for the disambiguation target.</p></li><li><p>A <code>disambigIdentRefPointer</code> attribute. It contains a <a href="#selectors" shape="rect">relative selector</a>
+ pointing to a node that holds a URI that represents a unique identifier for the disambiguation target.</p></li></ul></p></li></ul><p>When using a disambiguation rule, the user <a href="#rfc2119" shape="rect">MUST</a> use one of the use cases for disambiguation:
+ specifying the target type, or specifying the target identity.
+ For the latter, the user <a href="#rfc2119" shape="rect">MUST</a> use only one of the two addressing modes:</p><ul><li><p>Using <code>disambigSource</code> and <code>disambigIdent</code> to specify the collection and the identifier itself.</p></li><li><p>Using one of <code>disambigIdentRef</code>, <code>disambigIdentPointer</code> or <code>disambigIdentRefPointer</code> to supply
+ a URI for the disambiguation target.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-global-1" id="EX-disambiguation-global-1" shape="rect"/>Example 52: Usage of <code>entityTypeSourceRef</code>, <code>entityTypeRef</code>, <code>disambigSourceRef</code>,
+ <code>disambigIdentRef</code> for both entity and word sense disambiguation.</div><div class="exampleInner"><pre xml:space="preserve">
<text
xmlns:its="http://www.w3.org/2005/11/its" >
<its:rules version="2.0">
@@ -2225,19 +2186,14 @@
</body>
</text></pre></div><p>[Source file: <a href="examples/xml/EX-disambiguation-global-1.xml" shape="rect">examples/xml/EX-disambiguation-global-1.xml</a>]</p></div><p id="disambiguation-local">LOCAL: The following local markup is
available for the <a href="#Disambiguation" shape="rect">Disambiguation</a>
- data category:</p><ul><li><p>An optional <code>entityTypeSourceRef</code> attribute that contains
- an URI specifying the concrete identifier data source (knowledge
- base, semantic network), used to determine the entity type.</p></li><li><p>An optional <code>entityTypeRef</code> attribute that contains a URI
- specifying the entity type behind the selector.</p></li><li><p>An optional <code>disambigType</code> attribute that contains a
- string, specifying the specific semantics of the disambiguation. It
- can be one of "lexicalConcept", "ontologyConcept", or
- "entity".</p></li><li><p>An optional <code>disambigSourceRef</code> attribute. It contains a
- URI representing the disambiguation identifier collection
- source.</p></li><li><p>An optional <code>disambigIdentRef</code> attribute. It contains a
- URI that represents a unique identifier within the identifier
- collection.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-local-1" id="EX-disambiguation-html5-local-1" shape="rect"/>Example 53: Local mixed usage of <code>entityTypeSourceRef</code>,
- <code>enttiyTypeRef</code>, <code>disambigSourceRef</code>,
- <code>disambigIdentRef</code> in HTML.</div><div class="exampleInner"><pre xml:space="preserve"><!DOCTYPE html>
+ data category:</p><ul><li><p>An optional <code>disambigClassRef</code> attribute that contains a URI, specifying the type class
+ of the concept or entity behind the selector.</p></li><li><p>An optional <code>disambigGranularity</code> attribute that contains a string, specifying the
+ granularity level of the disambiguation. The value can be one of the following identifiers: <code>lexicalConcept</code>, <code>ontologyConcept</code>, or <code>entity</code>.</p></li><li><p>An optional <code>disambigSource</code> attribute. It contains a string representing the
+ disambiguation identifier collection source.</p></li><li><p>An optional <code>disambigIdent</code> attribute. It contains a string, representing the
+ disambiguation identifier for the disambiguation target that is valid within the specified Disambiguation Source.</p></li><li><p>An optional <code>disambigIdentRef</code> attribute. It contains a URI that represents a unique
+ identifier for the disambiguation target.</p></li></ul><p>The user <a href="#rfc2119" shape="rect">MUST</a> use only one of the two addressing modes for disambiguation:</p><ul><li><p>Using <code>disambigSource</code> and <code>disambigIdent</code> to specify the collection
+ and the identifier itself.</p></li><li><p>Using <code>disambigIdentRef</code> to supply a URI for the disambiguation target.</p></li></ul><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-local-1" id="EX-disambiguation-html5-local-1" shape="rect"/>Example 53: Local mixed usage of <code>entityTypeSourceRef</code>, <code>entityTypeRef</code>,
+ <code>disambigSourceRef</code>, <code>disambigIdentRef</code> in HTML.</div><div class="exampleInner"><pre xml:space="preserve"><!DOCTYPE html>
<html lang=en>
<head>
<meta charset=utf-8>
@@ -2255,20 +2211,13 @@
its-disambig-source-ref=http://www.w3.org/2006/03/wn/wn20/rdf/wordnet-synset.rdf
its-disambig-type=lexicalConcept>capital</span> of Ireland.</p>
</body>
- </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-local-1.html" shape="rect">examples/html5/EX-disambiguation-html5-local-1.html</a>]</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p>While the <code>entityTypeSourceRef</code> attribute allows for an
- arbitrary domain of entity types, the implementors are encouraged to
- use an existing repository of entity types as long as they satisfy
- their requirements. For example, the Named Entity Recognition and
- Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p><p>The distinction between disambiguating word sense and entities is
- mainly in the different semantics: whereas word sense disambiguation
- targets literal words and their senses on the lexical level, entity
- disambiguation targets real-world concepts that are behind the
- selected phrases on the conceptual level.</p><p>When serializing the ITS markup in HTML5, the preferred way is to
- serialize in RDFa Lite or Microdata due to the existing search and
- crawling infrastructure that is able to consume this kind of
- data.</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa" id="EX-disambiguation-html5-rdfa" shape="rect"/>Example 54: Local mixed usage of <code>entityTypeSourceRef</code>,
- <code>entityTypeRef</code>, <code>disambigSourceRef</code>,
- <code>disambigIdentRef</code> in HTML+RDFa Lite</div><p>See <a href="#EX-disambiguation-html5-rdfa-companion-document" shape="rect">Example 55</a> for the companion document with the mapping
+ </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-local-1.html" shape="rect">examples/html5/EX-disambiguation-html5-local-1.html</a>]</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p>For referring to <code>disambigClassRef</code> values, implementors are encouraged to use an existing
+ repository of entity types as long as they satisfy their requirements. For example,
+ the Named Entity Recognition and Disambiguation ontology (NERD): http://nerd.eurecom.fr/ontology</p><p>Furthermore, valid target types depend on the disambiguation granularity: types of entities are distinct
+ from types of lexical concepts or ontology concepts. While this distinction exists, the specification does not prescribe
+ a way of automatically inferring a disambiguation level from a target type.</p><p>When serializing the ITS markup in HTML5, the preferred way is to serialize in RDFa Lite or Microdata due
+ to the existing search and crawling infrastructure that is able to consume this kind of data.</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa" id="EX-disambiguation-html5-rdfa" shape="rect"/>Example 54: Local mixed usage of <code>entityTypeSourceRef</code>, <code>entityTypeRef</code>, <code>disambigSourceRef</code>,
+ <code>disambigIdentRef</code> in HTML+RDFa Lite.</div><p>See <a href="#EX-disambiguation-html5-rdfa-companion-document" shape="rect">Example 55</a> for the companion document with the mapping
data.</p><div class="exampleInner"><pre xml:space="preserve"><!DOCTYPE html>
<html lang=en>
<head>
@@ -2279,9 +2228,7 @@
<p>
<span property=name resource=http://dbpedia.org/resource/Dublin typeof=http://nerd.eurecom.fr/ontology#Place>Dublin</span> is the capital of Ireland.</p>
</body>
- </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-rdfa.html" shape="rect">examples/html5/EX-disambiguation-html5-rdfa.html</a>]</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa-companion-document" id="EX-disambiguation-html5-rdfa-companion-document" shape="rect"/>Example 55: Local mixed usage of <code>entityTypeSourceRef</code>,
- <code>entityTypeRef</code>, <code>disambigSourceRef</code>,
- <code>disambigIdentRef</code> in HTML+RDFa Lite</div><p>Companion document, having the mapping data for <a href="#EX-disambiguation-html5-rdfa" shape="rect">Example 54</a>.</p><div class="exampleInner"><pre xml:space="preserve">
+ </html></pre></div><p>[Source file: <a href="examples/html5/EX-disambiguation-html5-rdfa.html" shape="rect">examples/html5/EX-disambiguation-html5-rdfa.html</a>]</p></div><div class="exampleOuter"><div class="exampleHeader"><a name="EX-disambiguation-html5-rdfa-companion-document" id="EX-disambiguation-html5-rdfa-companion-document" shape="rect"/>Example 55: Companion document, having the mapping data for <a href="#EX-disambiguation-html5-rdfa" shape="rect">Example 54</a>.</div><div class="exampleInner"><pre xml:space="preserve">
<its:rules
xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
<its:disambiguationRule selector="//*[@typeof]" entityTypeRefPointer="@typeof"/>
@@ -3765,7 +3712,7 @@
<em>This section is informative.</em>
</p><p>Several constraints of ITS markup cannot be validated with ITS schemas. The
following <a title="Rule-based validation
							-- Schematron" href="#schematron" shape="rect">[Schematron]</a> document allows for
- validating some of these constraints.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9185" id="d3e9185" shape="rect"/>Example 97: Testing constraints in ITS markup</div><div class="exampleInner"><pre xml:space="preserve">
+ validating some of these constraints.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9219" id="d3e9219" shape="rect"/>Example 97: Testing constraints in ITS markup</div><div class="exampleInner"><pre xml:space="preserve">
<sch:schema
xmlns:sch="http://www.ascc.net/xml/schematron" >
<!-- Schematron document to test constraints for global and local ITS markup.
@@ -3833,7 +3780,7 @@
</p><p>The following <a title="Namespace-based Validation
							Dispatching Language (NVDL)" href="#nvdl" shape="rect">[NVDL]</a> document allows validation of
ITS markup which has been added to a host vocabulary. Only ITS elements and
attributes are checked. Elements and attributes of host language are ignored
- during validation against this NVDL document/schema.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9207" id="d3e9207" shape="rect"/>Example 98: NVDL schema for ITS</div><div class="exampleInner"><pre xml:space="preserve">
+ during validation against this NVDL document/schema.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e9241" id="d3e9241" shape="rect"/>Example 98: NVDL schema for ITS</div><div class="exampleInner"><pre xml:space="preserve">
<nvdl:rules
xmlns:nvdl="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" >
<nvdl:namespace ns="http://www.w3.org/2005/11/its">
Received on Tuesday, 16 October 2012 13:17:15 UTC