its20 CVS commit from Yves Savourel via cvs-syncmail on 2012-10-10 (public-multilingualweb-lt-commits@w3.org from October 2012)

From: Yves Savourel via cvs-syncmail <cvsmail@w3.org>
Date: Wed, 10 Oct 2012 18:22:19 +0000
To: public-multilingualweb-lt-commits@w3.org
Message-Id: <E1TM0vD-00031t-Ke@lionel-hutz.w3.org>
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20
In directory hutz:/tmp/cvs-serv11585

Modified Files:
	its20.html its20.odd 
Log Message:
Added algorithm for Domain value.
Added wording about the dot-all assumption for Allowed Characters.

Index: its20.odd
===================================================================
RCS file: /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd,v
retrieving revision 1.176
retrieving revision 1.177
diff -u -d -r1.176 -r1.177
--- its20.odd	10 Oct 2012 12:03:10 -0000	1.176
+++ its20.odd	10 Oct 2012 18:22:17 -0000	1.177
@@ -2812,20 +2812,23 @@
 					<head>Domain</head>
 					<div xml:id="domain-definition">
 						<head>Definition</head>
-						<p>The <ref target="#domain">Domain</ref> data category is used to identify
-							the domain of content.</p>
+						<p>The <ref target="#domain">Domain</ref> data category is used to identify the topic or subject of a given content.
+							Such information allows more relevant linguistic choices to be made during various processes.</p>
+						<p>Examples of usage include:</p>
+						<list type="unordered">
+							<item>Allowing machine translation systems to select the most appropriate engine and rules to translate the content.</item>
+							<item>Providing a general indication of what terminology collection should be used by a translator.</item>
+						</list>
 						<p>This data category addresses various challenges:</p>
 						<list type="unordered">
-							<item>Often domain related information in content does exist, e.g.
-								keywords in the HTML <code>meta</code> element. The <ref
-									target="#domain">Domain</ref> data category addresses this by
-								providing a mechanism to point to this information.</item>
+							<item>Often domain-related information already exists in the document (e.g.
+								keywords in the HTML <code>meta</code> element). The <ref
+									target="#domain">Domain</ref> data category provides a mechanism to point to this information.</item>
 							<item>There are many flat or structured lists of domain related values,
-								keywords, key phrases, classification codes, ontologies. The <ref
-									target="#domain">Domain</ref> data category does not propose a
-								given list; rather it provides a mapping mechanism to associate
-								values in content with consumer tool specific values needed for
-								processing domain information.</item>
+								keywords, key phrases, classification codes, ontologies, etc. The <ref
+									target="#domain">Domain</ref> data category does not propose its own
+								list. Instead, it provides a mapping mechanism to associate
+								the values in the document with the values used by the consumer tool.</item>
 						</list>
 					</div>
 					<div xml:id="domain-implementation">
@@ -2835,6 +2838,60 @@
 								target="#def-inheritance">inherits</ref> to the textual content of
 							the element, <emph>including</emph> child elements and attributes. There
 							is no default.</p>
+						
+						<p>The information provided by this data category is a comma-separated list of one or more values which is obtained by applying the following algorithm:</p>
+						<list type="ordered">
+							<item>Set the initial value of the resulting string to an empty string.</item>
+							<item>Get the list of nodes resulting from the evaluation of the <att>domainPointer</att> attribute.</item>
+							<item>For each node:
+								<list type="ordered">
+									<item>If the node value contains a COMMA (U+002C):
+										<list type="ordered">
+											<item>Split the node value into separate strings using the COMMA (U+002C) as separator.</item>
+											<item>For each string:
+												<list type="ordered">
+													<item>Trim the leading and trailing white spaces of the string.</item>
+													<item>Check if there is a mapping for the string:
+														<list type="ordered">
+															<item>If one is found:
+																<list type="ordered" >
+																	<item>Add the corresponding value to the result string.</item>
+																</list>
+															</item>
+															<item>Otherwise (if no mapping is found):
+																<list type="ordered" >
+																	<item>Add the string to the result string.</item>
+																</list>
+															</item>
+														</list>
+													</item>
+												</list>
+											</item>
+										</list>
+									</item>
+									<item>If the node value does not contain a COMMA (U+002C):
+										<list type="ordered">
+											<item>Trim the leading and trailing white spaces of the string.</item>
+											<item>Check if there is a mapping for the string:
+												<list type="ordered">
+													<item>If one is found:
+														<list type="ordered" >
+															<item>Add the corresponding value to the result string.</item>
+														</list>
+													</item>
+													<item>Otherwise (if no mapping is found):
+														<list type="ordered" >
+															<item>Add the string to the result string.</item>
+														</list>
+													</item>
+												</list>
+											</item>
+										</list>
+									</item>
+								</list>									
+							</item>
+							<item>Return the resulting string.</item>
+						</list>
 
 						<p xml:id="domain-global">GLOBAL: The <gi>domainRule</gi> element contains
 							the following:</p>
@@ -4163,7 +4220,8 @@
 						<p>The regular expression is a character class construct as defined in the
 							section <ref target="http://www.w3.org/TR/xmlschema-2/#charcter-classes"
 								>Character Classes</ref> of XML Schema <ptr target="#xmlschema2"
-								type="bibref"/>.</p>
+									type="bibref"/>, with the assumption that the <code>.</code> metacharacter also matches CARRIAGE RETURN (U+000D) and LINE FEED (U+000A),
+							that is, with the <emph>dot-all</emph> option set.</p>
 						<p>Example of expressions (shown as XML source):</p>
 						<list type="unordered">
 							<item><code>[abc]</code> : allows the characters 'a', 'b' and

Index: its20.html
===================================================================
RCS file: /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.html,v
retrieving revision 1.179
retrieving revision 1.180
diff -u -d -r1.179 -r1.180
--- its20.html	10 Oct 2012 12:03:10 -0000	1.179
+++ its20.html	10 Oct 2012 18:22:17 -0000	1.180
@@ -125,7 +125,7 @@
 </div>
 </div>
 <div class="toc1">7 <a href="#html5-markup" shape="rect">Using ITS Markup in HTML5</a><div class="toc2">7.1 <a href="#html5-local-attributes" shape="rect">Mapping of Local Data Categories to HTML5</a></div>
-<div class="toc2">7.2 <a href="#d3e7499" shape="rect">Inline Global Rules in HTML5</a></div>
+<div class="toc2">7.2 <a href="#d3e7590" shape="rect">Inline Global Rules in HTML5</a></div>
 </div>
 <div class="toc1">8 <a href="#xhtml5-markup" shape="rect">Using ITS Markup in XHTML</a></div>
 </div>
@@ -1929,18 +1929,26 @@
   &lt;/body&gt;
  &lt;/html&gt;</pre></div><p>[Source file: <a href="examples/html5/EX-within-text-local-html5-1.html" shape="rect">examples/html5/EX-within-text-local-html5-1.html</a>]</p></div></div></div><div class="div2">
 <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="domain" id="domain" shape="rect"/>6.9 Domain</h3><div class="div3">
-<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="domain-definition" id="domain-definition" shape="rect"/>6.9.1 Definition</h4><p>The <a href="#domain" shape="rect">Domain</a> data category is used to identify
-							the domain of content.</p><p>This data category addresses various challenges:</p><ul><li><p>Often domain related information in content does exist, e.g.
-								keywords in the HTML <code>meta</code> element. The <a href="#domain" shape="rect">Domain</a> data category addresses this by
-								providing a mechanism to point to this information.</p></li><li><p>There are many flat or structured lists of domain related values,
-								keywords, key phrases, classification codes, ontologies. The <a href="#domain" shape="rect">Domain</a> data category does not propose a
-								given list; rather it provides a mapping mechanism to associate
-								values in content with consumer tool specific values needed for
-								processing domain information.</p></li></ul></div><div class="div3">
+<h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="domain-definition" id="domain-definition" shape="rect"/>6.9.1 Definition</h4><p>The <a href="#domain" shape="rect">Domain</a> data category is used to identify the topic or subject of a given content.
+							Such information allows more relevant linguistic choices to be made during various processes.</p><p>Examples of usage include:</p><ul><li><p>Allowing machine translation systems to select the most appropriate engine and rules to translate the content.</p></li><li><p>Providing a general indication of what terminology collection should be used by a translator.</p></li></ul><p>This data category addresses various challenges:</p><ul><li><p>Often domain-related information already exists in the document (e.g.
+								keywords in the HTML <code>meta</code> element). The <a href="#domain" shape="rect">Domain</a> data category provides a mechanism to point to this information.</p></li><li><p>There are many flat or structured lists of domain related values,
+								keywords, key phrases, classification codes, ontologies, etc. The <a href="#domain" shape="rect">Domain</a> data category does not propose its own
+								list. Instead, it provides a mapping mechanism to associate
+								the values in the document with the values used by the consumer tool.</p></li></ul></div><div class="div3">
 <h4><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="domain-implementation" id="domain-implementation" shape="rect"/>6.9.2 Implementation</h4><p>The <a href="#domain" shape="rect">Domain</a> data category can be expressed
 							only with global rules. For elements, the data category information <a href="#def-inheritance" shape="rect">inherits</a> to the textual content of
 							the element, <em>including</em> child elements and attributes. There
-							is no default.</p><p id="domain-global">GLOBAL: The <code>domainRule</code> element contains
+							is no default.</p><p>The information provided by this data category is a comma-separated list of one or more values which is obtained by applying the following algorithm:</p><ol class="depth1"><li><p>Set the initial value of the resulting string to an empty string.</p></li><li><p>Get the list of nodes resulting from the evaluation of the <code>domainPointer</code> attribute.</p></li><li><p>For each node:
+								</p><ol class="depth2"><li><p>If the node value contains a COMMA (U+002C):
+										</p><ol class="depth3"><li><p>Split the node value into separate strings using the COMMA (U+002C) as separator.</p></li><li><p>For each string:
+												</p><ol class="depth4"><li><p>Trim the leading and trailing white spaces of the string.</p></li><li><p>Check if there is a mapping for the string:
+														</p><ol class="depth5"><li><p>If one is found:
+																</p><ol class="depth1"><li><p>Add the corresponding value to the result string.</p></li></ol><p/></li><li><p>Otherwise (if no mapping is found):
+																</p><ol class="depth1"><li><p>Add the string to the result string.</p></li></ol><p/></li></ol><p/></li></ol><p/></li></ol><p/></li><li><p>If the node value does not contain a COMMA (U+002C):
+										</p><ol class="depth3"><li><p>Trim the leading and trailing white spaces of the string.</p></li><li><p>Check if there is a mapping for the string:
+												</p><ol class="depth4"><li><p>If one is found:
+														</p><ol class="depth5"><li><p>Add the corresponding value to the result string.</p></li></ol><p/></li><li><p>Otherwise (if no mapping is found):
+														</p><ol class="depth5"><li><p>Add the string to the result string.</p></li></ol><p/></li></ol><p/></li></ol><p/></li></ol><p/></li><li><p>Return the resulting string.</p></li></ol><p id="domain-global">GLOBAL: The <code>domainRule</code> element contains
 							the following:</p><ul><li><p>A required <code>selector</code> attribute. It contains an <a href="#selectors" shape="rect">absolute selector</a> which selects the
 								nodes to which this rule applies.</p></li><li><p>A required <code>domainPointer</code> attribute that contains a <a href="#selectors" shape="rect">relative selector</a> pointing to a node
 								that contains the domain information.</p></li><li><p>An optional <code>domainMapping</code> attribute that contains a
@@ -2026,8 +2034,7 @@
 								levels. For instance, the level of lexical concepts disambiguates
 								individual word surface forms, the level of ontology concepts
 								disambiguates into deeper semantics, and the entity disambiguation
-								works on the level of concrete instances. For instance, the word
-									"<span class="quote">City</span>" in "<span class="quote">I am going to the City</span>" may
+								works on the level of concrete instances. For instance, the word "<span class="quote">City</span>" in "<span class="quote">I am going to the City</span>" may
 								be disambiguated in one of the WordNet synsets that can be
 								represented by "<span class="quote">city</span>", an RDF ontology concept of a
 								City that could represent a subclass of a PopulatedPlace, or the
@@ -3047,7 +3054,8 @@
 								login name in a content.</p></li></ul><p>The set of characters that are allowed is specified using a regular
 							expression. That is, each character in the selected content <a href="#rfc-keywords" shape="rect">MUST</a> be included in the set specified
 							by the regular expression.</p><p>The regular expression is a character class construct as defined in the
-							section <a href="http://www.w3.org/TR/xmlschema-2/#charcter-classes" shape="rect">Character Classes</a> of XML Schema <a title="XML&#xA;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;Schema Part 2: Datatypes Second Edition" href="#xmlschema2" shape="rect">[XML Schema Part 2]</a>.</p><p>Example of expressions (shown as XML source):</p><ul><li><p><code>[abc]</code> : allows the characters 'a', 'b' and
+							section <a href="http://www.w3.org/TR/xmlschema-2/#charcter-classes" shape="rect">Character Classes</a> of XML Schema <a title="XML&#xA;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;Schema Part 2: Datatypes Second Edition" href="#xmlschema2" shape="rect">[XML Schema Part 2]</a>, with the assumption that the <code>.</code> metacharacter also matches CARRIAGE RETURN (U+000D) and LINE FEED (U+000A),
+							that is, with the <em>dot-all</em> option set.</p><p>Example of expressions (shown as XML source):</p><ul><li><p><code>[abc]</code> : allows the characters 'a', 'b' and
 								'c'.</p></li><li><p><code>[a-c]</code> : allows the characters 'a', 'b' and
 								'c'.</p></li><li><p><code>[a-zA-Z]</code> : allows the characters from 'a' to 'z' and
 								from 'A' to 'Z'.</p></li><li><p><code>[^[abc]</code> : allows any characters except 'a', 'b', and
@@ -3235,7 +3243,7 @@
  &lt;/html&gt;</pre></div><p>[Source file: <a href="examples/html5/EX-storageSize-html5-local-1.html" shape="rect">examples/html5/EX-storageSize-html5-local-1.html</a>]</p></div></div></div></div><div class="div1">
 <h2><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-markup" id="html5-markup" shape="rect"/>7 Using ITS Markup in HTML5</h2><div class="div2">
 <h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="html5-local-attributes" id="html5-local-attributes" shape="rect"/>7.1 Mapping of Local Data Categories to HTML5</h3><span class="editor-note">[Ed. note: camelCase -&gt; its-*; special mapping of @lang, @translate and @dir]</span><span class="editor-note">[Ed. note: Case sensitivity]</span></div><div class="div2">
-<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d3e7499" id="d3e7499" shape="rect"/>7.2 Inline Global Rules in HTML5</h3><span class="editor-note">[Ed. note: Constraints on using rules inside script]</span></div></div><div class="div1">
+<h3><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d3e7590" id="d3e7590" shape="rect"/>7.2 Inline Global Rules in HTML5</h3><span class="editor-note">[Ed. note: Constraints on using rules inside script]</span></div></div><div class="div1">
 <h2><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="xhtml5-markup" id="xhtml5-markup" shape="rect"/>8 Using ITS Markup in XHTML</h2><span class="editor-note">[Ed. note: Guidance about using camelCase/its-camel-case w/respect to DOM representation and consistency with HTML parsing]</span><span class="editor-note">[Ed. note: Guidance about inline global rules]</span></div></div><div class="back"><div class="div1">
 <h2><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="normative-references" id="normative-references" shape="rect"/>A References</h2><dl><dt class="label"><a name="bcp47" id="bcp47" shape="rect"/>BCP47</dt><dd>Addison Phillips, Mark Davis. <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt" shape="rect"><cite>Tags for
 								Identifying Languages</cite></a>, September 2009. Available at <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt" shape="rect">
@@ -3571,7 +3579,7 @@
             <em>This section is informative.</em>
          </p><p>Several constraints of ITS markup cannot be validated with ITS schemas. The
 					following <a title="Rule-based validation&#xA;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;-- Schematron" href="#schematron" shape="rect">[Schematron]</a> document allows for
-					validating some of these constraints.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e8492" id="d3e8492" shape="rect"/>Example 94: Testing constraints in ITS markup</div><div class="exampleInner"><pre xml:space="preserve">
+					validating some of these constraints.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e8583" id="d3e8583" shape="rect"/>Example 94: Testing constraints in ITS markup</div><div class="exampleInner"><pre xml:space="preserve">
 &lt;sch:schema
   xmlns:sch="http://www.ascc.net/xml/schematron" &gt;
 &lt;!-- Schematron document to test constraints for global and local ITS markup.
@@ -3639,7 +3647,7 @@
          </p><p>The following <a title="Namespace-based Validation&#xA;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;Dispatching Language (NVDL)" href="#nvdl" shape="rect">[NVDL]</a> document allows validation of
 					ITS markup which has been added to a host vocabulary. Only ITS elements and
 					attributes are checked. Elements and attributes of host language are ignored
-					during validation against this NVDL document/schema.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e8514" id="d3e8514" shape="rect"/>Example 95: NVDL schema for ITS</div><div class="exampleInner"><pre xml:space="preserve">
+					during validation against this NVDL document/schema.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d3e8605" id="d3e8605" shape="rect"/>Example 95: NVDL schema for ITS</div><div class="exampleInner"><pre xml:space="preserve">
 &lt;nvdl:rules
   xmlns:nvdl="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" &gt;
  &lt;nvdl:namespace ns="http://www.w3.org/2005/11/its"&gt;
Received on Wednesday, 10 October 2012 18:22:24 UTC
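
Note on the Domain value algorithm added in this commit: the steps can be sketched in Python as below. This is a minimal, non-normative sketch and not part of the committed text. The function name resolve_domain_values, the mapping dictionary (standing in for the pairs declared in domainMapping), the ", " separator used when adding to the result, and the skipping of empty strings are all assumptions made for illustration. The input is assumed to be the list of string values of the nodes selected by domainPointer.

```python
from typing import Dict, Iterable, Optional


def resolve_domain_values(
    node_values: Iterable[str],
    mapping: Optional[Dict[str, str]] = None,
) -> str:
    """Build the comma-separated Domain value from selected node values.

    node_values: string values of the nodes selected by domainPointer
                 (assumed to be already extracted by the caller).
    mapping:     hypothetical dict standing in for domainMapping pairs.
    """
    mapping = mapping or {}
    results = []  # collected values; joined into the result string at the end

    for value in node_values:
        # Splitting on COMMA (U+002C) covers both branches of the algorithm:
        # a value without a comma simply yields a single string.
        for part in value.split(","):
            token = part.strip()  # trim leading and trailing white space
            if not token:
                # Assumption: empty strings are skipped (the text is silent).
                continue
            # Use the mapped value if a mapping exists, else the string itself.
            results.append(mapping.get(token, token))

    # Assumption: values are joined with a comma followed by a space.
    return ", ".join(results)
```

For example, resolve_domain_values(["automotive, cars", " law "], {"cars": "auto"}) walks two node values, splits the first on the comma, trims each string, and replaces "cars" with its mapped value.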