CVS WWW/International/multilingualweb/lt/drafts/its20

From: CVS User fsasaki <cvsmail@w3.org>
Date: Fri, 06 Sep 2013 11:09:51 +0000
Message-Id: <E1VHtvD-0001Vk-8q@gil.w3.org>
To: public-multilingualweb-lt-commits@w3.org
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20
In directory gil:/tmp/cvs-serv5804

Modified Files:
	its20.html its20.odd 
Log Message:
implemented further nif section fixes, see http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Sep/0027.html 

--- /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.html	2013/09/06 10:11:40	1.494
+++ /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.html	2013/09/06 11:09:50	1.495
@@ -5594,7 +5594,7 @@
                <a href="http://www.xulplanet.com/" shape="rect"><cite>exTensible User Interface Language</cite></a>. Available at <a href="http://www.xulplanet.com/" shape="rect">
               http://www.xulplanet.com/</a>.</dd></dl></div><div class="div1">
 <h2><a href="#contents" shape="rect"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="conversion-to-nif" id="conversion-to-nif" shape="rect"/>F Conversion to NIF (Non-Normative)</h2><p>This section provides an informative algorithm to convert XML or HTML documents (or their DOM
-          representations) that contain ITS metadata to the RDF format based on <a title="" href="#nif-reference" shape="rect">[NIF]</a>. The conversion results in RDF triples.</p><div class="note"><p class="prefix"><b>Note:</b></p><p>The algorithm is intended to extract the text from the XML/HTML/DOM for an NLP tool. It can
+          representations) that contain ITS metadata to the RDF format based on <a title="" href="#nif-reference" shape="rect">[NIF]</a>. The conversion results in RDF triples.</p><div class="note"><p class="prefix"><b>Note:</b></p><p>The algorithm creates URIs that in the query part contain the characters "[" and "]", as part of XPath expressions. In the conversion output (see an <a href="examples/nif/EX-nif-conversion-output.xml" shape="rect">example</a>), The URIs are escaped as "%5B" and "%5D". For readability the URIs shown in this section do not escape these characters.</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p>The algorithm is intended to extract the text from the XML/HTML/DOM for an NLP tool. It can
           produce a lot of "<span class="quote">phantom</span>" predicates from excessive whitespace, which 1)
           increases the size of the intermediate mapping and 2) extracts this whitespace as
           text, and therefore might decrease NLP performance. It is strongly recommended to
@@ -5681,7 +5681,7 @@
     nif:beginIndex	 "0" ;
     nif:endIndex	 "29" ;
     itsrdf:translate     "yes";
-    nif:sourceUrl      &lt;http://example.com/exampledoc.html&gt; .
+    nif:sourceUrl      &lt;http://example.com/doc.html&gt; .
 &lt;http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;char=11,17&gt; 
     rdf:type              nif:RFC5147String ;
     nif:beginIndex	 "11" ;
@@ -5698,14 +5698,12 @@
 </pre></div></div><p>A complete sample output in RDF/XML format after step 7, given the input document <a href="#EX-HTML-whitespace-normalization" shape="rect">Example 97</a>, is available at <a href="examples/nif/EX-nif-conversion-output.xml" shape="rect">examples/nif/EX-nif-conversion-output.xml</a>.</p><div class="note"><p class="prefix"><b>Note:</b></p><p>The conversion to NIF is a possible basis for a natural language processing (NLP) application
           that creates, for example, named entity annotations. A non-normative algorithm to
           integrate these annotations into the original input document is given in <a class="section-ref" href="#nif-backconversion" shape="rect">Appendix G: Conversion NIF2ITS</a>. Many decisions to be made in this algorithm 
-          depend on the particular NLP application being used.</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p>NIF allows URL for a String resource to be referenced as URIs 
+          depend on the particular NLP application being used.</p></div><div class="note"><p class="prefix"><b>Note:</b></p><p>NIF allows an URL for a String resource to be referenced as URIs 
             that are fragments of the original document in the form:<br clear="none"/><code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html#char=0,11</code>
-               <br clear="none"/>or
-            <br clear="none"/><code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;xpath=/html/body[1]/h2[1]/text()[1]</code>
+               <br clear="none"/>or<br clear="none"/><code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;xpath=/html/body[1]/h2[1]/text()[1]</code>
                <br clear="none"/>
-            
             This offers a convenient mechanism for linking NIF resources in RDF back 
-            to the original document. RDF treats URIs as opaque and does not impose 
+            to the original document. The <a href="http://persistence.uni-leipzig.org/nlp2rdf/specification/api.html" shape="rect">NIF Web Service Access Specification</a> defines the parameters for NIF web services.</p><p>RDF treats URIs as opaque and does not impose 
             any semantic constraints on the used fragment identifiers, thus enabling 
             their usage in RDF in a consistent manner. However, fragment identifiers 
             get interpreted according to the retrieved mime type, if a retrieval 
@@ -5743,7 +5741,7 @@
 &lt;http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;char=21,28&gt; 
  itsrdf:taIdentRef  &lt;http://dbpedia.org/resource/Ireland&gt; .
 # we can attach the metadata to the parent node:
-&lt;b its-ta-ident-ref="http://dbpedia.org/resource/Dublin" 
+&lt;b its-ta-ident-ref="http://dbpedia.org/resource/Ireland" 
    translate="no"&gt;Ireland&lt;/b&gt;
 </pre></div></div><p>CASE 2: The NLP annotation created in NIF is a substring of the text node. Solution:
           Create a new element, e.g., for HTML "span". A different input example is given below as
--- /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd	2013/09/06 10:11:40	1.509
+++ /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd	2013/09/06 11:09:51	1.510
@@ -5536,6 +5536,7 @@
         <p>This section provides an informative algorithm to convert XML or HTML documents (or their DOM
           representations) that contain ITS metadata to the RDF format based on <ptr
             target="#nif-reference" type="bibref"/>. The conversion results in RDF triples.</p>
+        <note> <p>The algorithm creates URIs that in the query part contain the characters "[" and "]", as part of XPath expressions. In the conversion output (see an <ref target="examples/nif/EX-nif-conversion-output.xml">example</ref>), The URIs are escaped as "%5B" and "%5D". For readability the URIs shown in this section do not escape these characters.</p></note>
         <note><p>The algorithm is intended to extract the text from the XML/HTML/DOM for an NLP tool. It can
           produce a lot of <quote>phantom</quote> predicates from excessive whitespace, which 1)
           increases the size of the intermediate mapping and 2) extracts this whitespace as
@@ -5645,7 +5646,7 @@
     nif:beginIndex	 "0" ;
     nif:endIndex	 "29" ;
     itsrdf:translate     "yes";
-    nif:sourceUrl      <http://example.com/exampledoc.html> .
+    nif:sourceUrl      <http://example.com/doc.html> .
 <http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html&char=11,17> 
     rdf:type              nif:RFC5147String ;
     nif:beginIndex	 "11" ;
@@ -5670,14 +5671,12 @@
             target="#nif-backconversion" type="specref"/>. Many decisions to be made in this algorithm 
           depend on the particular NLP application being used.</p></note>
         <note>
-          <p>NIF allows URL for a String resource to be referenced as URIs 
+          <p>NIF allows an URL for a String resource to be referenced as URIs 
             that are fragments of the original document in the form:<?br?>
-            <code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html#char=0,11</code>
-            <?br?>or
-            <?br?><code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;xpath=/html/body[1]/h2[1]/text()[1]</code><?br?>
-            
+            <code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html#char=0,11</code>    <?br?>or<?br?><code>http://example.com/myitsservice?informat=html&amp;intype=url&amp;input=http://example.com/doc.html&amp;xpath=/html/body[1]/h2[1]/text()[1]</code><?br?>
             This offers a convenient mechanism for linking NIF resources in RDF back 
-            to the original document. RDF treats URIs as opaque and does not impose 
+            to the original document. The <ref target="http://persistence.uni-leipzig.org/nlp2rdf/specification/api.html">NIF Web Service Access Specification</ref> defines the parameters for NIF web services.</p>
+          <p>RDF treats URIs as opaque and does not impose 
             any semantic constraints on the used fragment identifiers, thus enabling 
             their usage in RDF in a consistent manner. However, fragment identifiers 
             get interpreted according to the retrieved mime type, if a retrieval 
@@ -5738,7 +5737,7 @@
 <http://example.com/myitsservice?informat=html&intype=url&input=http://example.com/doc.html&char=21,28> 
  itsrdf:taIdentRef  <http://dbpedia.org/resource/Ireland> .
 # we can attach the metadata to the parent node:
-<b its-ta-ident-ref="http://dbpedia.org/resource/Dublin" 
+<b its-ta-ident-ref="http://dbpedia.org/resource/Ireland" 
    translate="no">Ireland</b>
 ]]></eg>
         <p>CASE 2: The NLP annotation created in NIF is a substring of the text node. Solution:
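The note added in this commit says that "[" and "]" inside the XPath part of the generated URIs appear as "%5B" and "%5D" in the conversion output. The following is a minimal sketch of that escaping, not the code used to produce the example output. The function name `escape_brackets` and the `SAFE` character set are assumptions made for illustration, chosen only to fit the sample URI from the diff.

```python
from urllib.parse import quote, unquote

# Characters left unescaped. "[" and "]" are deliberately absent so that
# quote() percent-encodes them. This set is an assumption that fits the
# sample URI below; it is not taken from the ITS 2.0 or NIF specifications.
SAFE = ":/?&=(),#"


def escape_brackets(uri: str) -> str:
    """Percent-encode "[" and "]" in an unescaped, readable URI (hypothetical helper)."""
    return quote(uri, safe=SAFE)


# Readable form, as shown in the specification text.
readable = (
    "http://example.com/myitsservice?informat=html&intype=url"
    "&input=http://example.com/doc.html"
    "&xpath=/html/body[1]/h2[1]/text()[1]"
)

# Escaped form, as it would appear in the serialized conversion output.
escaped = escape_brackets(readable)
print(escaped)

# unquote() restores the readable form.
print(unquote(escaped) == readable)
```

The helper expects the readable form as input; passing a URI that is already escaped would encode its "%" characters a second time.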
Received on Friday, 6 September 2013 11:09:52 UTC
