- From: CVS User jkosek <cvsmail@w3.org>
- Date: Sat, 01 Jun 2013 14:35:49 +0000
- To: public-multilingualweb-lt-commits@w3.org
Update of /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20
In directory gil:/tmp/cvs-serv23437
Modified Files: its20.odd
Log Message: Edits from Arle for section 5
--- /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd 2013/06/01 11:39:34 1.434
+++ /w3ccvs/WWW/International/multilingualweb/lt/drafts/its20/its20.odd 2013/06/01 14:35:49 1.435
@@ -1218,14 +1218,13 @@ mandatory for the <gi>rules</gi> element, where it <ref target="#rfc-keywords" >MUST</ref> be in no namespace.</p> <p>If there is no <gi>rules</gi> element in an XML document, a prefixed ITS - <att>version</att> attribute (e.g. <code>its:version</code>) <ref + <att>version</att> attribute (e.g., <code>its:version</code>) <ref target="#rfc-keywords">MUST</ref> be provided on the element where the ITS markup is used, or on one of its ancestors.</p> <p>If there is no <gi>rules</gi> element and there are elements with standoff ITS markup - in an XML document, an ITS - <att>version</att> attribute <ref - target="#rfc-keywords">MUST</ref> be provided on element with standoff ITS markup or a prefixed ITS - <att>version</att> attribute (e.g. <code>its:version</code>) <ref + in an XML document, an ITS <att>version</att> attribute <ref target="#rfc-keywords" + >MUST</ref> be provided on element with standoff ITS markup or a prefixed ITS + <att>version</att> attribute (e.g., <code>its:version</code>) <ref target="#rfc-keywords">MUST</ref> be provided on one of its ancestors.</p> <p>There <ref target="#rfc-keywords">MUST NOT</ref> be two different versions of ITS in the same document.</p> @@ -1252,11 +1251,11 @@ <p>The two locations are described in detail below.</p> <div xml:id="selection-global"> <head>Global, Rule-based Selection</head> - <p>Global, rule-based selection is implemented using the <gi>rules</gi> element. It - contains zero or more <ref target="#rule-elements">rule elements</ref>.
Each <ref - target="#rule-elements">rule element</ref> has a mandatory <att>selector</att> - attribute. This attribute and all other possible attributes on <ref - target="#rule-elements">rule elements</ref> are in the empty namespace and used + <p>Global, rule-based selection is implemented using the <gi>rules</gi> element. The + <gi>rules</gi> element contains zero or more <ref target="#rule-elements">rule + elements</ref>. Each <ref target="#rule-elements">rule element</ref> has a mandatory + <att>selector</att> attribute. This attribute and all other possible attributes on + <ref target="#rule-elements">rule elements</ref> are in the empty namespace and used without a prefix.</p> <p>If there is more than one <gi>rules</gi> element in an XML document, the rules from each section are to be processed at the same precedence level. The <gi>rules</gi> @@ -1326,7 +1325,7 @@ <div xml:id="queryLanguage"> <head>Choosing Query Language</head> - <p><ref target="#rule-elements">Rule elements</ref> have attributes which contain + <p><ref target="#rule-elements">Rule elements</ref> have attributes that contain absolute and relative selectors. Interpretation of these selectors depends on the actual query language. The query language is set by <att>queryLanguage</att> attribute on <gi>rules</gi> element. If <att>queryLanguge</att> is not specified XPath 1.0 is @@ -1339,7 +1338,7 @@ <div> <head>Absolute selector</head> <p>The absolute selector <ref target="#rfc-keywords">MUST</ref> be an XPath expression - which starts with "<code>/</code>". That is, it must be an <ref + that starts with "<code>/</code>". 
That is, it must be an <ref target="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath"> AbsoluteLocationPath</ref> or union of <ref target="http://www.w3.org/TR/xpath/#NT-AbsoluteLocationPath"> @@ -1365,7 +1364,7 @@ expression to include a call to any other function.</p> </item> <item> - <p>The set of namespace declarations are those in scope on the element which has + <p>The set of namespace declarations are those in scope on the element that has the attribute in which the expression occurs. This includes the implicit declaration of the prefix <code>xml</code> required by the <ref target="#xmlns" >XML Namespaces Recommendation</ref>; the default namespace (as declared by @@ -1405,8 +1404,8 @@ <att>storageEncodingPointer</att>, <att>storageSizePointer</att>, <att>targetPointer</att>, <att>termInfoPointer</att>, <att>termInfoRefPointer</att>.</p> - <p>Context for evaluation of the XPath expression is same as for absolute selector - with the following changes:</p> + <p>Context for evaluation of the XPath expression is the same as for an absolute + selector with the following changes:</p> <list> <item> <p>Nodes selected by the expression in the <att>selector</att> attribute form the @@ -1431,26 +1430,26 @@ sense of <code>Selectors</code> as specified in <ptr target="#css3-selectors" type="bibref"/> to prevent confusion with the generic use of the word "selector". 
See <ref target="#css-selectors">The term CSS Selector</ref>.</p></note> - <note><p xml:id="css-selectors-implementations">The working group will not provide a CSS - Selectors based implementation; nevertheless there are several existing libraries, - which can translate CSS Selectors to XPath, so that XPath selectors based - implementations can be used.</p></note> + <note><p xml:id="css-selectors-implementations">The working group will not provide a CSS Selectors-based + implementation; nevertheless there are several existing libraries, that can + translate CSS Selectors to XPath so that XPath selectors-based implementations can + be used.</p></note> <note><p xml:id="css-selectors-and-attributes">CSS selectors have no ability to point to attributes.</p></note> - <p>CSS Selectors are identified by <code>css</code> value in <att>queryLanguage</att> - attribute.</p> + <p>CSS Selectors are identified by the value <code>css</code> in the + <att>queryLanguage</att> attribute.</p> <div> <head>Absolute selector</head> - <p>Absolute selector <ref target="#rfc-keywords">MUST</ref> be interpreted as selector - as defined in <ptr target="#css3-selectors" type="bibref"/>. Both simple selectors - and groups of selectors can be used.</p> + <p>An absolute selector <ref target="#rfc-keywords">MUST</ref> be interpreted as a + selector as defined in <ptr target="#css3-selectors" type="bibref"/>. Both simple + selectors and groups of selectors can be used.</p> </div> <div> <head>Relative selector</head> - <p>Relative selector <ref target="#rfc-keywords">MUST</ref> be interpreted as selector - as defined in <ptr target="#css3-selectors" type="bibref"/>. Selector is not - evaluated against the complete document tree but only against subtrees rooted at - nodes selected by selector in the <att>selector</att> attribute.</p> + <p>A relative selector <ref target="#rfc-keywords">MUST</ref> be interpreted as a + selector as defined in <ptr target="#css3-selectors" type="bibref"/>. 
A selector is + not evaluated against the complete document tree but only against subtrees rooted at + nodes selected by the selector in the <att>selector</att> attribute.</p> </div> </div> <div> @@ -1459,9 +1458,9 @@ languages. For each additional query language the processor <ref target="#rfc-keywords">MUST</ref> define:</p> <list type="bulleted"> - <item>identifier of query language used in <att>queryLanguage</att>;</item> - <item>rules for evaluating absolute selector to collection of nodes;</item> - <item>rules for evaluating relative selector to collection of nodes.</item> + <item>the identifier of the query language used in <att>queryLanguage</att>;</item> + <item>rules for evaluating an absolute selector to a collection of nodes;</item> + <item>rules for evaluating a relative selector to a collection of nodes.</item> </list> <p>Because future versions of this specification are likely to define additional query languages, the following query language identifiers are reserved: <code>xpath</code>, @@ -1473,15 +1472,16 @@ <p xml:id="parameter-for-rules">A <gi>param</gi> element (or several ones) can be placed as the first child element(s) of the <gi>rules</gi> element to define the default values of variables used in the various selectors used in the rules.</p> - <p>Implementation <ref target="#rfc2119">MUST</ref> support the <gi>param</gi> element - for all query languages it supports and which at the same time define how variables - are bind for evaluation of selector expression. Implementations <ref target="#rfc2119" - >SHOULD</ref> also provide means for changing the default values of the - <gi>param</gi> elements. Such means are implementation-specific.</p> - <p>The <gi>param</gi> element has a required name attribute. The value of the name - attribute is a <ref target="http://www.w3.org/TR/2009/REC-xml-names-20091208/#NT-QName" - >QName</ref>, see <ptr target="#xmlns" type="bibref"/>. 
The content of the element - is a string used as default value for the corresponding variable.</p> + <p>An implementation <ref target="#rfc2119">MUST</ref> support the <gi>param</gi> + element for all query languages it supports and at the same time define how variables + are bound for evaluation of the selector expression. Implementations <ref + target="#rfc2119">SHOULD</ref> also provide means for changing the default values of + the <gi>param</gi> elements. Such means are implementation-specific.</p> + <p>The <gi>param</gi> element has a required <att>name</att> attribute. The value of the + <att>name</att> attribute is a <ref + target="http://www.w3.org/TR/2009/REC-xml-names-20091208/#NT-QName">QName</ref>, see + <ptr target="#xmlns" type="bibref"/>. The content of the element is a string used as + default value for the corresponding variable.</p> <exemplum xml:id="EX-param-in-global-rules-1"> <head>Using the <gi>param</gi> element to define the default value of a variable in a <att>selector</att> attribute.</head> @@ -1505,8 +1505,8 @@ XLink <ptr target="#xlink1" type="bibref"/><att>href</att> attribute in the <gi>rules</gi> element. The referenced document must be a valid XML document containing at most one <gi>rules</gi> element. That <gi>rules</gi> element can be the - root element or anywhere within the document tree (for example, the document could be an - XML Schema).</p> + root element or be located anywhere within the document tree (for example, the document + could be an XML Schema).</p> <p>The rules contained in the referenced document <ref target="#rfc-keywords">MUST</ref> be processed as if they were at the top of the <gi>rules</gi> element with the XLink @@ -1535,10 +1535,10 @@ </exemplum> <exemplum xml:id="EX-link-external-rules-4"> <head>External rules file with the <gi>rules</gi> element as the root element</head> - <p>Like <ptr target="#EX-link-external-rules-1" type="exref"/>, these rules can be - applied e.g. 
to <ptr type="exref" target="#EX-link-external-rules-2"/>. The only - difference is that in <ptr target="#EX-link-external-rules-4" type="exref"/>, the - <gi>rules</gi> element is the root element of the external file.</p> + <p>As with <ptr target="#EX-link-external-rules-1" type="exref"/>, these rules can be + applied to <ptr type="exref" target="#EX-link-external-rules-2"/>. The only difference + is that in <ptr target="#EX-link-external-rules-4" type="exref"/>, the <gi>rules</gi> + element is the root element of the external file.</p> <egXML xmlns="http://www.tei-c.org/ns/Examples" target="examples/xml/EX-link-external-rules-4.xml"/> </exemplum> @@ -1558,9 +1558,9 @@ <p>The following precedence order is defined for selections of ITS information in various positions (the first item in the list has the highest precedence):</p> <list type="ordered"> - <item xml:id="precedence-local">Selection via explicit (that is, not inherited) local - ITS markup in documents (<ref target="#local-attributes">ITS local attributes</ref> on - a specific element)</item> + <item xml:id="precedence-local">Selection via explicit (i.e., not inherited) local ITS + markup in documents (<ref target="#local-attributes">ITS local attributes</ref> on a + specific element)</item> <item xml:id="precedence-global-in-doc"><p>Global selections in documents (using a <gi>rules</gi> element)</p> <p>Inside each <gi>rules</gi> element the precedence order is: <list type="ordered"> @@ -1573,7 +1573,7 @@ </item> <item xml:id="precedence-inheritance">Selection via inherited values. This applies only to element nodes. The inheritance rules are laid out in a dedicated <ref - target="#datacategories-overview">datacategory overview table</ref>, see column + target="#datacategories-overview">datacategory overview table</ref>: see the column <quote>Inheritance for element nodes</quote>. 
Selection via inheritance takes precedence over default values, see below item.</item> <item xml:id="precedence-defaults">Selections via defaults for data categories, see <ptr @@ -1586,17 +1586,17 @@ <p>The precedence order fulfills the same purpose as the built-in template rules of <ptr target="#xslt10" type="bibref"/>. Override semantics are always complete, that is all information provided via lower precedence is overriden by the higher precedence. - E.g. defaults are overridden by inherited values, these are overriden by nodes + E.g. defaults are overridden by inherited values and these are overriden by nodes selected via global rules, which are in turn overridden by local markup.</p> </note> <exemplum xml:id="EX-selection-precedence-1"> - <head>Conflicts between selections of ITS information which are resolved using the - precedence order</head> + <head>Conflicts between selections of ITS information resolved using the precedence + order</head> <p>The two elements <code>title</code> and <code>author</code> of this document should - be treated as separate content when inside a <code>prolog</code> element, but as part - of the content of their parent element otherwise. In order to make this distinction - two <gi>withinTextRule</gi> elements are used:</p> + be treated as separate content when inside a <code>prolog</code> element, but in other + contexts as part of the content of their parent element. In order to make this + distinction two <gi>withinTextRule</gi> elements are used:</p> <p>The first rule specifies that <code>title</code> and <code>author</code> in general should be treated as an element within text. This overrides the default.</p> <p>The second rule indicates that when <code>title</code> or <code>author</code> are @@ -1620,9 +1620,9 @@ </div> <div xml:id="associating-its-with-existing-markup"> <head>Associating ITS Data Categories with Existing Markup</head> - <p>Some markup schemes provide markup which can be used to express ITS data categories. 
- ITS data categories can be associated with such existing markup, using the global - selection mechanism described in <ptr type="specref" target="#selection-global"/>.</p> + <p>Some markup schemes provide markup that can be used to express ITS data categories. ITS + data categories can be associated with such existing markup, using the global selection + mechanism described in <ptr type="specref" target="#selection-global"/>.</p> <p>Associating existing markup with ITS data categories can be done only if the processing expectations of the host markup are the same as, or greater than, those of ITS. For example, the <ptr target="#dita10" type="bibref"/> format can use its translate @@ -1652,8 +1652,8 @@ </list> </item> <item>By associating the rules and the document through a tool-specific mechanism. For - example, for a command-line tool: providing the paths of both the XML document to - process and its corresponding external rules file.</item> + example, in the case of a command-line tool by providing the paths of both the XML + document to process and its corresponding external rules file.</item> </list> </div> @@ -1662,14 +1662,14 @@ <p>This section defines an algorithm to convert XML or HTML documents (or their DOM representations) that contain ITS metadata to the RDF-based format based on <ptr target="#nif-reference" type="bibref"/>. The conversion results in RDF triples.</p> - <note><p>The algorithm is intended to extract the text from the XML/HTML/DOM for an NLP - tool. It can produce a lot of <quote>phantom</quote> predicates from excessive - whitespace, which 1) increases the size of the intermediate mapping and 2) extracts - this whitespace as text, and therefore might decrease NLP performance. It is strongly recommended to + <note><p>The algorithm is intended to extract the text from the XML/HTML/DOM for an NLP tool. 
It can + produce a lot of <quote>phantom</quote> predicates from excessive whitespace, which 1) + increases the size of the intermediate mapping and 2) extracts this whitespace as + text, and therefore might decrease NLP performance. It is strongly recommended to normalize whitespace in the input XML/HTML/DOM in order to minimize such phantom predicates. A normalized example is given below. Since the whitespace normalization - algorithm itself is format dependent, for example, it differs for HTML compared to general - XML, no normative algorithm for whitespace normalization is given as part of + algorithm itself is format dependent (for example, it differs for HTML compared to + general XML), no normative algorithm for whitespace normalization is given as part of this specification.</p></note> <note><p xml:id="its-rdf-ontology-status">The output of the algorithm shown below uses the ITS RDF ontology <ptr target="#its-rdf-ontology" type="bibref"/> and its namespace<?br?><ref target="http://www.w3.org/2005/11/its/rdf#">http://www.w3.org/2005/11/its/rdf#</ref><?br?>This ontology is not a normative part of the ITS 2.0 specification and is being discussed in the <ref target="http://www.w3.org/International/its/wiki/ITS-RDF_mapping">ITS Interest Group</ref>.</p></note> <exemplum xml:id="EX-HTML-whitespace-normalization"> @@ -1759,7 +1759,8 @@ <item><p xml:id="its2nif-algorithm-step5">STEP 5: Create a context URI and attach the whole concatenated text <code>$(t0+t1+t2+...+tn)</code> of the document as reference.</p></item> <item><p xml:id="its2nif-algorithm-step6">STEP 6: Attach any ITS metadata annotations from the XML/HTML/DOM input to the respective NIF URIs.</p></item> - <item><p xml:id="its2nif-algorithm-step7">STEP 7: Omit all URIs that do not carry annotations (they will just bloat the data).</p></item> + <item><p xml:id="its2nif-algorithm-step7">STEP 7: Omit all URIs that do not carry annotations (to avoid + bloating the data).</p></item> </list> <eg 
rend="text"><![CDATA[@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> . @prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> @@ -1786,7 +1787,11 @@ target="#EX-HTML-whitespace-normalization" type="exref"/>, is available at <ref target="examples/nif/EX-nif-conversion-output.xml" >examples/nif/EX-nif-conversion-output.xml</ref>.</p> - <note><p>The conversion to NIF is a possible basis for a natural language processing (NLP) application that creates, for example, named entity annotations. A non-normative algorithm to integrate these annotations into the original input document is given in <ptr target="#nif-backconversion" type="specref"/>. This algorithm is non-normative because many decisions depend on the actually employed NLP application.</p></note> + <note><p>The conversion to NIF is a possible basis for a natural language processing (NLP) application + that creates, for example, named entity annotations. A non-normative algorithm to + integrate these annotations into the original input document is given in <ptr + target="#nif-backconversion" type="specref"/>. This algorithm is non-normative + because many decisions depend on the particular NLP application being used.</p></note> </div> <div xml:id="its-tool-annotation"> <head>ITS Tools Annotation</head> @@ -1794,12 +1799,12 @@ with information about the processor that generated them. For example, the score of the <ref target="#mtconfidence">MT Confidence</ref> data category (provided via the <att>mtConfidence</att> attribute) is meaningful only when the consumer of the - information also knows what MT engine produced it, because the score provides the + information also knows which MT engine produced it, because the score provides the relative confidence of translations from the same MT engine but does not provide a score that can be reliably compared between MT engines. 
The same is true for confidence provided for the <ref target="#textanalysis">Text Analysis</ref> data category, - providing confidence information via the <att>taConfidence</att> attribute, or the - <ref target="#terminology">Terminology</ref> data category, providing confidence + providing confidence information via the <att>taConfidence</att> attribute, or the <ref + target="#terminology">Terminology</ref> data category, providing confidence information via the <att>termConfidence</att> attribute.</p> <p>ITS 2.0 provides a mechanism to associate such processor information with the use of @@ -1837,18 +1842,18 @@ <p>The value of <att>annotatorsRef</att> is a space-separated list of references where each reference is composed of two parts: a data category identifier and an IRI. These - two parts are separated by a character <code>|</code> VERTICAL LINE (U+007C).</p> + two parts are separated by a <code>|</code> VERTICAL LINE (U+007C) character.</p> <list> <item><p>The data category identifier <ref target="#rfc2119">MUST</ref> be one of the identifiers specified in the <ref target="#datacategories-overview">data category overview table</ref>.</p></item> - <item><p>The IRI indicates information about the processor used to generate the data - category annotation. No single means is specified for how this IRI should be used to - indicate processor information. Possible mechanisms are: to encode information - directly in the IRI, e.g. as parameters; to reference an external resource that - provides such information, e.g. an XML file or an RDF declaration; or to reference - another part of the document that provides such information.</p></item> + <item><p>The IRI indicates information about the processor used to generate the data category annotation. + No single means is specified for how this IRI should be used to indicate processor + information. 
Possible mechanisms are: to encode information directly in the IRI, + e.g., as parameters; to reference an external resource that provides such + information, e.g. an XML file or an RDF declaration; or to reference another part of + the document that provides such information.</p></item> </list> <p>In HTML documents, the mechanism is implemented with the <att>its-annotators-ref</att>
Received on Saturday, 1 June 2013 14:35:52 UTC