RE: Comments on ITS 2.0 specification WD from Pablo Nieto Caride on 2013-01-10 (public-multilingualweb-lt-comments@w3.org from January 2013)

From: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
Date: Thu, 10 Jan 2013 11:34:24 +0100
To: "'Felix Sasaki'" <fsasaki@w3.org>, <public-multilingualweb-lt-comments@w3.org>
Message-ID: <00e501cdef1e$0f13b6e0$2d3b24a0$@linguaserve.com>
Hi Felix, all,

 

I went through Chase and Kevin's comments and checked the spec, and I think
the comments are very interesting and worth discussing. I see you raised some
issues; I assume we will discuss them over the next calls, or would you
prefer that I send you my comments by email?

 

Cheers,

Pablo.

>>>>>>>>>>>>>>>>>>>>> 

 

Sending to the public comments list - Chase and Kevin are not subscribed to
this, so the comments didn't reach the list. Here they are. Thank you very
much for the comments, Chase and Kevin. We will discuss these in the group
and come back to you asap.

Best,

Felix

Am 10.01.13 08:39, schrieb Chase Tingley:

Hi,

 

Enclosed are our comments and questions concerning the ITS 2.0 working draft
dated December 6, 2012 (http://www.w3.org/TR/2012/WD-its20-20121206/).
Please feel free to contact us for clarifications if anything is unclear.

 

Section 5.4

Concerning recursive nesting of external rules, this statement could be
clearer:  

The linking mechanism is recursive, the deepest rules being overridden by
the top-most rules, if any.

 

We assume that this means that if rules file A includes rules file B, A is
"top-most" and its rules take precedence.  However, the terms "deepest" and
"top-most" seem prone to misinterpretation.
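
To make the assumed reading concrete, here is a minimal Python sketch. It is not taken from the specification: the `RULES_FILES` structure, the `link` key and the function names are hypothetical stand-ins, and it assumes that linked rules are processed before the rules of the file that links to them, with later rules overriding earlier ones. It does not guard against circular links.

```python
# Hypothetical in-memory stand-in for two rules files. "link" plays the role
# of the external rules reference from file A to file B.
RULES_FILES = {
    "A": {"link": "B", "rules": [("//bar", "yes")]},
    "B": {"link": None, "rules": [("//bar", "no"), ("//baz", "no")]},
}


def flatten(rules_files, name):
    """Return (origin, selector, value) tuples in assumed processing order."""
    entry = rules_files[name]
    ordered = []
    # Assumption: rules from the linked file are processed first, so the
    # linking ("top-most") file's own rules come later in the sequence.
    if entry["link"] is not None:
        ordered.extend(flatten(rules_files, entry["link"]))
    ordered.extend((name, selector, value) for selector, value in entry["rules"])
    return ordered


def effective(rules_files, top):
    """For each selector, keep the last rule in processing order."""
    winners = {}
    for origin, selector, value in flatten(rules_files, top):
        winners[selector] = (origin, value)
    return winners


RESULT = effective(RULES_FILES, "A")
print(RESULT)
```

Under these assumptions the rule for `//bar` from A wins over the one from B, while the rule for `//baz` from B still applies because A does not override it.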

 

Section 5.5

The defined order of precedence includes (from highest to lowest priority):

  * non-inherited local markup

  * global selections in document via a rules element

  * data category defaults

This list seems to be missing inherited local markup.  Thus, the following
structure is ambiguous:

  <xml>

    <its:rules>

      <its:translateRule selector="//bar" translate="no" />

    </its:rules>

    <foo its:translate="yes">

      <bar>Is this translatable?</bar>

    </foo>

  </xml>

 

The <bar> element inherits a "yes" value for its:translate from the local
markup on its parent <foo>, but is also subject to a "no" value via the
global rule.  Which takes precedence?  As implementors, our instinct is that
the inherited local markup ("yes") has precedence, and the text is
translatable.  However, this does not seem clear from the specification.
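
The two possible readings can be compared with a small Python sketch. It is an illustration only, not an ITS processor: `resolve_translate` and its `inherited_local_wins` flag are hypothetical, the selector is evaluated with ElementTree's limited XPath subset (by prefixing it with "."), and the ITS namespace declaration is added so the sample parses.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

DOC = f"""
<xml xmlns:its="{ITS_NS}">
  <its:rules version="2.0">
    <its:translateRule selector="//bar" translate="no"/>
  </its:rules>
  <foo its:translate="yes">
    <bar>Is this translatable?</bar>
  </foo>
</xml>
"""


def ancestors(root, target):
    """Yield the ancestors of target, nearest first."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = parent_of.get(target)
    while node is not None:
        yield node
        node = parent_of.get(node)


def resolve_translate(root, target, inherited_local_wins):
    attr = f"{{{ITS_NS}}}translate"
    # Explicit local markup on the element itself wins under both readings.
    explicit = target.get(attr)
    if explicit is not None:
        return explicit
    # Nearest ancestor carrying local its:translate markup, if any.
    inherited = next(
        (a.get(attr) for a in ancestors(root, target) if a.get(attr) is not None),
        None,
    )
    # Last matching global rule, if any.
    global_value = None
    for rule in root.iter(f"{{{ITS_NS}}}translateRule"):
        if target in root.findall("." + rule.get("selector")):
            global_value = rule.get("translate")
    if inherited_local_wins:
        candidates = [inherited, global_value]
    else:
        candidates = [global_value, inherited]
    for value in candidates:
        if value is not None:
            return value
    return "yes"  # data category default for translate


root = ET.fromstring(DOC)
bar = root.find(".//bar")
print(resolve_translate(root, bar, inherited_local_wins=True))
print(resolve_translate(root, bar, inherited_local_wins=False))
```

The first call gives "yes" and the second gives "no", so two implementations that differ only in this one choice would disagree on whether the text is translatable.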

 

Section 5.8 (annotatorsRef)

We have several questions concerning the correct implementation of this
attribute.

 

i) The list of possible types of tool information to be present includes

  2. information about tools that do 1), but also create ITS annotations

 

Since a subsequent note states that case 1) should be handled by the
provenance data category, is it correct to assume that in case 2), both a
provenance record (for text content that was created or modified) and the
annotatorsRef (for ITS annotations that were created or modified) should be
used?

 

ii) Should annotatorsRef be updated when new provenance records are created?

 

iii) Can a single annotatorsRef attribute value contain multiple entries for
a single data category?  For example, if multiple automated quality tools
(with IRIs "FOO" and "BAR") process a single file, could the annotatorsRef
value be encoded like this?

  <doc its:annotatorsRef="lq-issue|FOO lq-issue|BAR">
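
The practical consequence for a parser can be sketched in Python. This is an assumption-laden illustration, not the spec's processing model: it takes the value to be a space-separated list of `category|IRI` pairs, as in the example above, and the function name is hypothetical.

```python
def parse_annotators_ref(value):
    """Split an annotatorsRef value into (data category, IRI) pairs."""
    entries = []
    for token in value.split():
        category, separator, iri = token.partition("|")
        if not separator or not category or not iri:
            raise ValueError(f"malformed annotatorsRef entry: {token!r}")
        entries.append((category, iri))
    return entries


ENTRIES = parse_annotators_ref("lq-issue|FOO lq-issue|BAR")

# An implementation that keeps one annotator per data category (a plain
# mapping) silently drops FOO; keeping the list of pairs preserves both.
AS_MAPPING = dict(ENTRIES)

print(ENTRIES)
print(AS_MAPPING)
```

If multiple entries per data category are allowed, implementations need the list form; if they are not, the specification should say so, since a mapping-based implementation would quietly keep only the last entry.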

 

Section 8.12 (Provenance Data Category)

We also have several questions concerning the correct use of provenance.

 

i) Can an element have both local provenance data (either inline or via
local standoff markup) and also reference global provenance data (declared
via global standoff markup) using the attribute specified globally via
provenanceRecordsRefPointer?  The draft does not specify.

 

ii) Similarly, does the ordering of provenance records within a
<provenanceRecords> element make a statement about the (temporal) order in
which the records were created?  If an ordering is implied, it raises
questions about the implied ordering in a document where provenance records
are declared both globally and via local markup.

 

iii) More generally, we observe that provenance records lack a date/time
attribute, which makes their semantics as a form of history somewhat muddy.
In practice, a single tool/agent may edit a single document multiple times
in succession over an arbitrary period of time.  Should these multiple
"sessions" be represented by a single logical provenance record?  Or is it
the intention of the spec that the agent add a provenance record for each of
these sessions in which a modification is made to the document?

 

iv) We would also note the complexity of implementing this data category
correctly.  For example, consider an example based on Example 63.  In this
example, an XML document contains two pieces of text, each of which has been
affected by a previous tool.  A single provenance record is encoded using
global standoff notation:

 
<text xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:its="http://www.w3.org/2005/11/its"
  its:version="2.0">
  <dc:creator>John Doe</dc:creator>
  <its:provenanceRecords xml:id="pr1">
    <its:provenanceRecord
      toolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
      org="acme-CAT-v2.3"
      revToolRef="http://www.mycat.com/v1.0/download"
      revOrg="acme-CAT-v2.3"
      provRef="http://www.examplelsp.com/excontent987/production/prov/e6354"/>
  </its:provenanceRecords>
  <its:rules version="2.0">
    <its:provRule selector="//*[@ref]" provenanceRecordsRefPointer="@ref"/>
  </its:rules>
  <title>Translation Revision Provenance Agent: Global Test in XML</title>
  <body>
    <par ref="#pr1"> This paragraph was translated from the machine.</par>
    <legalnotice ref="#pr1">This text was also translated from the machine.</legalnotice>
  </body>
</text>

Now, a second agent modifies the file, affecting only the <legalnotice>
content.  In this case, the shared provenance record must be forked into a
duplicate record to which the second agent can be added:

 
<text xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:its="http://www.w3.org/2005/11/its"
  its:version="2.0">
  <dc:creator>John Doe</dc:creator>
  <its:provenanceRecords xml:id="pr1">
    <its:provenanceRecord
      toolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
      org="acme-CAT-v2.3"
      revToolRef="http://www.mycat.com/v1.0/download"
      revOrg="acme-CAT-v2.3"
      provRef="http://www.examplelsp.com/excontent987/production/prov/e6354"/>
  </its:provenanceRecords>
  <its:provenanceRecords xml:id="pr2">
    <its:provenanceRecord
      toolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
      org="acme-CAT-v2.3"
      revToolRef="http://www.mycat.com/v1.0/download"
      revOrg="acme-CAT-v2.3"
      provRef="http://www.examplelsp.com/excontent987/production/prov/e6354"/>
    <its:provenanceRecord
      revPerson="John Smith"
      revOrgRef="http://john-smith.qa.example.com"/>
  </its:provenanceRecords>
  <its:rules version="2.0">
    <its:provRule selector="//*[@ref]" provenanceRecordsRefPointer="@ref"/>
  </its:rules>
  <title>Translation Revision Provenance Agent: Global Test in XML</title>
  <body>
    <par ref="#pr1"> This paragraph was translated from the machine.</par>
    <legalnotice ref="#pr2">This text was translated by machine and then post-edited.</legalnotice>
  </body>
</text>

In this case, the tool would have the option of leaving the shared global
record and then using local standoff markup to encode the second record
(assuming that this combination of global & local records is permissible --
see above).  However, there are other cases in which the agent would need to
perform complex markup manipulations, such as a scenario in which local
inline markup (encoding a single provenance record) must be replaced with
local standoff markup that contains multiple records.

 

This complexity may present a barrier to consistent implementation.  It may
be worth examining whether it's possible for a newly-created provenance
record to reference previously existing provenance records (forming a
"chain") in order to minimize the amount of markup that would need to be
rewritten by compliant implementations.

 

Thanks,

Chase Tingley & Kevin Lew

Spartan Software
Received on Thursday, 10 January 2013 10:34:54 UTC