Re: [ISSUE-22] Provenance and Agents

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Thu, 25 Oct 2012 23:03:56 +0100
Message-ID: <5089B74C.9070303@cs.tcd.ie>
To: public-multilingualweb-lt@w3.org
Hi Yves,
I don't immediately see why not. 'Items' is nice and neutral.

But would it not be easier to retain the item element names from the
different data categories? e.g.

<its:items id="1">
      <its:locQualityIssue locQualityIssueType="misspelling"
             locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
             locQualityIssueSeverity="50"/>
      <its:translationProvenanceRecord
             transToolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
             transOrg="acme-CAT-v2.3"
             transRevToolRef="http://www.mycat.com/v1.0/download"
             transRevOrg="acme-CAT-v2.3"
             provRef="http://www.examplelsp.com/excontent987/productio/prov/e6354"/>
</its:items>
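
To illustrate how a consumer might read such a mixed container, here is a minimal Python sketch. It assumes the proposed its:items wrapper and the ITS namespace http://www.w3.org/2005/11/its; the group_items helper and the trimmed sample are hypothetical, and nothing here is fixed by the draft.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical stand-off container; element names follow the proposal above.
STANDOFF = f"""
<its:items xmlns:its="{ITS_NS}" id="1">
  <its:locQualityIssue locQualityIssueType="misspelling"
                       locQualityIssueSeverity="50"/>
  <its:translationProvenanceRecord transOrg="acme-CAT-v2.3"/>
</its:items>
"""

def group_items(xml_text):
    """Group the children of a shared container by their local element name."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for child in root:
        # ElementTree reports namespaced tags as "{uri}local".
        local_name = child.tag.split("}", 1)[-1]
        groups[local_name].append(dict(child.attrib))
    return dict(groups)

groups = group_items(STANDOFF)
for name, records in groups.items():
    print(name, records)
```

Because each child keeps its data category's element name, the consumer can dispatch on the name alone and needs no extra type attribute on a generic item element.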

Two other quick questions about standoff markup:
1) In Localization Quality Issue, where the locQualityIssue element is
defined under the GLOBAL definition, there is an editor's note saying
"[Ed. note: Should locQualityIssues also be defined for global rules? It
seems not to be specific to local.]"

The answer is surely yes, because
a) |locQualityIssues| is also referred to from the global definition, and
b) the need to support more than one set of the same data
category attributes for the same node applies regardless of how it is
selected.

Would the best solution therefore be to define it before both the GLOBAL
and LOCAL definitions?

Also, if we define a shared item element, then that will need to be 
specified outside of any specific data category.

Also, in a related comment in provenance, just before example 61:
"[Ed. note: Not sure if we need the standoff version globally. We don't
have it with quality either. Thoughts?]"

The answer is yes, we do want standoff globally, for the same reasons as
for |locQualityIssues|.

2) In both Localization Quality Issue and Provenance we allow
locQualityIssuesRef and translationProvenanceRecordsRef
to be used alongside the 'regular' attributes, e.g.
locQualityIssueType/Comment/Severity/ProfileRef/Enabled.

So if the selection (global or local) is annotated by n sets of these
attributes, it can be done either by having one set in the global
rule or local element and n-1 in the standoff, or by having none in the
global rule or local element and n in the standoff. There is no
significance to which set goes where, so is there any advantage to
supporting both these options?
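
The equivalence of the two layouts can be sketched as follows, in Python. The sample documents, the issue_sets helper and the restriction to same-document "#id" references are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
INLINE = ("locQualityIssueType", "locQualityIssueComment",
          "locQualityIssueSeverity")

def issue_sets(root, node):
    """Collect every issue attribute set for node: inline first, then stand-off."""
    sets = []
    inline = {name: node.get(ITS + name)
              for name in INLINE if node.get(ITS + name) is not None}
    if inline:
        sets.append(inline)
    ref = node.get(ITS + "locQualityIssuesRef")
    if ref is not None:
        target = ref.lstrip("#")  # assumption: same-document fragment reference
        for container in root.iter(ITS + "locQualityIssues"):
            if container.get(XML_ID) == target:
                sets.extend(dict(issue.attrib) for issue in container)
    return sets

HEAD = '<doc xmlns:its="http://www.w3.org/2005/11/its">'

# Layout 1: one set on the element itself, n-1 sets in the stand-off list.
MIXED = HEAD + """
  <its:locQualityIssues xml:id="lq1">
    <its:locQualityIssue locQualityIssueType="grammar"/>
  </its:locQualityIssues>
  <p its:locQualityIssueType="misspelling" its:locQualityIssuesRef="#lq1">text</p>
</doc>"""

# Layout 2: all n sets in the stand-off list.
STANDOFF_ONLY = HEAD + """
  <its:locQualityIssues xml:id="lq1">
    <its:locQualityIssue locQualityIssueType="misspelling"/>
    <its:locQualityIssue locQualityIssueType="grammar"/>
  </its:locQualityIssues>
  <p its:locQualityIssuesRef="#lq1">text</p>
</doc>"""

def sets_for(xml_text):
    root = ET.fromstring(xml_text)
    return issue_sets(root, root.find("p"))

mixed = sets_for(MIXED)
standoff_only = sets_for(STANDOFF_ONLY)
print(mixed == standoff_only)
```

Both layouts yield the same two attribute sets, but the reader has to carry the extra inline branch only to support the first one.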

Would it make the parsing easier just to make these standoff refs
mutually exclusive with the use of the annotation in the global rule or
local element, so that only one option needs to be handled for multiple
attribute sets?

E.g. the global rule wording would be:

  * A required |selector| attribute. It contains an absolute selector
    <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
    which selects the nodes to which this rule applies.

*_Exactly one of the following:_*

  * *_Exactly one of the following:_*

      o A |locQualityIssuesRef| attribute. Its value is a URI pointing to
        the |locQualityIssues| element containing the list of issues
        related to this content.

      o A |locQualityIssuesRefPointer| attribute that contains a relative
        selector
        <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
        pointing to a node with the exact same semantics as
        |locQualityIssuesRef|.

  * *_The following:_*

      * At least one of the following:

          o Exactly one of the following:

              + A |locQualityIssueType| attribute that implements the type
                information
                <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#lqissueDefs>.

              + A |locQualityIssueTypePointer| attribute that contains a
                relative selector
                <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
                pointing to a node with the exact same semantics as
                |locQualityIssueType|.

          o Exactly one of the following:

              + A |locQualityIssueComment| attribute that implements the
                comment information
                <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#lqissueDefs>.

              + A |locQualityIssueCommentPointer| attribute that contains a
                relative selector
                <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
                pointing to a node with the exact same semantics as
                |locQualityIssueComment|.

      * None or exactly one of the following:

          o A |locQualityIssueSeverity| attribute that implements the
            severity information
            <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#lqissueDefs>.

          o A |locQualityIssueSeverityPointer| attribute that contains a
            relative selector
            <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
            pointing to a node with the exact same semantics as
            |locQualityIssueSeverity|.

      * None or exactly one of the following:

          o A |locQualityIssueProfileRef| attribute that implements the
            profile reference information
            <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#lqissueDefs>.

          o A |locQualityIssueProfileRefPointer| attribute that contains a
            relative selector
            <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
            pointing to a node with the exact same semantics as
            |locQualityIssueProfileRef|.

      * None or exactly one of the following:

          o A |locQualityIssueEnabled| attribute that implements the
            enabled information
            <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#lqissueDefs>.

          o A |locQualityIssueEnabledPointer| attribute that contains a
            relative selector
            <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selectors>
            pointing to a node with the exact same semantics as
            |locQualityIssueEnabled|.
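
A check along the lines of this wording could look like the Python sketch below. It works on a plain dict of attribute names taken from a rule element; check_rule and its error strings are hypothetical, and it only tests which attributes are present, not whether their values are valid.

```python
# Attribute pairs from the wording above: each base name may instead appear
# in its "...Pointer" form, but never in both forms at once.
PAIRED_REQUIRED = ("locQualityIssueType", "locQualityIssueComment")
PAIRED_OPTIONAL = ("locQualityIssueSeverity", "locQualityIssueProfileRef",
                   "locQualityIssueEnabled")
REF_NAMES = ("locQualityIssuesRef", "locQualityIssuesRefPointer")

def present(attrs, base):
    """Return which forms (plain or Pointer) of an attribute are present."""
    return [name for name in (base, base + "Pointer") if name in attrs]

def check_rule(attrs):
    """Return a list of violations of the proposed mutually exclusive wording."""
    errors = []
    if "selector" not in attrs:
        errors.append("selector is required")
    refs = [name for name in REF_NAMES if name in attrs]
    inline = [name for base in PAIRED_REQUIRED + PAIRED_OPTIONAL
              for name in present(attrs, base)]
    if refs and inline:
        errors.append("stand-off reference and inline attributes "
                      "are mutually exclusive")
    elif refs:
        if len(refs) != 1:
            errors.append("use only one of " + " / ".join(REF_NAMES))
    elif inline:
        if not any(present(attrs, base) for base in PAIRED_REQUIRED):
            errors.append("need at least a type or a comment")
        for base in PAIRED_REQUIRED + PAIRED_OPTIONAL:
            if len(present(attrs, base)) > 1:
                errors.append("use only one of %s / %sPointer" % (base, base))
    else:
        errors.append("need either a stand-off reference or inline attributes")
    return errors

ok_standoff = check_rule({"selector": "//p", "locQualityIssuesRef": "#lq1"})
ok_inline = check_rule({"selector": "//p",
                        "locQualityIssueType": "misspelling",
                        "locQualityIssueSeverity": "50"})
mixed = check_rule({"selector": "//p",
                    "locQualityIssuesRef": "#lq1",
                    "locQualityIssueType": "misspelling"})
print(ok_standoff, ok_inline, mixed)
```

With the exclusive wording, a rule is handled by exactly one branch, so a processor never has to merge an inline set with a stand-off list.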

cheers,
Dave

On 23/10/2012 20:45, Yves Savourel wrote:
>
> I have only one comment:
>
> When using the Translation Agent provenance stand-off notation, could 
> we possibly use the same item-container and item elements for other 
> data categories? That is, re-use common elements for Translation Agent 
> Provenance and Localization Quality Issue, for example.
>
> <its:items xml:id="1">
>
> <its:item itsXYZ…/>
>
> </its:items>
>
> The name could be records/record, or items/item, etc. it doesn’t 
> matter. But we would re-use it in all stand-off cases.
>
> -yves
>
> *From:*Dave Lewis [mailto:dave.lewis@cs.tcd.ie]
> *Sent:* Tuesday, October 23, 2012 12:00 PM
> *To:* public-multilingualweb-lt@w3.org
> *Subject:* Re: [ISSUE-22] Provenance and Agents
>
> Hi Felix,
> In general this integration is a good move. The idea of using the
> standoff list pattern from quality issue works well, and solves some
> of the issues that required separate translation and
> translationRevision data categories - so we may be able to consolidate
> the spec further now.
>
> Dom is going through this in detail currently, and we will get back 
> with some specific comments shortly.
>
> We really appreciate you putting this together,
> Dave
>
> On 23/10/2012 18:27, Felix Sasaki wrote:
>
>     Hi all,
>
>     this may have been lost during conference / travel etc. Any
>     thoughts on this? Also for the implementors: is everybody fine
>     with implementing this single "translation provenance" data category?
>
>
>     Thanks,
>
>     Felix
>
>     2012/10/18 Felix Sasaki <fsasaki@w3.org>
>
>     Hi Dave, Yves, all,
>
>     Dave, Yves and I had a discussion at the FEISGILLT event about
>     provenance, and I updated the section at
>
>     http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#translation-agent-provenance
>
>     with the idea that this data category should cover all three types
>     of provenance: translation, revision, RDF-based standoff. The
>     mechanism is copied from quality issue.
>
>     Comments welcome,
>
>     Felix
>
>     2012/10/15 Yves Savourel <ysavourel@enlaso.com>
>
>     Hi Felix, Dave, all,
>
>     Felix: I think there is a difference in the way you use
>     transProvRef and the way locQualityIssuesRef is currently defined.
>     You use a list of URIs for transProvRef while locQualityIssuesRef
>     defines a single URI that points to a set of issues.
>
>     To have both data categories be similar, you would have to have
>     transProvRef point to a translationProvenanceRecords element with
>     one or more records. So in your example, two
>     translationProvenanceRecords elements (one for each of the
>     transProvRef values).
>
>     But I agree that a similar stand-off structure could be used for both.
>
>     Cheers,
>
>     -yves
>
>     *From:* Felix Sasaki [mailto:fsasaki@w3.org]
>     *Sent:* Sunday, October 14, 2012 11:22 AM
>     *To:* Dave Lewis
>     *Cc:* public-multilingualweb-lt@w3.org
>     *Subject:* Re: [ISSUE-22] Provenance and Agents
>
>     Hi Dave, all,
>
>     I added the translation provenance agent to
>
>     http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#translation-agent-provenance
>
>     with a big warning that this is in an early stage. I changed a few
>     things from your draft:
>
>     - XPath expressions in pointer attributes in the example: these
>     were quite general; e.g. //dc:creator selects all "dc:creator"
>     elements in the document. Especially given the discussion we just
>     had here
>
>     http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0179.html
>
>     this seems to be too general.
>
>     - XPath expression in the selector, e.g.
>     selector="/html/body/legalnotice" is now
>     selector="/text/body/legalnotice".
>
>     I changed "/html/body/par" to "/text/body/par[1]", so that here
>     only the first "par" element is selected. I realized here again
>     that we haven't resolved the "too many global rules" issue. Dave,
>     can you take up this thread
>
>     http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0093.html
>
>     because, depending on the outcome, both provenance and many other
>     data categories might change a lot?
>
>     - I removed local XPath expressions, e.g. transToolPointer or
>     transToolRefPointer attributes. We don't have local XPath - that
>     has been discussed several times. If needed I can dig up the
>     threads again, but it would save a lot of time if we could just
>     agree on this.
>
>     - I changed the local example. What you tried in the local example
>     was a combination of global and local provenance information. But
>     that doesn't work: we said now several times that overriding is
>     always complete. So you cannot "through a local selection
>     overriding part of the global rule.". You will override the
>     complete rule. It doesn't matter whether the local attributes are
>     in HTML5 or in XML, that doesn't change overriding.
>
>     In general I'm quite frustrated about the data category. The issue
>     is not the pieces of information themselves; what you specify
>     (person, organization, tools) makes a lot of sense. The issue is
>     that obviously the specification is not implementation driven, as
>     can be seen by the untested XPath expressions and the overriding
>     that wouldn't work, even with a conformance-only processor.
>
>     The other frustration comes from the speed and continuation of
>     progress: to wrap this up we need a continuous discussion. So my
>     main question is: will you and Phil have time to engage in this by
>     the end of November, that is within the last call period? Or: can
>     we engage somebody else interested in implementing this?
>
>     Now, about the data category in general ...
>
>     I think what you are trying to achieve is:
>
>     conveying several pieces of provenance information for agents:
>
>     initial revision = translation agent provenance;
>
>     subsequent revision = translation revision agent provenance;
>
>     complex revision information: standoff provenance.
>
>     We may have a similar picture to that of quality issue: the
>     complexity of this information might be better dealt with by a
>     standoff approach. I am not talking about the standoff approach in
>     your example, Dave, but something like this:
>
>     [
>
>     <text xmlns:dc="http://purl.org/dc/elements/1.1/"
>           xmlns:its="http://www.w3.org/2005/11/its"
>           its:version="2.0">
>         <head>
>             <dc:creator>John Doe</dc:creator>
>             <title>Translation Revision Provenance Agent: Global Test in XML</title>
>             <its:translationProvenanceRecords>
>                 <its:translationProvenanceRecord xml:id="tp1"
>                     transToolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
>                     transOrg="acme-CAT-v2.3"/>
>                 <its:translationProvenanceRecord xml:id="tp2"
>                     transPerson="John Doe"
>                     transOrgRef="http://www.legaltrans-ex.com/"/>
>                 <its:translationProvenanceRecord xml:id="tp3"
>                     transPerson="Carl Meyer"
>                     transOrgRef="http://www.mytranslations.example.com/"/>
>                 <its:translationProvenanceRecord xml:id="tp4"
>                     provRef="http://www.examplemtservice.com/prov/e76547"/>
>             </its:translationProvenanceRecords>
>         </head>
>         <body>
>             <par its:transProvRef="#tp1">This paragraph was translated from the machine.</par>
>             <legalnotice postediting-by="http://www.vistatec.com/"
>                 its:transProvRef="#tp2 #tp3 #tp4">This text was translated directly by a person.</legalnotice>
>         </body>
>     </text>
>
>     ]
>
>     The interaction between "its:translationProvenanceRecords" and the
>     local its:transProvRef attribute is identical to that between
>     "its:locQualityIssues" and the "its:locQualityIssuesRef" attribute.
>
>     In its:translationProvenanceRecords you have a list of
>     "its:translationProvenanceRecord" elements. Each element has an
>     "xml:id" attribute. We could say that the order of the
>     "its:translationProvenanceRecord" elements specifies whether this is
>     translation agent provenance or revision agent provenance
>     information. Or we could say that this is specified by the order
>     of the values in "its:transProvRef". "Your" standoff data category
>     could be accommodated by
>     <its:translationProvenanceRecord xml:id="tp4" provRef="http://www.examplemtservice.com/prov/e76547"/>.
>
>     You seem to have the use case of attaching several pieces of
>     provenance information to the same node. With the ITS overriding
>     that is not possible. But with the above approach tools can still
>     do that, locally:
>
>     - first tool creates
>
>     <legalnotice postediting-by="http://www.vistatec.com/"
>         its:transProvRef="#tp2">This text was translated directly by a person.</legalnotice>
>
>     - second tool creates
>
>     <legalnotice postediting-by="http://www.vistatec.com/"
>         its:transProvRef="#tp2 #tp3">This text was translated directly by a person.</legalnotice>
>
>     - third tool creates
>
>     <legalnotice postediting-by="http://www.vistatec.com/"
>         its:transProvRef="#tp2 #tp3 #tp4">This text was translated directly by a person.</legalnotice>
>
>     This all works without global "adding" rules (but keeping the
>     pointer attributes in global rules). We just need guidance for the
>     tool developers how to attach such complex pieces of information.
>
>     Also, for the simple local case we could still have
>
>     <legalnotice postediting-by="http://www.vistatec.com/"
>         its:transPerson="John Doe"
>         its:transOrgRef="http://www.legaltrans-ex.com/"
>         its:provRef="http://www.examplemtservice.com/prov/e76547">This text was translated directly by a person.</legalnotice>
>
>     But I would say that you either have local markup or the external
>     record, not both.
>
>     So in summary, the above proposal would mean:
>
>     - have only one provenance data category
>
>     - meet the need to specify initial translation provenance,
>     revision and standoff provenance at the same time by having
>     standoff elements like those of quality issue
>
>     - meet the need to provide several pieces of information via
>     several references to provenance records,
>     e.g. its:transProvRef="#tp2 #tp3"
>
>     - have global rules only for pointing, see the other thread.
>
>     Best,
>
>     Felix
>
>     2012/10/12 Dave Lewis <dave.lewis@cs.tcd.ie>
>
>     Hi All,
>     Please find attached updates to the provenance related data
>     categories ready to be included in the draft. Many thanks to Phil
>     for reviewing these in detail.
>
>     There are three separate data categories:
>     - Translation Agent Provenance: which records machines, people and
>     organisations responsible for translating the selected text
>
>     - Translation Revision Agent Provenance: which records machines,
>     people and organisations responsible for revising the translation
>     of the selected text (e.g. from postediting or linguistic review)
>
>     - Standoff Provenance: which provides a link to a standoff
>     provenance record using the W3C PROV standard.
>
>     Comments welcome.
>
>     Regards,
>     Dave
>
>     -- 
>     Felix Sasaki
>
>     DFKI / W3C Fellow
>
Received on Thursday, 25 October 2012 22:04:34 UTC
