Re: [ISSUE-22] Provenance and Agents

From: Felix Sasaki <fsasaki@w3.org>
Date: Tue, 23 Oct 2012 19:27:17 +0200
Message-ID: <CAL58czoYF2J0JJP0BRMTo7bCrfu9pXZE6vkVfTGiZB8kcMkJMQ@mail.gmail.com>
To: public-multilingualweb-lt@w3.org

Hi all,

this may have been lost during conference / travel etc. Any thoughts on
this? Also for the implementors: is everybody fine with implementing this
single "translation provenance" data category?

Thanks,

Felix

2012/10/18 Felix Sasaki <fsasaki@w3.org>

> Hi Dave, Yves, all,
>
> Dave, Yves and I had a discussion at the FEISGILLT event about provenance,
> and I updated the section at
>
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#translation-agent-provenance
> with the idea that this data category should cover all three types of
> provenance: translation, revision, and RDF-based standoff. The mechanism is
> copied from the localization quality issue data category.
>
> Comments welcome,
>
> Felix
>
>
> 2012/10/15 Yves Savourel <ysavourel@enlaso.com>
>
>> Hi Felix, Dave, all,
>>
>> Felix: I think there is a difference between the way you use transProvRef
>> and the way locQualityIssuesRef is currently defined. You use a list of
>> URIs for transProvRef, while locQualityIssuesRef defines a single URI that
>> points to a set of issues.
>>
>> To make both data categories similar, transProvRef would have to point to
>> a translationProvenanceRecords element containing one or more records. So
>> your example would need two translationProvenanceRecords elements (one
>> for each transProvRef attribute).
>>
>> But I agree that a similar stand-off structure could be used for both.
>>
>> Cheers,
>> -yves
>>
>> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
>> *Sent:* Sunday, October 14, 2012 11:22 AM
>> *To:* Dave Lewis
>> *Cc:* public-multilingualweb-lt@w3.org
>> *Subject:* Re: [ISSUE-22] Provenance and Agents
>>
>> Hi Dave, all,
>>
>> I added the translation agent provenance data category to
>>
>> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#translation-agent-provenance
>>
>> with a big warning that this is at an early stage. I changed a few things
>> from your draft:
>>
>> - XPath expressions in pointer attributes in the example: these were
>> quite general; e.g. //dc:creator selects all "dc:creator" elements in the
>> document. Especially given the discussion we are having right now at
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0179.html
>>
>> this seems to be too general.
>>
>> - XPath expressions in the selector, e.g. selector="/html/body/legalnotice"
>> became selector="/text/body/legalnotice". I also changed "/html/body/par"
>> to "/text/body/par[1]", so that only the first "par" element is selected.
>> I realized here again that we haven't resolved the "too many global rules"
>> issue. Dave, can you take up this thread?
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0093.html
>>
>> Depending on the outcome, both provenance and many other data categories
>> might change a lot.
>>
>> - I removed local XPath expressions, e.g. the transToolPointer and
>> transToolRefPointer attributes. We don't have local XPath; that has been
>> discussed several times. If needed I can dig up the threads again, but it
>> would save a lot of time if we could just agree on this.
>>
>> - I changed the local example. What you tried in the local example was a
>> combination of global and local provenance information. But that doesn't
>> work: we have now said several times that overriding is always complete. So
>> you cannot do this "through a local selection overriding part of the global
>> rule". You will override the complete rule. It doesn't matter whether the
>> local attributes are in HTML5 or in XML; that doesn't change overriding.
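The complete-override behaviour described in the last point can be illustrated with a minimal Python sketch. The function name and the dictionary layout are hypothetical and not part of ITS 2.0; the sketch only shows that local markup replaces the globally selected information as a whole instead of being merged with it.

```python
from typing import Optional


def effective_provenance(global_info: Optional[dict],
                         local_info: Optional[dict]) -> Optional[dict]:
    """Return the provenance information that applies to a node.

    Hypothetical helper: when local markup is present it replaces the
    information coming from a global rule completely; nothing is merged.
    """
    if local_info is not None:
        return dict(local_info)
    return dict(global_info) if global_info is not None else None


# Values taken from the examples in this thread.
global_rule = {
    "transPerson": "John Doe",
    "transOrgRef": "http://www.legaltrans-ex.com/",
}
local_markup = {"transToolRef": "http://www.onlinemtex.com/2012/7/25/wsdl/"}

result = effective_provenance(global_rule, local_markup)
```

Here `result` contains only the local `transToolRef`; the person and organization from the global rule do not survive, which is why mixing global and local provenance on one node does not work.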
>>
>> In general I'm quite frustrated about this data category. The issue is not
>> the pieces of information themselves; what you specify (person,
>> organization, tools) makes a lot of sense. The issue is that the
>> specification is obviously not implementation driven, as can be seen from
>> the untested XPath expressions and the overriding that wouldn't work, even
>> with a conformance-only processor.
>>
>> The other frustration comes from the speed and continuity of progress:
>> to wrap this up we need a continuous discussion. So my main question is:
>> will you and Phil have time to engage in this by the end of November, that
>> is, within the last call period? Or: can we engage somebody else interested
>> in implementing this?
>>
>> Now, about the data category in general ...
>>
>> I think what you are trying to achieve is conveying several pieces of
>> provenance information for agents:
>>
>> - initial revision = translation agent provenance;
>> - subsequent revision = translation revision agent provenance;
>> - complex revision information = standoff provenance.
>>
>> We may have a similar picture as with quality issue: the complexity of
>> this information might be better dealt with by a standoff approach. I am
>> not talking about the standoff approach in your example, Dave, but
>> something like this:
>>
>> [
>>
>> <text xmlns:dc="http://purl.org/dc/elements/1.1/"
>>     xmlns:its="http://www.w3.org/2005/11/its"
>>     its:version="2.0">
>>     <head>
>>         <dc:creator>John Doe</dc:creator>
>>         <title>Translation Revision Provenance Agent: Global Test in XML</title>
>>         <its:translationProvenanceRecords>
>>             <its:translationProvenanceRecord xml:id="tp1"
>>                 transToolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
>>                 transOrg="acme-CAT-v2.3"/>
>>             <its:translationProvenanceRecord xml:id="tp2"
>>                 transPerson="John Doe"
>>                 transOrgRef="http://www.legaltrans-ex.com/"/>
>>             <its:translationProvenanceRecord xml:id="tp3"
>>                 transPerson="Carl Meyer"
>>                 transOrgRef="http://www.mytranslations.example.com/"/>
>>             <its:translationProvenanceRecord xml:id="tp4"
>>                 provRef="http://www.examplemtservice.com/prov/e76547"/>
>>         </its:translationProvenanceRecords>
>>     </head>
>>     <body>
>>         <par its:transProvRef="#tp1">This paragraph was translated from
>>             the machine.</par>
>>         <legalnotice postediting-by="http://www.vistatec.com/"
>>             its:transProvRef="#tp2 #tp3 #tp4">This text was
>>             translated directly by a person.</legalnotice>
>>     </body>
>> </text>
>>
>> ]
>>
>> The interaction between "its:translationProvenanceRecords" and the local
>> "its:transProvRef" attribute is identical to that between
>> "its:locQualityIssues" and the "its:locQualityIssuesRef" attribute.
>>
>> In its:translationProvenanceRecords you have a list of
>> "its:translationProvenanceRecord" elements. Each element has an "xml:id"
>> attribute. We could say that the order of the
>> "its:translationProvenanceRecord" elements specifies whether a record is
>> translation agent provenance or revision agent provenance information. Or
>> we could say that this is specified by the order of the values in
>> "its:transProvRef". "Your" standoff data category could be accommodated by
>> <its:translationProvenanceRecord xml:id="tp4"
>> provRef="http://www.examplemtservice.com/prov/e76547"/>.
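As a rough illustration of how a consumer might follow its:transProvRef to the standoff records, here is a small Python sketch using the standard library's ElementTree. The function name `resolve_prov_refs` is hypothetical, the element and attribute names are those of the draft example in this thread (they may still change), and the sample document is a shortened copy of that example.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Shortened copy of the draft example from this thread.
DOC = """<text xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <head>
    <its:translationProvenanceRecords>
      <its:translationProvenanceRecord xml:id="tp2" transPerson="John Doe"
          transOrgRef="http://www.legaltrans-ex.com/"/>
      <its:translationProvenanceRecord xml:id="tp3" transPerson="Carl Meyer"
          transOrgRef="http://www.mytranslations.example.com/"/>
      <its:translationProvenanceRecord xml:id="tp4"
          provRef="http://www.examplemtservice.com/prov/e76547"/>
    </its:translationProvenanceRecords>
  </head>
  <body>
    <legalnotice its:transProvRef="#tp2 #tp3 #tp4">This text was
      translated directly by a person.</legalnotice>
  </body>
</text>"""


def resolve_prov_refs(root, element):
    """Return the attributes of the records referenced by element.

    The result keeps the order of the values in its:transProvRef.
    A reference to a missing record raises KeyError.
    """
    records = {
        rec.get(f"{{{XML_NS}}}id"): dict(rec.attrib)
        for rec in root.iter(f"{{{ITS_NS}}}translationProvenanceRecord")
    }
    refs = element.get(f"{{{ITS_NS}}}transProvRef", "").split()
    return [records[ref.lstrip("#")] for ref in refs]


root = ET.fromstring(DOC)
notice = root.find("./body/legalnotice")
resolved = resolve_prov_refs(root, notice)
```

If the order of the values in its:transProvRef were chosen as the distinguishing rule, `resolved[0]` would be the translation agent record and the later entries the revision and standoff records. That reading is only one of the two options discussed above.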
>>
>> You seem to have the use case of attaching several pieces of provenance
>> information to the same node. With the ITS overriding that is not possible.
>> But with the above approach tools can still do that, locally:
>>
>> - first tool creates
>>
>> <legalnotice postediting-by="http://www.vistatec.com/"
>>     its:transProvRef="#tp2">This text was
>>     translated directly by a person.</legalnotice>
>>
>> - second tool creates
>>
>> <legalnotice postediting-by="http://www.vistatec.com/"
>>     its:transProvRef="#tp2 #tp3">This text was
>>     translated directly by a person.</legalnotice>
>>
>> - third tool creates
>>
>> <legalnotice postediting-by="http://www.vistatec.com/"
>>     its:transProvRef="#tp2 #tp3 #tp4">This text was
>>     translated directly by a person.</legalnotice>
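The three steps above could be sketched in Python as follows. The helper `append_prov_ref` is a hypothetical name, and the sketch assumes each tool has already added its own record (tp2, tp3, tp4) to the standoff list; it only shows that every tool extends the existing its:transProvRef value instead of replacing it.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
PROV_REF = f"{{{ITS_NS}}}transProvRef"


def append_prov_ref(element, record_id):
    """Append '#record_id' to its:transProvRef, keeping existing entries."""
    refs = element.get(PROV_REF, "").split()
    ref = "#" + record_id
    if ref not in refs:  # avoid duplicate references
        refs.append(ref)
    element.set(PROV_REF, " ".join(refs))


notice = ET.fromstring(
    '<legalnotice xmlns:its="http://www.w3.org/2005/11/its">'
    "This text was translated directly by a person.</legalnotice>"
)

# One call per tool in the chain, in processing order.
for record_id in ("tp2", "tp3", "tp4"):
    append_prov_ref(notice, record_id)
```

After the loop the attribute value is "#tp2 #tp3 #tp4", matching the markup the third tool creates.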
>>
>> This all works without global "adding" rules (but keeping the pointer
>> attributes in global rules). We just need guidance for tool developers on
>> how to attach such complex pieces of information.
>>
>> Also, for the simple local case we could still have
>>
>> <legalnotice postediting-by="http://www.vistatec.com/"
>>     its:transPerson="John Doe"
>>     its:transOrgRef="http://www.legaltrans-ex.com/"
>>     its:provRef="http://www.examplemtservice.com/prov/e76547">This text
>>     was translated directly by a person.</legalnotice>
>>
>> But I would say that you have either local markup or the external record,
>> not both.
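A checker for the "either local markup or the external record, not both" constraint might look like the Python sketch below. The function name and the list of local attribute names are assumptions: the names are just the ones that appear in this thread's draft examples, not a normative list.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Local attribute names as they appear in this thread's draft examples
# (an assumption; the final set of names may differ).
LOCAL_ATTRS = ("transToolRef", "transOrg", "transPerson", "transOrgRef", "provRef")


def mixes_local_and_standoff(element):
    """Return True if element has both its:transProvRef and local provenance attributes."""
    has_standoff = f"{{{ITS_NS}}}transProvRef" in element.attrib
    has_local = any(f"{{{ITS_NS}}}{name}" in element.attrib for name in LOCAL_ATTRS)
    return has_standoff and has_local


local_only = ET.fromstring(
    '<legalnotice xmlns:its="http://www.w3.org/2005/11/its" '
    'its:transPerson="John Doe" '
    'its:transOrgRef="http://www.legaltrans-ex.com/"/>'
)
mixed = ET.fromstring(
    '<legalnotice xmlns:its="http://www.w3.org/2005/11/its" '
    'its:transPerson="John Doe" its:transProvRef="#tp2"/>'
)
```

With these inputs, `mixes_local_and_standoff(local_only)` is false and `mixes_local_and_standoff(mixed)` is true, so only the second element would be flagged.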
>>
>> So in summary, the above proposal would mean:
>>
>> - have only one provenance data category
>>
>> - meet the need to specify initial translation provenance, revision
>> and standoff provenance at the same time by having standoff elements like
>> those of localization quality issue
>>
>> - meet the need to provide several pieces of information via several
>> references to provenance records, e.g. its:transProvRef="#tp2 #tp3"
>>
>> - have global rules only for pointing, see the other thread.
>>
>> Best,
>>
>> Felix
>>
>> 2012/10/12 Dave Lewis <dave.lewis@cs.tcd.ie>
>>
>> Hi All,
>> Please find attached updates to the provenance related data categories
>> ready to be included in the draft. Many thanks to Phil for reviewing these
>> in detail.
>>
>> There are three separate data categories:
>> - Translation Agent Provenance: which records machines, people and
>> organisations responsible for translating the selected text
>>
>> - Translation Revision Agent Provenance: which records machines, people and
>> organisations responsible for revising the translation of the selected text
>> (e.g. from post-editing or linguistic review)
>>
>> - Standoff Provenance: which provides a link to a standoff provenance
>> record using the W3C PROV standard.
>>
>> Comments welcome.
>>
>> Regards,
>> Dave
>>
>> --
>> Felix Sasaki
>> DFKI / W3C Fellow
>>
>
>
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Tuesday, 23 October 2012 17:27:51 UTC
