RE: [ISSUE-22] Provenance and Agents

Hello Felix.

 

I now use a single rule for both the translator and the reviser:

<its:transProvRule selector="//item" transOrg="Linguaserve"
transPerson="11236" transRevOrg="Linguaserve" transRevPerson="11239"/>

This way I point to the translated and revised nodes with only one rule.

The candidate translated nodes would be all the <item> nodes in the XML
document.
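
A minimal sketch of how a processor might apply such a global rule, assuming
Python's standard xml.etree.ElementTree. The sample document, the its:rules
wrapper around the rule, and the helper name apply_trans_prov_rules are
illustrative assumptions, not taken from the draft. ElementTree supports only
a subset of XPath, so "//item" is rewritten to the relative form ".//item".

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical sample document; only the transProvRule comes from the message.
DOC = """<list xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules version="2.0">
    <its:transProvRule selector="//item" transOrg="Linguaserve"
        transPerson="11236" transRevOrg="Linguaserve" transRevPerson="11239"/>
  </its:rules>
  <item>First string</item>
  <group><item>Second string</item></group>
  <note>Not selected</note>
</list>"""


def apply_trans_prov_rules(root):
    """Map each selected element to the provenance attributes of its rule."""
    result = {}
    for rule in root.iter(f"{{{ITS_NS}}}transProvRule"):
        attrs = dict(rule.attrib)
        selector = attrs.pop("selector")
        # ElementTree cannot evaluate an absolute "//item" on an element,
        # so use the equivalent relative path ".//item".
        path = "." + selector if selector.startswith("//") else selector
        for node in root.findall(path):
            # A later rule replaces an earlier one entirely for the same node.
            result[node] = attrs
    return result


root = ET.fromstring(DOC)
provenance = apply_trans_prov_rules(root)
for node, info in provenance.items():
    print(node.text, info["transPerson"], info["transRevPerson"])
```

Both <item> elements receive the same translator and reviser information from
the one rule; the <note> element is left untouched.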

 

As for the global rules issue, I think this is a neater, more elegant way to
express it, where possible, than populating the document with local rules.
But I'm not against that approach either.

 

Cheers.

__________________________________

Mauricio del Olmo Martínez

Dpto. Técnico/I+D+i

Linguaserve Internacionalización de Servicios, S.A.

Tel.: +34 91 761 64 60 ext. 0421
Fax: +34 91 542 89 28 

E-mail:  <mailto:tecnico@linguaserve.com> tecnico@linguaserve.com

www.linguaserve.com <http://www.linguaserve.com/> 

 


__________________________________

 

From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Tuesday, October 23, 2012 19:27
To: public-multilingualweb-lt@w3.org
Subject: Re: [ISSUE-22] Provenance and Agents

 

Hi all,

 

This may have been lost amid conferences, travel, etc. Any thoughts on
this? Also for the implementors: is everybody fine with implementing this
single "translation provenance" data category?


Thanks,

 

Felix

2012/10/18 Felix Sasaki <fsasaki@w3.org>

Hi Dave, Yves, all,

 

Dave, Yves and I had a discussion at the FEISGILLT event about provenance,
and I updated the section at

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#translation-agent-provenance

with the idea that this data category should cover all three types of
provenance: translation, revision, RDF-based standoff. The mechanism is
copied from the quality issue data category.

 

Comments welcome,

 

Felix

 

2012/10/15 Yves Savourel <ysavourel@enlaso.com>

Hi Felix, Dave, all,

 

Felix: I think there is a difference in the way you use transProvRef and the
way locQualityIssuesRef is currently defined. You use a list of URIs for
transProvRef while locQualityIssuesRef defines a single URI that points to a
set of issues.

 

To make both data categories similar, transProvRef would have to point to a
translationProvenanceRecords element containing one or more records. So in
your example, there would be two translationProvenanceRecords elements (one
for each transProvRef).

 

But I agree that a similar stand-off structure could be used for both.

 

Cheers,

-yves

 

 

From: Felix Sasaki [mailto:fsasaki@w3.org] 
Sent: Sunday, October 14, 2012 11:22 AM
To: Dave Lewis
Cc: public-multilingualweb-lt@w3.org
Subject: Re: [ISSUE-22] Provenance and Agents

 

Hi Dave, all,

 

I added the translation provenance agent to

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#translation-agent-provenance

with a big warning that this is in an early stage. I changed a few things
from your draft:

 

- XPath expressions in pointer attributes in the example: these were quite
general; e.g. //dc:creator selects all "dc:creator" elements in the
document. Especially given the discussion we are currently having at

http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0179.html

this seems to be too general.

 

- XPath expression in the selector, e.g. "selector="/html/body/legalnotice""
> "selector="/text/body/legalnotice""

I changed "/html/body/par" to "/text/body/par[1]", so that here only the
first "par" element is selected. I realized here again that we haven't
resolved the "too many global rules" issue. Dave, can you take up this
thread?

http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0093.html

Depending on the outcome, both provenance and many other data categories
might change a lot.

 

- I removed local XPath expressions, e.g. transToolPointer or
transToolRefPointer attributes. We don't have local XPath - that has been
discussed several times. If needed I can dig up the threads again, but it
would save a lot of time if we could just agree on this. 

 

- I changed the local example. What you tried in the local example was a
combination of global and local provenance information. But that doesn't
work: we have said several times now that overriding is always complete. So
"through a local selection overriding part of the global rule" is not
possible: you will override the complete rule. It doesn't matter whether the
local attributes are in HTML5 or in XML; that doesn't change overriding.
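
To make the "overriding is always complete" point concrete, here is a minimal
sketch, assuming provenance information is modelled as a plain dictionary.
The function name effective_provenance and the sample values are hypothetical;
the only behaviour taken from the discussion is that local markup replaces the
global rule's information as a whole instead of being merged with it.

```python
def effective_provenance(global_info, local_info):
    """Return the provenance that applies to a node.

    Local markup, when present, replaces the information from the global
    rule as a whole; the two are never merged attribute by attribute.
    """
    return dict(local_info) if local_info else dict(global_info)


# Information a global rule would assign to the node (illustrative values).
global_info = {"transOrg": "Linguaserve", "transPerson": "11236"}
# Local markup on the same node that names only a person.
local_info = {"transPerson": "John Doe"}

overridden = effective_provenance(global_info, local_info)
inherited = effective_provenance(global_info, {})
print(overridden)
print(inherited)
```

With local markup present, the organization from the global rule is not
carried over: the node ends up with the person only.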

 

In general I'm quite frustrated about the data category. The issue is not
the pieces of information themselves; what you specify (person,
organization, tools) makes a lot of sense. The issue is that the
specification is obviously not implementation driven, as can be seen from
the untested XPath expressions and the overriding that wouldn't work, even
with a conformance-only processor.

 

The other frustration comes from the speed and continuity of progress: to
wrap this up we need a continuous discussion. So my main question is: will
you and Phil have time to engage in this by the end of November, that is
within the last call period? Or: can we engage somebody else interested in
implementing this?

 

Now, about the data category in general ...

 

I think what you are trying to achieve is:

conveying several pieces of provenance information for agents:

initial revision = translation agent provenance;

subsequent revision = translation revision agent provenance;

complex revision information: standoff provenance.

 

We may have a picture similar to the one with quality issue: the complexity
of this information might be better dealt with by a standoff approach. I am
not talking about the standoff approach in your example, Dave, but something
like this:

 

[

<text xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:its="http://www.w3.org/2005/11/its"
    its:version="2.0">
    <head>
        <dc:creator>John Doe</dc:creator>
        <title>Translation Revision Provenance Agent: Global Test in XML</title>
        <its:translationProvenanceRecords>
            <its:translationProvenanceRecord xml:id="tp1"
                transToolRef="http://www.onlinemtex.com/2012/7/25/wsdl/"
                transOrg="acme-CAT-v2.3"/>
            <its:translationProvenanceRecord xml:id="tp2"
                transPerson="John Doe"
                transOrgRef="http://www.legaltrans-ex.com/"/>
            <its:translationProvenanceRecord xml:id="tp3"
                transPerson="Carl Meyer"
                transOrgRef="http://www.mytranslations.example.com/"/>
            <its:translationProvenanceRecord xml:id="tp4"
                provRef="http://www.examplemtservice.com/prov/e76547"/>
        </its:translationProvenanceRecords>
    </head>
    <body>
        <par its:transProvRef="#tp1"> This paragraph was translated from the machine.</par>
        <legalnotice postediting-by="http://www.vistatec.com/"
            its:transProvRef="#tp2 #tp3 #tp4">This text was translated directly by a person.</legalnotice>
    </body>
</text>

]

 

The interaction between "its:translationProvenanceRecords" and the local
its:transProvRef attribute is identical to "its:locQualityIssues" and
"its:locQualityIssuesRef" attribute.

 

In its:translationProvenanceRecords you have a list of
"its:translationProvenanceRecord" elements. Each element has an "xml:id"
attribute. We could say that the order of the
"its:translationProvenanceRecord" elements specifies whether this is
translation agent provenance or revision agent provenance information. Or we
could say that this is specified by the order of the values in
"its:transProvRef". "Your" standoff data category could be accommodated by
<its:translationProvenanceRecord xml:id="tp4"
provRef="http://www.examplemtservice.com/prov/e76547"/>.
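
A minimal sketch of how a consumer might resolve the local its:transProvRef
values to their standoff records, assuming Python's standard
xml.etree.ElementTree and the second option above, where the order of the
values in its:transProvRef is what carries meaning. The helper name
resolve_trans_prov is hypothetical, and the document is a trimmed version of
the example.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

DOC = """<text xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <head>
    <its:translationProvenanceRecords>
      <its:translationProvenanceRecord xml:id="tp2" transPerson="John Doe"
          transOrgRef="http://www.legaltrans-ex.com/"/>
      <its:translationProvenanceRecord xml:id="tp3" transPerson="Carl Meyer"
          transOrgRef="http://www.mytranslations.example.com/"/>
      <its:translationProvenanceRecord xml:id="tp4"
          provRef="http://www.examplemtservice.com/prov/e76547"/>
    </its:translationProvenanceRecords>
  </head>
  <body>
    <legalnotice its:transProvRef="#tp2 #tp3 #tp4">This text was translated directly by a person.</legalnotice>
  </body>
</text>"""


def resolve_trans_prov(root, node):
    """Return the records referenced by node, in the order they are listed."""
    records = {
        rec.get(XML_ID): rec
        for rec in root.iter(f"{{{ITS_NS}}}translationProvenanceRecord")
    }
    refs = node.get(f"{{{ITS_NS}}}transProvRef", "").split()
    resolved = []
    for ref in refs:
        attrs = dict(records[ref.lstrip("#")].attrib)
        attrs.pop(XML_ID, None)  # the id is bookkeeping, not provenance
        resolved.append(attrs)
    return resolved


root = ET.fromstring(DOC)
legalnotice = root.find("./body/legalnotice")
resolved = resolve_trans_prov(root, legalnotice)
for position, record in enumerate(resolved, start=1):
    print(position, record)
```

Under the ordering assumption, the first resolved record would be read as the
initial translation and the following ones as later revisions or standoff
provenance.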

 

You seem to have the use case of attaching several pieces of provenance
information to the same node. With the ITS overriding that is not possible.
But with the above approach tools can still do that, locally:

- first tool creates

<legalnotice postediting-by="http://www.vistatec.com/"
    its:transProvRef="#tp2">This text was translated directly by a person.</legalnotice>

- second tool creates

<legalnotice postediting-by="http://www.vistatec.com/"
    its:transProvRef="#tp2 #tp3">This text was translated directly by a person.</legalnotice>

- third tool creates

<legalnotice postediting-by="http://www.vistatec.com/"
    its:transProvRef="#tp2 #tp3 #tp4">This text was translated directly by a person.</legalnotice>
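
The three steps above could be sketched as follows, assuming each tool appends
its own record reference to the existing its:transProvRef value and leaves the
earlier references in place. The helper name add_prov_ref is hypothetical, and
the sketch uses Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

TRANS_PROV_REF = "{http://www.w3.org/2005/11/its}transProvRef"


def add_prov_ref(node, record_id):
    """Append a reference to a provenance record, keeping earlier ones."""
    refs = node.get(TRANS_PROV_REF, "").split()
    ref = "#" + record_id
    if ref not in refs:  # do not list the same record twice
        refs.append(ref)
    node.set(TRANS_PROV_REF, " ".join(refs))


node = ET.Element("legalnotice")
node.text = "This text was translated directly by a person."

# Each call stands for one tool in the chain adding its own record.
for record_id in ("tp2", "tp3", "tp4"):
    add_prov_ref(node, record_id)

print(node.get(TRANS_PROV_REF))  # prints "#tp2 #tp3 #tp4"
```

Each tool only has to read and extend one attribute on the node; no global
rule is involved.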

 

This all works without global "adding" rules (but keeping the pointer
attributes in global rules). We just need guidance for tool developers on
how to attach such complex pieces of information.

 

Also, for the simple local case we could still have 

<legalnotice postediting-by="http://www.vistatec.com/"
    its:transPerson="John Doe"
    its:transOrgRef="http://www.legaltrans-ex.com/"
    its:provRef="http://www.examplemtservice.com/prov/e76547">This text was translated directly by a person.</legalnotice>

 

But I would say that you have either local markup or the external record,
not both.

 

So in summary, the above proposal would mean

- have only one provenance data category

- meet the need to specify initial translation provenance, revision
provenance and standoff provenance at the same time by having standoff
elements like those for lq issue

- meet the need to provide several pieces of information via several
references to provenance records, e.g. its:transProvRef="#tp2 #tp3"

- have global rules only for pointing, see the other thread.

 

Best,

 

Felix

 

2012/10/12 Dave Lewis <dave.lewis@cs.tcd.ie>

Hi All,
Please find attached updates to the provenance-related data categories,
ready to be included in the draft. Many thanks to Phil for reviewing these in
detail.

There are three separate data categories:
- Translation Agent Provenance: which records machines, people and
organisations responsible for translating the selected text

- Translation Revision Agent Provenance: which records machines, people and
organisations responsible for revising the translation of the selected text
(e.g. from post-editing or linguistic review)

- Standoff Provenance: which provides a link to a standoff provenance record
using the W3C PROV standard.

Comments welcome.

Regards,
Dave
 


- 





 

-- 
Felix Sasaki

DFKI / W3C Fellow

 





 


 

Received on Tuesday, 23 October 2012 18:08:26 UTC