Re: [all] call for concensus on Translation Provenance Agent (related to ISSUE-22) from Dave Lewis on 2012-07-26 (public-multilingualweb-lt@w3.org from July 2012)

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Thu, 26 Jul 2012 13:31:56 +0100
To: public-multilingualweb-lt@w3.org
Message-ID: <501138BC.702@cs.tcd.ie>

Felix,

Thanks for the comment, explanations inline:

On 26/07/2012 11:33, Felix Sasaki wrote:
> Hi Dave,
>
> you are right about the rule precedence, good point. A question about 
> the separation "transAgent" vs. "revisionAgent" in general: is it 
> important to specify the order, e.g. who did the first revision, the 
> second one etc?
>

I had deliberately restricted the semantics to just the identification 
of agents, with no temporal information. The agents referred to can be 
of different types, e.g. human, software or organization, and they may 
have operated in parallel, so no unambiguous significance can be 
attached to the order of multi-item attribute values.
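
To make the "no ordering" point concrete, here is a minimal Python 
sketch. The function names are mine and purely illustrative; nothing 
here is from the draft. It assumes the separators from my earlier 
formulation (commas for trans(Revision)Agent, whitespace for 
trans(Revision)AgentRef) and reads each value as an unordered set:

```python
from typing import FrozenSet


def parse_agents(value: str) -> FrozenSet[str]:
    """Split a comma-separated transAgent / transRevisionAgent value.

    Hypothetical helper: returns a set because the order of the
    items carries no meaning.
    """
    return frozenset(item.strip() for item in value.split(",") if item.strip())


def parse_agent_refs(value: str) -> FrozenSet[str]:
    """Split a whitespace-separated transAgentRef / transRevisionAgentRef value."""
    return frozenset(value.split())


# Two orderings of the same agents are treated as the same value.
same = parse_agents("John Doe, acme-CAT-v2.3") == parse_agents("acme-CAT-v2.3, John Doe")
```

A consumer that needs to know who acted first would have to get that 
from somewhere else, since a set keeps no positions.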

Ordering is, however, handled comprehensively in the W3C PROV model, 
which we point to from the standoff provenance data category. I propose 
to add a note to this data category saying that if the timing of agent 
activity is a factor, the standoff provenance data category should be 
used instead of this one.

> A few more questions about the URIs in the "transAgentRef" and 
> "transRevisionAgentRef" attributes:
>
> 1) Do you say anything about the type of information to be expected, 
> e.g. machine readable or human readable information? E.g. for 
> "locnote" we focus on examples with human readable information, also 
> in the "ref" attributes; but in your examples you have the "mailto" 
> scheme. How can an application know what is expected here, or do you 
> have "best practices" what kind of machine readable information should 
> be provided?
>

Yes, we can suggest best practice on this. Briefly, it depends on the 
context. Most often, if the data category is used in relation to a 
commercial localization contract, the mode of agent identification 
should be defined there. It could be just a string from a value set 
specific to the contract, it could be a mailto IRI if the priority is to 
improve localization team communication, or it could be a link to 
something like a vCard if a range of communication channels is needed.

If it is used in a public setting, e.g. alongside a published 
translation or in a crowdsourced translation use case, then the Ref 
option, with a public, dereferenceable IRI, should be used. This works 
well for people and organizations. For software, specifically MT 
engines, an IRI that employs the BCP47 machine translation encoding 
might make sense, but I've not had time to look at that in detail.
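
As a rough illustration of how an application might tell these cases 
apart, here is a small Python sketch that sorts a Ref value by its IRI 
scheme. The category names ("contact", "dereferenceable", "other") and 
the function name are my own invention for this example, not anything 
proposed for the specification:

```python
from urllib.parse import urlsplit


def classify_agent_ref(iri: str) -> str:
    """Return a coarse, hypothetical category for an agent reference."""
    scheme = urlsplit(iri).scheme.lower()
    if scheme == "mailto":
        # Useful mainly for team communication, e.g. mailto:locutus@b.org
        return "contact"
    if scheme in ("http", "https"):
        # A public IRI that a consumer could try to dereference.
        return "dereferenceable"
    # Anything else, including contract-specific codes with no scheme.
    return "other"


kinds = [classify_agent_ref(ref)
         for ref in "mailto:locutus@b.org http://www.thecollective.org".split()]
```

This only looks at the scheme; it does not fetch anything or say what 
format the referenced resource should have.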

Shall I include such a best practice summary in the notes for the data 
category or in an annex?

> 2) In the transAgent / transAgentRef attributes, several values are 
> possible. But does it really make sense to have a transAgentRef 
> without a transAgent? Same for the revision agent. So instead of "at 
> least one of the following", you could say: at least one of the following: 
> a transAgent attribute with an optional transAgentRef, or a 
> transRevisionAgent attribute with an optional transRevisionAgentRef 
> attribute.
>
My intention was that trans(Revision)Agent / trans(Revision)AgentRef are 
_alternative_ ways of allowing agents to be identified.

So it is definitely _not_ the intention that, for example, 
transAgentRef provides additional information about an agent already 
identified for the same node using transAgent. The reason for having 
both is that while an IRI ref makes sense in many cases, there are 
others where the number of agents is small (the modal number of 
employees in LSPs is between 3 and 5) or where an anonymising code is 
used (LSPs are often nervous of letting clients know who their 
translators are).

Does that address your query? If so I will clarify this in the 
definition with a statement that any node should be associated with no 
more than one means of identifying the same translation agent or 
translation revision agent.
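
For what it's worth, the structural part of the definition (a required 
selector plus at least one of the four agent attributes, as in the 
formulation quoted further down) is easy to check mechanically. A 
minimal Python sketch follows; the function name is mine, and the 
element name its:transAgentRule is used only as a placeholder since the 
rule names are still under discussion:

```python
import xml.etree.ElementTree as ET
from typing import List

ITS_NS = "http://www.w3.org/2005/11/its"
AGENT_ATTRS = ("transAgent", "transAgentRef",
               "transRevisionAgent", "transRevisionAgentRef")


def rule_problems(rule: ET.Element) -> List[str]:
    """List structural problems of a (hypothetical) agent rule element."""
    problems = []
    if not rule.get("selector"):
        problems.append("missing required selector attribute")
    if not any(rule.get(name) for name in AGENT_ATTRS):
        problems.append("needs at least one of: " + ", ".join(AGENT_ATTRS))
    return problems


# A rule with only a selector identifies no agent, so it is flagged.
empty_rule = ET.fromstring(
    f'<its:transAgentRule xmlns:its="{ITS_NS}" selector="/html/body/par"/>')
good_rule = ET.fromstring(
    f'<its:transAgentRule xmlns:its="{ITS_NS}" selector="/html/body/par" '
    'transAgent="John Doe, acme-CAT-v2.3"/>')
```

Whether two values on the same node identify the same agent cannot be 
checked this way; that part has to stay a statement in the definition.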

> 3) I would also locally say that the agent and the "ref" attributes 
> MUST appear at the same node, and that the "agent" attribute is 
> mandatory. Otherwise, you run into trouble with complex inheritance 
> rules: what is overriding what?
>

I think this is addressed by my response to point 2.

> 3) Another option to make things clearer globally would be two rules 
> elements: one <its:transAgentRule>, one <its:revisionAgentRule>, again 
> with optional "ref" attributes. That would also more directly reflect 
> the local approach.
>
I'm leaning toward this anyway for editorial reasons related to fully 
supporting the Ref&Point pattern as Yves suggested. Would this be best 
as two rules under the same data category, or two separate data 
categories even?

> 4) Is the order of the comma separated values in the attributes 
> significant, and what happens if a value is missing? In the local 
> example you have C3PO as transAgent and these URIs as 
> transAgentRef: mailto:locutus@b.org 
> http://www.thecollective.org
> does this mean that both relate to C3PO, or for the 2nd URI, there is 
> just no transAgent given? Again, it sounds like making the agent 
> attribute mandatory and having the ref attributes optional would lower 
> the number of choices and increase interop.
>

See my previous post about the order not being significant. Hopefully my 
clarification for point (2) resolves the issue about the relationship 
between trans(Revision)Agent and trans(Revision)AgentRef.

Thanks for these searching comments; they are really helping to clarify 
my thinking and improve the specification.
Dave

> Felix
>
> 2012/7/26 Dave Lewis <dave.lewis@cs.tcd.ie>
>
>     On 26/07/2012 08:01, Felix Sasaki wrote:
>>     P.S.: having just "agent" has of course the drawback that you
>>     need more rule elements to express the same information. 
>>     However, it has the benefit that you can be more specific wrt
>>     optionality of attributes: currently, all "agent" related
>>     attributes are attributes, so this
>
>     You mean all 'attributes are optional' right? Yes, that's a good
>     point. I wasn't sure about the correct formulation for this and
>     just took the lead from the rubyRule where all the attributes are
>     also optional, but you are right this leaves the meaningless
>     option of having no attribute for agent (I'm not sure if the same
>     is a problem for ruby).
>
>     Would a better formulation be the following?
>
>       * A required *selector* attribute. It contains an XPath
>         expression which selects the nodes to which this rule applies.
>       * At least one of the following:
>           o A *transAgent* attribute that contains one or more
>             comma-separated strings, each one identifying a different
>             translation agent.
>           o A *transAgentRef* attribute that contains one or more
>             space-separated IRIs, each referring to a resource that
>             identifies a different translation agent.
>           o A *transRevisionAgent* attribute that contains one or more
>             comma-separated strings, each one identifying a different
>             translation revision agent.
>           o A *transRevisionAgentRef* attribute that contains one or
>             more space-separated IRIs, each referring to a resource
>             that identifies a different translation revision agent.
>
>
>>     <its:agentRule selector="/html/body/par"/>
>>     would be legal, but doesn't make sense. If you have just the
>>     "agent" attribute and "agentRef", you can say that both (or just
>>     the former?) are mandatory - also the "agentType" attribute.
>>
>>     Felix
>>
>
>     cheers,
>     Dave
>
>
>
>>     2012/7/26 Felix Sasaki <fsasaki@w3.org>
>>
>>         Hi Dave, all,
>>
>>         About
>>
>>         "Two types of Translation Provenance Agent data categories
>>         are needed to identify:"
>>
>>         and the data category in general: wouldn't it be possible to
>>         have just two attributes "agent" and "agentRef", and an
>>         additional one "type" with the values "transAgent" or
>>         "revisionAgent"? In that way there are fewer attributes and
>>         also fewer pointer attributes (see Yves' comment). It would
>>         look like this I think:
>>
>>         <its:agentRule selector="/html/body/par"
>>         agentRef="http://www.onlinemtexample.com/2012/7/25/legal-v1/wsdl/"
>>          type="transAgent" />
>>
>>
>>         <its:agentRule selector="/html/body/par" agent="John Doe,
>>         acme-CAT-v2.3" type="revisionAgent"/>
>>
>>
>>         Small editorial thing: your examples above said
>>         "its:domainRule", I changed that to "agentRule".
>>
>>
>>         Another note: in ITS global rules, we always used attributes
>>         without a namespace, e.g. "agents" instead of "its:agents".
>>
>>
>>
>>         Felix
>>
>>
>>
>>         2012/7/25 Dave Lewis <dave.lewis@cs.tcd.ie>
>>
>>             Hi all,
>>             Given the implementation commitment to provenance and the
>>             previous posting on this subject,
>>             http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0161.html
>>             please find attached the proposed specification for the
>>             Translation Provenance Agent plus the example files.
>>
>>             As a reminder, and as discussed in the original post and
>>             mentioned at the last WG call, provenance covers two
>>             essentially independent approaches: agent provenance,
>>             (which is this one), and standoff provenance, which we
>>             are treating as two individual data categories. I will
>>             send on the standoff provenance call for consensus shortly.
>>
>>             Regards,
>>             Dave
>>
>>
>>
>>
>>
>>         -- 
>>         Felix Sasaki
>>         DFKI / W3C Fellow
>>
>>
>>
>>
>>     -- 
>>     Felix Sasaki
>>     DFKI / W3C Fellow
>>
>
>
>
>
> -- 
> Felix Sasaki
> DFKI / W3C Fellow
>
Received on Thursday, 26 July 2012 12:32:07 UTC