Re: [all] call for concensus on Translation Provenance Agent (related to ISSUE-22) from Felix Sasaki on 2012-07-26 (public-multilingualweb-lt@w3.org from July 2012)

From: Felix Sasaki <fsasaki@w3.org>
Date: Thu, 26 Jul 2012 14:47:24 +0200
To: Dave Lewis <dave.lewis@cs.tcd.ie>
Cc: public-multilingualweb-lt@w3.org
Message-ID: <CAL58czogK9WUGiPk3wq-t6nZTb5ATf20vEQGBuE0+TdKxRSA-w@mail.gmail.com>
Hi Dave,

2012/7/26 Dave Lewis <dave.lewis@cs.tcd.ie>

>  Felix,
>
> Thanks for the comment, explanations inline:
>
>
> On 26/07/2012 11:33, Felix Sasaki wrote:
>
> Hi Dave,
>
>  you are right about the rule precedence, good point. A question about
> the separation "transAgent" vs. "revisionAgent" in general: is it important
> to specify the order, e.g. who did the first revision, the second one etc.?
>
>
> I had deliberately restricted the semantics to just the identification of
> agents, with no temporal information. The agents referred to can be
> different types, e.g. human, software or organization, so they could also
> have operated in parallel, so attaching significance to the order of
> multi-item attribute values cannot be done unambiguously.
>
> Ordering is however handled comprehensively in the W3C PROV model which we
> point from the standoff provenance data category. I propose to add a note
> to this data category that if timing of agent activity is a factor the
> standoff provenance should be used instead of this one.
>

If this is relevant for processing the non-standoff version of "agent", we
should rather spell this out here too, as normative text. Also, since the W3C
provenance work is still not finished, we need pointers to the prov spec in
the version you want to make use of. Think about implementors who read the
agent section in ITS 2.0: they need these pointers to make use of that.


>
>
>  A few more questions about the URIs in the "transAgentRef" and
> "transRevisionAgentRef" attributes:
>
>  1) Do you say anything about the type of information to be expected,
> e.g. machine readable or human readable information? E.g. for "locnote" we
> focus on examples with human readable information, also in the "ref"
> attributes; but in your examples you have the "mailto" scheme. How can an
> application know what is expected here, or do you have "best practices"
> what kind of machine readable information should be provided?
>
>
> Yes, there we can suggest best practice on this. Briefly it depends on the
> context. Most often if the data category is used in relation to a
> commercial localization contract, the mode of agent identification should
> be defined there. It could be just a string from a value set specific to
> the contract, it could be mailto if the priority is to improve localization
> team communication, it could be a link to something like a vcard if a
> range of communication channels is needed.
>
> If it is used in a public setting, e.g. alongside a published translation
> or in a crowdsourced translation use case, then the Ref option, with a
> public, dereferenceable IRI should be used. This works well for people and
> organizations. For software, specifically MT engines, an IRI that employed
> the BCP47 machine translation encoding might make sense - but I've not had
> time to look at that in detail.
>
> Shall I include such a best practice summary in the notes for the data
> category or in an annex?
>

The above is very helpful and I would put that as notes in the main text, like
we do with other data categories (see your text in the domain section).


>
>  2) In the transAgent / transAgentRef attributes, several values are
> possible. But does it really make sense to have a transAgentRef without a
> transAgent? Same for the revision agent. So instead of "at least one of the
> following", you could say: at least one of the following: a transAgent
> attribute with an optional transAgentRef, or a transRevisionAgent attribute
> with an optional transRevisionAgentRef attribute.
>
>   My intention was that trans(Revision)Agent / trans(Revision)AgentRef
> are _alternative_ ways of allowing agents to be identified.
>
> So it is definitely _not_ the intention that for example
> transAgentRef provides additional information about an agent already
> identified for the same node using transAgent. The reason for both is that
> while an IRI ref makes sense in many cases, there are others where the number
> of agents is small (the mode number of employees in LSPs is between 3 and
> 5) or where an anonymising code is used (LSPs are often nervous of letting
> clients know who their translators are).
>
> Does that address your query? If so I will clarify this in the definition
> with a statement that any node should be associated with no more than one
> means of identifying the same translation agent or translation revision
> agent.
>
>  3) I would also locally say that the agent and the "ref" attributes MUST
> appear at the same node, and that the "agent" attribute is mandatory.
> Otherwise, you run into trouble with complex inheritance rules: what is
> overriding what?
>
>
> I think this is addressed by my response to point 1
>


I think you mean point 2), right? That's addressed, yes.


>
>  3) Another option to make things clearer globally would be two rules
> elements: one <its:transAgentRule>, one <its:revisionAgentRule>, again with
> optional "ref" attributes. That would also more directly reflect the local
> approach.
>
>   I'm leaning toward this anyway for editorial reasons related to fully
> supporting the Ref&Point pattern as Yves suggested. Would this be best as
> two rules under the same data category, or two separate data categories
> even?
>

I think having two rules under one data category is a better approach.
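
A rough sketch of that option, assuming the element names from point 3) above and the attribute names from the draft definition quoted further down. None of this is agreed syntax; the selector and agent values are simply reused from examples elsewhere in this thread:

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <!-- Hypothetical rule: agents that translated the selected nodes,
       given as comma separated strings -->
  <its:transAgentRule selector="/html/body/par"
      transAgent="John Doe, acme-CAT-v2.3"/>
  <!-- Hypothetical rule: agents that revised the selected nodes,
       given as space-separated IRIs -->
  <its:revisionAgentRule selector="/html/body/par"
      transRevisionAgentRef="mailto:locutus@b.org http://www.thecollective.org"/>
</its:rules>
```

With one rule element per agent kind, each rule can require exactly one means of identification (the string form or the Ref form), which rules out the empty-rule case.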


>
>
>  4) Is the order of the comma separated values in the attributes
> significant, and what happens if a value is missing? In the local example
> you have C3PO as transAgent and these URIs as transAgentRef:
> mailto:locutus@b.org http://www.thecollective.org
> Does this mean that both relate to C3PO, or for the 2nd URI, there is just
> no transAgent given? Again, it sounds like making the agent attribute
> mandatory and having the ref attributes optional would lower the number of
> choices and increase interop.
>
>
> See my previous post about the order not being significant. Hopefully my
> clarification for point (1) resolves the issue about the relationship
> between trans(Revision)Agent and trans(Revision)AgentRef.
>

That's clearer, thanks a lot.
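
A sketch of local markup under that reading, assuming local attributes named its:transAgent and its:transAgentRef and a made-up document vocabulary (doc, par). Each node uses one means of identifying its agents, never both, and the order of values carries no meaning:

```xml
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <!-- Agent identified by a plain string, e.g. an anonymising code -->
  <par its:transAgent="C3PO">First translated paragraph.</par>
  <!-- Agents identified only by IRIs; no transAgent on the same node -->
  <par its:transAgentRef="mailto:locutus@b.org http://www.thecollective.org">Second translated paragraph.</par>
</doc>
```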

Felix


>
> Thanks for these searching comments, it's really helping to clarify my
> thinking and improve the specification.
> Dave
>
>
>  Felix
>
>  2012/7/26 Dave Lewis <dave.lewis@cs.tcd.ie>
>
>>  On 26/07/2012 08:01, Felix Sasaki wrote:
>>
>> P.S.: having just "agent" has of course the drawback that you need more
>> rule elements to express the same information.
>>
>> However, it has the benefit that you can be more specific wrt optionality
>> of attributes: currently, all "agent" related attributes are attributes, so
>> this
>>
>>
>>  You mean all 'attributes are optional' right? Yes, that's a good point.
>> I wasn't sure about the correct formulation for this and just took the lead
>> from the rubyRule where all the attributes are also optional, but you are
>> right this leaves the meaningless option of having no attribute for agent
>> (I'm not sure if the same is a problem for ruby).
>>
>> Would a better formulation be the following?
>>
>>    - A required *selector* attribute. It contains an XPath expression which
>>    selects the nodes to which this rule applies.
>>    - At least one of the following:
>>       - A *transAgent* attribute that contains one or more comma separated
>>       strings, each one identifying a different translation agent.
>>       - A *transAgentRef* attribute that contains one or more space-separated
>>       IRIs, each referring to a resource that identifies a different
>>       translation agent.
>>       - A *transRevisionAgent* attribute that contains one or more comma
>>       separated strings, each one identifying a different translation
>>       revision agent.
>>       - A *transRevisionAgentRef* attribute that contains one or more
>>       space-separated IRIs, each referring to a resource that identifies a
>>       different translation revision agent.
>>
>>
>>  <its:agentRule selector="/html/body/par"/>
>> would be legal, but doesn't make sense. If you have just the "agent"
>> attribute and "agentRef", you can say that both (or just the former?) are
>> mandatory - also the "agentType" attribute.
>>
>>  Felix
>>
>>
>>  cheers,
>> Dave
>>
>>
>>
>>  2012/7/26 Felix Sasaki <fsasaki@w3.org>
>>
>>> Hi Dave, all,
>>>
>>>  About
>>>
>>>  "Two types of Translation Provenance Agent data categories are needed
>>> to identify:"
>>>
>>>  and the data category in general: wouldn't it be possible to have just
>>> two attributes "agent" and "agentRef", and an additional one "type" with
>>> the values "transAgent" or "revisionAgent"? In that way there are fewer
>>> attributes and also fewer pointer attributes (see Yves' comment). It would
>>> look like this I think:
>>>
>>>  <its:agentRule selector="/html/body/par"
>>>  its:agentRef="http://www.onlinemtexample.com/2012/7/25/legal-v1/wsdl/"
>>>  type="transAgent" />
>>>
>>>
>>>   <its:agentRule selector="/html/body/par"
>>>   agent="John Doe, acme-CAT-v2.3" type="revisionAgent"/>
>>>
>>>
>>>
>>> Small editorial thing: your examples above said "its:domainRule", I
>>> changed that to "agentRule".
>>>
>>>
>>>  Another note: in ITS global rules, we always used attributes without a
>>> namespace, e.g. "agents" instead of "its:agents".
>>>
>>>
>>>
>>> Felix
>>>
>>>
>>>
>>> 2012/7/25 Dave Lewis <dave.lewis@cs.tcd.ie>
>>>
>>>> Hi all,
>>>> Given the implementation commitment to provenance and the previous
>>>> posting on this subject,
>>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0161.html
>>>> please find attached the proposed specification for the Translation
>>>> Provenance Agent plus the example files.
>>>>
>>>> As a reminder, and as discussed in the original post and mentioned at
>>>> the last WG call, provenance covers two essentially independent approaches:
>>>> agent provenance (which is this one) and standoff provenance, which we
>>>> are treating as two individual data categories. I will send on the standoff
>>>> provenance call for consensus shortly.
>>>>
>>>> Regards,
>>>> Dave
>>>>
>>>>
>>>>
>>>
>>>
>>>   --
>>> Felix Sasaki
>>> DFKI / W3C Fellow
>>>
>>>
>>
>>
>>  --
>> Felix Sasaki
>> DFKI / W3C Fellow
>>
>>
>>
>
>
>  --
> Felix Sasaki
> DFKI / W3C Fellow
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Thursday, 26 July 2012 12:47:58 UTC