Re: [Issue-41][Action-190] Draft a section about mtConfidence, based on the discussion from Dave Lewis on 2012-08-23 (public-multilingualweb-lt@w3.org from August 2012)

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Thu, 23 Aug 2012 02:54:50 +0100
To: public-multilingualweb-lt@w3.org
Message-ID: <50358D6A.5060708@cs.tcd.ie>

Felix,
I've only now got to your post on ISSUE-42:
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0149.html

I think the combination of an mtConfidence score and translationAgent 
that I suggest below amounts to pretty much the same thing, just arrived 
at via a different route. Was that the direction you were heading in?
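
To make the combination concrete, here is a minimal sketch in Python, 
not a proposal for syntax. The markup, the attribute names 
translationAgent and mtConfidence as plain XML attributes, the seg 
element, the example.org URIs and the resolve_segments helper are all 
assumptions made up for illustration. It only shows the idea that the 
agent is usually set once globally and inherited, while the score is 
set locally per segment.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative only,
# they are not taken from any agreed ITS syntax.
DOC = """
<doc translationAgent="http://example.org/mt/engine-a">
  <seg id="s1" mtConfidence="0.82">Bonjour</seg>
  <seg id="s2" mtConfidence="0.41">le monde</seg>
  <seg id="s3" translationAgent="http://example.org/mt/engine-b"
       mtConfidence="0.67">encore</seg>
  <seg id="s4">sans score</seg>
</doc>
"""


def resolve_segments(doc_xml):
    """Map each segment id to (agent, score).

    The agent is taken from the nearest enclosing element that declares
    one (simple inheritance); the score is read from the segment only.
    """
    root = ET.fromstring(doc_xml)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}
    results = {}
    for seg in root.iter("seg"):
        node, agent = seg, None
        while node is not None and agent is None:
            agent = node.get("translationAgent")
            node = parent.get(node)
        raw = seg.get("mtConfidence")
        score = float(raw) if raw is not None else None
        results[seg.get("id")] = (agent, score)
    return results


if __name__ == "__main__":
    for seg_id, (agent, score) in resolve_segments(DOC).items():
        print(seg_id, agent, score)
```

A consumer that only cares about the score can ignore the agent 
entirely, and one that needs both gets the agent without it being 
repeated on every segment.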

The translationReviewAgent could work similarly with quality, and we 
could add a sourceReviewAgent or terminologyReviewAgent, or generalise 
translationReviewAgent or qualityReviewAgent to address 
textAnalysisAnnotation.

One point here: since the different data categories are separately 
conformant, will specifications of how they are used in combination 
essentially have to be non-normative, or would we need a distinct 
normative "data categories in combination" section?

cheers,
Dave

On 23/08/2012 01:56, Dave Lewis wrote:
> Yves, David,
> Apologies for coming to this thread a bit late. You've already pointed 
> out that the score needs to be mostly local, i.e. per segment as passed 
> to an MT service, while the definition of the provider/engine would more 
> likely be global, i.e. the same engine would be used for most segments 
> in a document. We also have distinct use cases where only the score is 
> relevant or where both the score and the service are needed. So it 
> seems that two data categories would suit, one for the score and one 
> for identifying the engine.
>
> We do however already have a way of identifying an MT service that has 
> been used on a document or its segments, in the form of translationAgent 
> (see call for consensus 
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0256.html).
>
> I propose therefore that translationAgent, used in conjunction with an 
> mtConfidence score data category that has just one score attribute, 
> would cover the different use cases while also supporting the existing 
> use cases outlined for translationAgent.
>
> Note translationAgent allows multiple agents to be specified, but 
> doesn't concern itself with distinguishing the types of agent, e.g. 
> provider/organisation from software/engine, though both are possible. 
> The form of the ID or the result of dereferencing it is assumed to 
> address this, given the lack of common naming schemes for 
> organisations or engines.
>
> I'd be happy anyway to include the example IDs from the mtConfidence 
> engine attribute in translationAgent - these are sensible ideas, and 
> something we could address more comprehensively as best practice 
> next year.
>
> cheers,
> Dave
>
>
>
> On 09/08/2012 13:56, Yves Savourel wrote:
>>> The end user who does not understand this MUST NOT be exposed to values
>>> coming from mixed engines/producers.
>>> In other words it is OK to DISPLAY SCORE ONLY TO THE END USER
>>> if you have ensured upstream that they DO come from the same
>>> producer AND engine.
>>> Again not sure how to cut this with defaults, as the defaults would
>>> collapse filtering.
>> Again all this applies only when you have translations from different 
>> providers/engines for the same text. That is only one part of the 
>> scenarios.
>>
>> In any case, the bottom line is that making the presence of a local 
>> attribute required or not based on whether a global one is present is 
>> not easily implementable. It could be defined in a linked rules file, 
>> for example.
>>
>> What I think you are really trying to do is make sure a value is 
>> defined for mtProducer and mtEngine. I don't agree that one is always 
>> needed, but that is a different topic (as discussed above). But if we 
>> decide one is needed, we can just state that one must be defined. It 
>> doesn't make sense to me to try to define how or where it should be 
>> defined: the inheritance takes care of that.
>
>
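
The constraint quoted above, that an end user should only see bare 
scores once something upstream has checked that they all come from the 
same producer and engine, could be sketched roughly as follows. This is 
only an illustration under stated assumptions: the displayable_scores 
function is hypothetical, and each segment is assumed to be already 
resolved to an (agent, score) pair where the agent is an opaque 
identifier for the producer and engine.

```python
def displayable_scores(segments):
    """Return the scores if they can safely be shown without agent details.

    `segments` is an iterable of (agent, score) pairs; score may be None
    for segments that carry no confidence value. Hypothetical helper,
    not part of any specification.
    """
    scored = [(agent, score) for agent, score in segments if score is not None]
    agents = {agent for agent, _ in scored}
    # Refuse when there are no scores, when scores come from more than one
    # agent, or when a score has no identifiable agent at all.
    if len(agents) != 1 or None in agents:
        return None
    return [score for _, score in scored]


if __name__ == "__main__":
    same = [("engine-a", 0.8), ("engine-a", 0.4), ("engine-a", None)]
    mixed = [("engine-a", 0.8), ("engine-b", 0.4)]
    print(displayable_scores(same))
    print(displayable_scores(mixed))
```

Treating a score with no known agent as not displayable is a design 
choice of this sketch, not something the thread has agreed on.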
Received on Thursday, 23 August 2012 01:55:17 UTC