Re: [Issue-41][Action-190] Draft a section about mtConfidence, based on the discussion

Hi all,

Some comments inline (difficult to trace, I know, but hopefully they're
clear enough!)...


On 9 August 2012 13:56, Yves Savourel <ysavourel@enlaso.com> wrote:

> > I am not sure what the defaults should be in order to cut
> > this. If multiple mtProducers and/or mtEngines default to the
> > same values, this category collapses as the confidence scores
> > are NOT comparable among producers/engines..
>
> "If multiple..." then multiple mtProducers and mtEngines should define
> their values.
> But when there is a single set of translation there is not necessarily a
> need for more info than the score.
>
>
> > It [the score] is worth nothing if you cannot discern among producers
> > and engines.
>
> But it's worth something when you have only one producer or engine. And in
> that case knowing the producer and engine often doesn't matter.
>
>
If you only ever have one producer and one engine, then knowing this
information is not essential. However, there are many scenarios where it
is important and useful. For example, if you wish to translate the same
content using different producers and engines (e.g. for comparison
experiments or for some type of translation validation process), then this
information needs to be included.

The way the mtConfidenceScore is calculated is also dependent on the
producer & engine - different MT systems calculate these scores using
different methods.
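
To illustrate the comparison scenario, here is a minimal Python sketch that
only ever groups scores within a single producer/engine pair, so that values
from different systems are never placed side by side. The dictionary layout,
the sample producer and engine names, and the function name group_scores are
my own assumptions for illustration; only the three attribute names come
from this discussion.

```python
from collections import defaultdict


def group_scores(segments):
    """Group confidence scores by their (mtProducer, mtEngine) pair.

    Scores are only meaningful relative to other scores from the same
    producer and engine, so each pair gets its own list.
    """
    groups = defaultdict(list)
    for seg in segments:
        key = (seg["mtProducer"], seg["mtEngine"])
        groups[key].append(seg["mtConfidenceScore"])
    return dict(groups)


# Hypothetical sample data: the producer and engine names are made up.
segments = [
    {"mtProducer": "ProducerA", "mtEngine": "engine-1", "mtConfidenceScore": 0.82},
    {"mtProducer": "ProducerB", "mtEngine": "engine-x", "mtConfidenceScore": 0.91},
    {"mtProducer": "ProducerA", "mtEngine": "engine-1", "mtConfidenceScore": 0.40},
]

grouped = group_scores(segments)
```

Without the producer and engine values there would be no key to group on,
and the three scores would end up in one list that cannot be interpreted.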



> > The end user who does not understand this MUST NOT be exposed to values
> > coming from mixed engines/producers.
> > In other words it is OK to DISPLAY SCORE ONLY TO THE END USER
> > if you have ensured up the stream that they DO come from the same
> > producer AND engine.
> > Again not sure how to cut this with defaults, as the defaults would
> > collapse filtering.
>
> Again all this applies only when you have translations for different
> providers/engines for the same text. That only one part of the scenarios.
>
>
> In any case, the bottom line is that making a local attribute presence
> required or not based on whether a global one is present or not is not
> easily implementable. It could be defined in an linked rule file for
> example.
>
> What I think you really try to do is make sure a value is define for
> mtProducer and mtEngine. I don't agree that one is always need, but that is
> a different topic (as discussed above). But if we decide one is needed, we
> can just state that one must be define. It doesn't make sense to me to try
> to define how or where it should be defined: the inheritance takes care of
> that.
>

It should be a requirement that when mtConfidenceScore is given, a value
must also be given for mtProducer and mtEngine, as without them it is hard
to interpret what exactly the mtConfidenceScore represents. True, an end
user may not be interested in how the score was generated, but for people
involved in building, deploying and integrating MT systems, it is
important.
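
As a rough sketch of how such a requirement could be checked, assuming the
attributes have already been resolved for a given node (so it does not
matter whether a value was set locally or inherited), something like the
following Python would do. The function name missing_provenance and the
use of a plain dictionary are assumptions of mine, not anything defined in
the draft.

```python
def missing_provenance(attrs):
    """Return the provenance attributes that a score is missing.

    `attrs` is assumed to hold the already-resolved attribute values
    for one node. If no mtConfidenceScore is present, nothing is
    required and the result is an empty list.
    """
    if "mtConfidenceScore" not in attrs:
        return []
    required = ("mtProducer", "mtEngine")
    # An absent or empty value counts as missing.
    return [name for name in required if not attrs.get(name)]


# Hypothetical inputs; the producer and engine names are made up.
score_only = missing_provenance({"mtConfidenceScore": 0.7})
complete = missing_provenance(
    {"mtConfidenceScore": 0.7, "mtProducer": "ProducerA", "mtEngine": "engine-1"}
)
no_score = missing_provenance({"mtProducer": "ProducerA"})
```

Checking the resolved values rather than the local markup sidesteps the
question of where the values have to be declared.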

And on the following:
>
> > I do not understand this part at all. MT candidate translations
> > are always 100% matches in the terms of TM matching.
> > The self-reported confidence expresses what might be the
> > chance that the 100% match is accurate/usable.. I do not
> > think we need a combined value here. And this is also a
> > reason why XLIFF would need a separate mechanism for
> > reporting the confidence, we could not overload the normal
> > match rate..
>
> I guess the point I was making was that Bing doesn't provide 0-100%
> confidence score. So if we use this as an example we should explain how we
> get it. Or use another example.
>

I support David's comment on this: a single figure is all that should be
used. A percentage may be the most apt scale to use here (most existing
mtConfidence scores used in practice can be represented as a percentage).
How that figure is derived should not be represented by the metadata, in my
opinion (i.e. we should not offer, say, a list of multiple scores that make
up the mtConfidenceScore). If needed, how this score was derived is a
characteristic of the MT engine used, which hopefully can be reflected in
the mtProducer and mtEngine attributes.

-- 
Dr. Declan Groves
Research Integration Officer
Centre for Next Generation Localisation (CNGL)
Dublin City University

email: dgroves@computing.dcu.ie
phone: +353 (0)1 700 6906

Received on Thursday, 9 August 2012 13:38:18 UTC