Re: [Issue-41, issue-42, Action-190, action-194] Draft a section about mtConfidence, based on the discussion from Tadej Stajner on 2012-08-23 (public-multilingualweb-lt@w3.org from August 2012)

From: Tadej Stajner <tadej.stajner@ijs.si>
Date: Thu, 23 Aug 2012 12:58:33 +0200
To: public-multilingualweb-lt@w3.org
Message-ID: <50360CD9.8000202@ijs.si>
Hi,
following up on the idea of consolidating textAnalysisAnnotation with 
something else, like qualityReviewAgent, or provenance: intuitively, it 
should fit nicely, but I would point out that textAnalysisAnnotation 
talks about other annotations in the document ('this 
<its:disambiguation> was produced by that tool'), not the document's 
content ('the quality of that translation is good'). In a sense, it's 
meta-metadata. Is this difference in targets an issue for consolidation?

-- Tadej

On 8/23/2012 9:41 AM, Felix Sasaki wrote:
> Hi Dave, we discussed this on the call during your absence. The 
> general opinion was that the information needed for mtconfidence, 
> quality and disambiguation is very similar and very specific. I had a 
> brief look at the drafts for the three data categories and came to 
> that conclusion, hence the issue-42 (drafted before the call).
>
> We also discussed whether this would be a separate data category, or 
> whether we need to interrelate data categories. Instead of going this 
> path the rough consensus was that it is OK if the three data 
> categories convey the same information - we should just try to 
> harmonize the description of aspects like score or tool 
> identification. Hence my action-194 related to issue-42, to come up 
> with such a harmonization proposal. I am not sure yet if I'll get to 
> it before the call.
>
> Best,
>
> Felix
>
> On Thursday, 23 August 2012, Dave Lewis wrote:
>
>     Felix,
>     I've only now got to your post on ISSUE-42 :
>     http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0149.html
>
>     I think the combination of the mtconfidence score and
>     translationAgent that I suggest below amounts to pretty much the
>     same thing, just arrived at via a different route. Was that the
>     direction you were heading in?
>
>     The translationReviewAgent could work similarly with quality, and
>     we could add a sourceReviewAgent or terminologyReviewAgent, or
>     generalise translationReviewAgent or qualityReviewAgent to address
>     textAnalysisAnnotation.
>
>     One point here is that, as different data categories are separately
>     conformant, will specifications of how they are used in
>     combination essentially have to be non-normative, or would we need
>     a distinct normative section on data categories in combination?
>
>     cheers,
>     Dave
>
>     On 23/08/2012 01:56, Dave Lewis wrote:
>
>         Yves, David,
>         Apologies for coming to this thread a bit late. You've already
>         pointed out that the score needs to be mostly local, i.e. per
>         segment as passed to an MT service, while the definition of
>         providers/engine would be more likely global, i.e. the same
>         engine would be used for most segments in a document. We also
>         have distinct use cases where only the score is relevant or
>         where the score and the service is needed. So it seems that
>         two data categories would suit, one for score and one for
>         identifying the engine.
>
>         We do however already have a way of identifying an MT service that
>         has been used on a document or its segments, in the form of
>         translationAgent (see call for consensus
>         http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0256.html).
>
>         I propose therefore that translationAgent, used in conjunction
>         with an mtConfidence score data category that had just one
>         score attribute, would cover the different use cases
>         while also supporting the existing use cases outlined for
>         translationAgent.
>
>         Note translationAgent allows multiple agents to be specified,
>         but doesn't concern itself with distinguishing the types of
>         agent, e.g. provider/organisation from software/engine, though
>         both are possible. The form of the ID or the result of
>         dereferencing it is assumed to address this, given the lack of
>         common naming schemes for organisations or engines.
>
>         I'd be happy anyway to include the example IDs from the
>         mtConfidence engine attribute in translationAgent - as these
>         are sensible ideas, and something we could address more
>         comprehensively as best practice next year.
>
>         cheers,
>         Dave
>
>
>
>         On 09/08/2012 13:56, Yves Savourel wrote:
>
>                 >The end user who does not understand this MUST NOT be
>                 >exposed to values coming from mixed engines/producers.
>                 >In other words it is OK to DISPLAY SCORE ONLY TO THE
>                 >END USER if you have ensured up the stream that they
>                 >DO come from the same producer AND engine.
>                 >Again not sure how to cut this with defaults, as the
>                 >defaults would collapse filtering.
>
>             Again all this applies only when you have translations from
>             different providers/engines for the same text. That is only
>             one part of the scenarios.
>
>             In any case, the bottom line is that making a local
>             attribute's presence required or not based on whether a
>             global one is present or not is not easily implementable.
>             It could be defined in a linked rule file for example.
>
>             What I think you are really trying to do is make sure a value is
>             defined for mtProducer and mtEngine. I don't agree that one
>             is always needed, but that is a different topic (as
>             discussed above). But if we decide one is needed, we can
>             just state that one must be defined. It doesn't make sense
>             to me to try to define how or where it should be defined:
>             the inheritance takes care of that.
>
>
>
>
>
>
Received on Thursday, 23 August 2012 10:59:02 UTC