- From: Felix Sasaki <fsasaki@w3.org>
- Date: Thu, 23 Aug 2012 13:45:22 +0200
- To: Tadej Stajner <tadej.stajner@ijs.si>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czqGNpFs3jxPMcON4yO-fxhvnr1T=ddkBOGiO-u+mGtBqw@mail.gmail.com>
That is an issue, indeed. I assume that the consolidation will not lead to
100% identical definitions. For mtconfidence and taa, I see more similarity.

Felix

2012/8/23 Tadej Stajner <tadej.stajner@ijs.si>

> Hi,
> following up on the idea of consolidating textAnalysisAnnotation with
> something else, like qualityReviewAgent, or provenance: intuitively, it
> should fit nicely, but I would point out that textAnalysisAnnotation talks
> about other annotations in the document ('this <its:disambiguation> was
> produced by that tool'), not the document's content ('the quality of that
> translation is good'). In a sense, it's meta-metadata. Is this difference in
> targets an issue for consolidation?
>
> -- Tadej
>
>
> On 8/23/2012 9:41 AM, Felix Sasaki wrote:
>
> Hi Dave, we discussed this on the call during your absence. The general
> opinion was that the information needed for mtconfidence, quality and
> disambiguation is very similar and very specific. I had a brief look at the
> drafts for the three data categories and came to that conclusion, hence the
> issue-42 (drafted before the call).
>
> We also discussed whether this would be a separate data category, or
> whether we need to interrelate data categories. Instead of going this path
> the rough consensus was that it is OK if the three data categories convey
> the same information - we should just try to harmonize the description of
> aspects like score or tool identification. Hence my action-194 related to
> issue-42, to come up with such a harmonization proposal. I am not sure yet
> if I'll get to it before the call.
>
> Best,
>
> Felix
>
> On Thursday, 23 August 2012, Dave Lewis wrote:
>
> Felix,
>> I've only now got to your post on ISSUE-42:
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0149.html
>>
>> I think the combination of mtconfidence score and translationAgent that I
>> suggest below amounts to pretty much the same thing, just arrived at via
>> a different route.
>> Was that the direction you were heading in?
>>
>> The translationReviewAgent could work similarly with quality, and we
>> could add a sourceReviewAgent or terminologyReviewAgent, or generalise
>> translationReviewAgent or qualityReviewAgent to address
>> textAnalysisAnnotation.
>>
>> One point here: since different data categories are separately
>> conformant, will specifications of how they are used in combination
>> essentially have to be non-normative, or would we need a distinct normative
>> "data categories in combination" section?
>>
>> cheers,
>> Dave
>>
>> On 23/08/2012 01:56, Dave Lewis wrote:
>>
>>> Yves, David,
>>> Apologies for coming to this thread a bit late. You've already pointed out
>>> that the score needs to be mostly local, i.e. per segment as passed to an
>>> MT service, while the definition of providers/engine would be more likely
>>> global, i.e. the same engine would be used for most segments in a document.
>>> We also have distinct use cases where only the score is relevant or where
>>> the score and the service are needed. So it seems that two data categories
>>> would suit, one for score and one for identifying the engine.
>>>
>>> We do however already have a way of identifying an MT service that has
>>> been used on a document or its segments, in the form of translationAgent
>>> (see call for consensus
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0256.html
>>> ).
>>>
>>> I propose therefore that translationAgent used in conjunction with an
>>> mtConfidence score data category that had just one score attribute would
>>> cover the different use cases while also supporting the existing
>>> use cases outlined for translationAgent.
>>>
>>> Note translationAgent allows multiple agents to be specified, but
>>> doesn't concern itself with distinguishing the types of agent, e.g.
>>> provider/organisation from software/engine, though both are possible.
>>> The form of the ID or the result of dereferencing it is assumed to address
>>> this, given the lack of common naming schemes for organisations or engines.
>>>
>>> I'd be happy anyway to include the example IDs from the mtConfidence
>>> engine attribute in translationAgent - as these are sensible ideas, and
>>> something we could address more comprehensively as best practice next year.
>>>
>>> cheers,
>>> Dave
>>>
>>>
>>> On 09/08/2012 13:56, Yves Savourel wrote:
>>>
>>>>> The end user who does not understand this MUST NOT be exposed to values
>>>>> coming from mixed engines/producers.
>>>>> In other words it is OK to DISPLAY SCORE ONLY TO THE END USER
>>>>> if you have ensured upstream that they DO come from the same
>>>>> producer AND engine.
>>>>> Again not sure how to cut this with defaults, as the defaults would
>>>>> collapse filtering.
>>>>>
>>>> Again all this applies only when you have translations from different
>>>> providers/engines for the same text. That is only one part of the
>>>> scenarios.
>>>>
>>>> In any case, the bottom line is that making a local attribute's
>>>> presence required or not based on whether a global one is present or not
>>>> is not easily implementable. It could be defined in a linked rule file,
>>>> for example.
>>>>
>>>> What I think you are really trying to do is make sure a value is defined
>>>> for mtProducer and mtEngine. I don't agree that one is always needed, but
>>>> that is a different topic (as discussed above). But if we decide one is
>>>> needed, we can just state that one must be defined. It doesn't make sense
>>>> to me to try to define how or where it should be defined: the inheritance
>>>> takes care of that.
>>>>

-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Thursday, 23 August 2012 11:45:51 UTC