- From: Dave Lewis <dave.lewis@cs.tcd.ie>
- Date: Wed, 25 Jul 2012 19:00:59 +0100
- To: Multilingual Web LT Public List <public-multilingualweb-lt@w3.org>
- Message-ID: <5010345B.5090108@cs.tcd.ie>
** Apologies - Please ignore previous post which was sent unfinished - here's the completed version.

Hi All,

One existing input to this topic is Pedro's post:
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0050.html

This identified some parameters related to post-editing:
* UTS Ratings (Utility, Time and Sentiment)
* Utility (relative importance of the functionality of the translated content).
* Delivery Time (speed with which the translation is required).
* Sentiment (importance on brand image).
* Expiration level.

For me these all fell into the class of localization job parameters similar to ISO TS 11669. We had agreed that these should not fall under ITS 2.0 but should be left for now to be driven by Linport and ISO, and perhaps revisited with them once we get into best practice mode next year, as summed up by Arle:
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0030.html

I dug around for other activities on post-editing. One possible requirement is MT confidence scores, which could be important metadata to convey from an MT service to a posteditor using a CAT tool. But that probably makes it more of a target for XLIFF than ITS - David, Yves, I'm not sure if that is on the agenda there?

One specialization of this that might impact content markup in a way relevant to ITS is sub-segment confidence scores, which are readily extracted from SMT engines like Moses (I've seen presentations from AsiaOnline that displayed this for posteditors). Again, though, that still might be more appropriate for XLIFF, especially the inline markup group?

As a general point, there are three new FP7 projects I know of in the broad area of PEMT, addressing issues such as subsegment postediting or how to convey confidence scores to users/crowdsource translators, as well as work in CNGL.
See:
ACCEPT: http://www.accept.unige.ch/index.html
CASMACAT: http://www.casmacat.eu/
MATECAT: http://www.matecat.com/matecat/the-project/

This indicates to me that this is still an area of active research and therefore not stable enough to consider as ITS requirements. Declan may well have some further views here.

On balance, therefore, I don't see any compelling requirements for fresh postediting data categories specifically coming from the WG. We could possibly add an optional 'confidence-score' attribute to the local translationAgent data category, but I'm not sure we'd have a real use case for this yet, since we don't know how to really interpret such a score, apart from as a local confidence differentiator.

So I recommend taking no action for now and closing ISSUE-26.

cheers,
Dave

p.s. thanks to Johann Roturier of Symantec for kindly sharing his knowledge in this area
Received on Wednesday, 25 July 2012 18:01:05 UTC