- From: Serge Gladkoff <serge.gladkoff@gmail.com>
- Date: Wed, 11 Jun 2014 04:19:02 -0400
- To: "'Dave Lewis'" <dave.lewis@cs.tcd.ie>, "Felix Sasaki" <fsasaki@w3.org>, "Arle Lommel" <arle.lommel@dfki.de>
- Cc: <public-i18n-its-ig@w3.org>, "Hans Uszkoreit" <hans.uszkoreit@dfki.de>, "Aljoscha Burchardt" <aljoscha.burchardt@dfki.de>, "Kim Harris" <kim_harris@textform.com>, "Alan Melby" <alan.melby@gmail.com>, "Dr. David Filip" <David.Filip@ul.ie>, "Yves Savourel" <ysavourel@enlaso.com>, "Phil Ritchie" <philr@vistatec.ie>, "Renat Bikmatov (Logrus.Net)" <renat.bikmatov@logrus.net>, "Andrey Kirpikov (Logrus.Net)" <andreyki@logrus.net>
- Message-ID: <005001cf854d$ce19d850$6a4d88f0$@gmail.com>
Dear Colleagues,

I am cc-ing my colleagues here purely for information purposes.

(Forgive me for a very concrete and practical comment, which is NOT a self-marketing pitch, but I think that I MUST mention here that Logrus has been engaged in LQA services for quite a number of years now, actually rendering them to several signature clients before ITS/MQM/DQF were even conceived.)

Our experience is clear: there can be and should be one universal LQA *methodology* defined on a higher level of abstraction, but every client needs its own *implementation of a specific LQA METHOD*.

This means that MQM is a higher level of abstraction - a methodology - but every client will need to define its own implementation of a metric. We know for sure that there are NO two clients with the same requirements for the LQA report, LQA categories, etc. For some of them 4 categories are enough, others need 18, etc.

BUT - what makes things even worse - ITS lies even lower: its granular categories are on the -1 level of abstraction. ITS is a fixed number of fields for specific metadata. And what makes things even more complicated is that ITS fields are made a bit flexible, so there are a number of ways they can be used for specific METRICS. BUT - not flexible enough: there are no fully customizable fields (if I am not mistaken).

So it is not quite clear, actually, whether ITS can be used, and is enough, for any concrete client-defined metric. I hope it is enough, but this should be verified with a practical implementation, AND I am sure that there are at least two ways of doing such an implementation with ITS tags.

So, speaking about the relationship between MQM and ITS, they are on quite different levels of abstraction, namely:

A. HIGHEST LEVEL 1 = METHODOLOGY: MQM - a metrics methodology that allows one to define ALL possible client-specific metrics

B. CONCRETE METHOD LEVEL 0 = Client-related LQA Metrics Implementation, derived from MQM, but with client-specific requirements, dimensions, thresholds, etc.

C. METADATA IMPLEMENTATION LEVEL -1 = Specific implementation of Client-related LQA metrics in ITS metadata tags (and there may be several ways of doing the -1 implementation)

It's a bit like network protocols - ITS is a very low level, granular IP protocol so to speak, while LEVEL 0 is Ethernet so to speak, and Level 1 is the network connection on your computer, the highest level of abstraction ("is there a network connection?").

BUT - as a result of this hierarchy - there is NO ONE-TO-ONE relationship between MQM and ITS, and it CANNOT AND SHOULD NOT EXIST.

We in Logrus believe that:

- ITS should probably be updated with that hierarchy of methodology in mind. ITS should carry information about the concrete LEVEL 0 metrics AND the LEVEL -1 implementation, otherwise the raw ITS will not carry information about what kind of client-specific metrics it actually implements. If this is not specified, the ITS metadata may be interpreted in various ways. I can give you examples, but that would make this email far too long :).

- For practical client-related implementation, ITS should go hand in hand with: (a) a specific implementation of Client-related MQM-conformant metrics; (b) a specific implementation of these Client-related metrics with ITS.

Without (a) and (b) there is no way to calculate an actual quality rating for specific content even if there is fully defined and loaded ITS metadata, AND THIS IS NOT A BUG, IT'S A FEATURE :). An actual client-related metric may not even be scalar, it can be three-dimensional, etc., etc.
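The point that ITS metadata alone does not determine a quality rating can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Annotation` and `ClientMetric` classes, the two client metrics, their category names and weights, and the simple weighted-penalty formula are hypothetical and are not taken from MQM, ITS 2.0, or any Logrus tooling. Only the issue type names ("mistranslation", "terminology", "whitespace") and the 0-100 severity range come from ITS 2.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One ITS-style localization quality issue found in the content."""
    its_type: str    # an ITS 2.0 issue type, e.g. "mistranslation"
    severity: float  # ITS 2.0 severity, from 0 to 100


@dataclass(frozen=True)
class ClientMetric:
    """LEVEL 0: a hypothetical client-specific metric."""
    name: str
    weights: dict          # client category -> penalty weight (assumed)
    its_to_category: dict  # LEVEL -1: ITS issue type -> client category


def penalty(annotations, metric):
    """Weighted penalty under one client metric (an assumed formula)."""
    total = 0.0
    for a in annotations:
        category = metric.its_to_category.get(a.its_type)
        if category is None:
            continue  # this client does not track that issue type
        total += metric.weights[category] * a.severity / 100.0
    return total


# The same ITS annotations, as they might arrive with the content.
annotations = [
    Annotation("mistranslation", 80.0),
    Annotation("terminology", 50.0),
    Annotation("whitespace", 10.0),
]

# Two hypothetical clients with different categories and weights.
client_a = ClientMetric(
    name="client-a",
    weights={"accuracy": 10.0, "terminology": 5.0},
    its_to_category={"mistranslation": "accuracy",
                     "terminology": "terminology"},
)
client_b = ClientMetric(
    name="client-b",
    weights={"language": 2.0, "formatting": 1.0},
    its_to_category={"mistranslation": "language",
                     "terminology": "language",
                     "whitespace": "formatting"},
)

score_a = penalty(annotations, client_a)
score_b = penalty(annotations, client_b)
print(client_a.name, score_a)
print(client_b.name, score_b)
```

The identical annotations produce different ratings under the two metrics, which is why the sketch needs both the metric definition (a) and its mapping to ITS (b) before it can compute anything.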
I think that we need concrete implementations here to illustrate all of this, with:

(i) a fully defined client-specific LQA metric, which is conformant to MQM;
(ii) fully annotated ITS content;
(iii) a concrete ITS implementation of the (i) client-specific MQM-conformant LQA metric;
(iv) software and/or manual worksheets that illustrate and implement all of that.

That's how MQM is all-embracing for all clients.

Re: ITS call - I will try to participate in about 4 hours, even though I am currently on a "night shift" in Philadelphia and may miss the call.

Re: implementations - we will definitely have an MQM/ITS SIG in GALA CRISP anyway, since the devil is in the details, and what actually matters for the industry are concrete, specific LQA implementations for particular clients and industries.

Regards,
Serge Gladkoff
Logrus International
GALA CRISP Lead

From: Dave Lewis [mailto:dave.lewis@cs.tcd.ie]
Sent: Tuesday, June 10, 2014 8:15 AM
To: Felix Sasaki; Arle Lommel
Cc: public-i18n-its-ig@w3.org; Hans Uszkoreit; Aljoscha Burchardt; Kim Harris; Alan Melby; Dr. David Filip; Yves Savourel; Phil Ritchie
Subject: Re: Relation between MQM and ITS localization quality - rethought

Hi Arle, all,

a few points on this very interesting thread:

1) +1 on Felix's question about where this work continues. The loss of work and resources from the web as EC projects wrap up, people move on, domain names lapse, etc. was a big issue at LREC the week before last. Linking this to ongoing work at OASIS or W3C helps both with consensus building and with persistent archiving of output.

2) It's great that ITS 2.0 is recommended as the transport mechanism for MQM:
http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#mqm_markup
and that you have a mapping defined:
http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#relationship-its
but I'm still unclear about the role of the mqm namespace tags for annotation.
In the example you have in section 7.2 the mqm data attributes seem to convey the same info as the ITS LQI ones, especially if you used its:locQualityIssueProfileRef to identify the MQM types as you suggest and if the severity mapping is defined. While I can see the duplication is helpful from a flexibility point of view, it is a barrier to interoperability. Perhaps the ITS IG could help with further ITS examples that could also be used for testing?

3) Had you considered using RDF for the MQM declaration language, rather than the XML format you propose? Some advantages I see are:

* OWL subclassing gives you a more formal method for defining new metrics in terms of existing ones, including defining composite metrics through multiple superclasses (though you might want to avoid that complexity).
* It can handle display labels nicely, including in multiple languages (as already used in vocabularies like DCAT).
* There's existing support for versioning elements of definitions - so you don't have to build that from the ground up.
* There's good support for documenting mappings, starting with owl:sameAs to handle the mappings to SAE J2450, DQF etc.
* We are working with lots of linked data experts in the LD4LT and BPMLOD community groups who could advise on best practice.

Let me know what you think.

cheers,
Dave

On 06/06/2014 08:25, Felix Sasaki wrote:

Hi Arle and all,

thanks a lot for the clarification, Arle, that helps me and others a lot I'm sure. One concrete proposal about two points you make:

"ITS 2.0 is the mechanism for supporting blind interchange of quality data and for making MQM data accessible to processes that may not be MQM-aware."

"I think Ocelot would be a prime candidate for MQM, actually, while the benefit for CheckMate is less clear."

Ocelot currently is not MQM-aware. There is a workflow to use Okapi Rainbow (not CheckMate) to generate ITS2 localization quality issue metadata and then review it in Ocelot. See e.g.
the XLIFF that is generated via Rainbow on slide 41 of http://www.w3.org/International/its/wiki/images/7/76/Icu37-sasaki-lieske.pdf , which can be used out of the box for quality issue review and XLIFF editing in Ocelot.

It would now be great to implement the same workflow (generating in Okapi (Rainbow) and using in Ocelot) for MQM to see how that relates to ITS. That is an implementation-driven approach and it could help a lot to get things clear (e.g. Ocelot currently visualizes ITS2 LQI metadata - if MQM information is available would it visualize both, just one, etc.) both for users and implementers. If you think this makes sense let me know, I'm happy to help.

A side note: at the FEISGILTT Localization World event this week the XLIFF TC discussed future features of XLIFF. 2.0 is done, so what comes next? ITS2 is a hot candidate, but the XLIFF TC discussed in public at the FEISGILTT event using the following strategy: only feature proposals that prove interoperable implementation support will be adopted in a new version of XLIFF. That is another argument for trying to make these things clear by working on implementations.

Finally, a question which is probably relevant for many on this list: is there a way to follow the MQM development, e.g. a public list somewhere to subscribe to or a wiki to contribute to? That way there'd be no question about what the latest state of the relation to ITS2 etc. is; people could just have a look and comment. (I saw that the MQM draft says "send feedback to info@qt21.eu", so my question is really about a way to subscribe to a list / be always up to date / follow previous discussions etc.)

Best,
Felix

On 05.06.2014 at 09:36, Arle Lommel <arle.lommel@dfki.de> wrote:

Hi all,

I had an exchange with Felix in the past 24 hours in light of discussion about MQM and the ITS localization quality issue types.
Please note that the views expressed here are my personal ones and are subject to discussion and debate, so please don't take these statements as unalterable dicta from on high (as if any of you who know me would ever do that).

The exchange with Felix has led me to reevaluate some statements I made about a year ago, here:
http://lists.w3.org/Archives/Public/public-i18n-its-ig/2013Jul/0001.html

At that time what we said was that MQM would be a superset of the ITS 2.0 issue types and that a future version of ITS might want to adopt MQM as a replacement for/extension of the current ITS list of issue types. In other words, MQM could be seen as the ITS 3.x (or something like that) set of localization quality issue types. There was at least some degree of interest from Yves and Phil, among others, in possibly implementing MQM as an extension to ITS.

However, I now realize that not only is MQM not a simple replacement of the ITS issue types, it would be a mistake to try to make it into one. They stand in a clear (and close) relationship to one another, but there are some important differences. I'll try to outline some of these below:

* ITS is really a "blind" interchange mechanism with a fairly low level of granularity. Most quality assessment metrics will need to abstract their categories somewhat to work with ITS. For example, ITS has the category mistranslation, but many systems distinguish between three or four kinds of mistranslations, so when they export to ITS they give up some information in order to make their information exchangeable with other systems. ITS is thus a single set of issues to be used as a whole for blind interchange purposes. If you work with ITS 2.0 you know what issues you can encounter and you won't see different sets each time.
* MQM, by contrast, is a declarative mechanism (with support for custom extensions) that relies on a shared vocabulary. It is intended to be used in subsets rather than as a monolithic set of issue types.
As a result, MQM does not support blind interchange. When two systems implement different MQM-based metrics, their results are comparable (in the sense that they can be compared with each other and the similarities and differences are clear), but not interchangeable (i.e., you cannot interpret the relevance of an atomic MQM category without knowing what metric it is used in).
* Another way of putting the two points is that MQM provides a way to define an arbitrary translation quality assessment metric, including various levels of abstraction, while ITS is a mechanism that abstracts away from arbitrary metrics.
* While, in principle, MQM could be used for blind interchange by using the full MQM issue set, such blind interchange would not be useful since it would still leave the problem of what to do with different metrics that are not isomorphic. It is thus a very poor mechanism for blind interchange. ITS does not have this problem because it is an invariant set at a higher level of abstraction.
* MQM actually has no theoretically privileged perspective with ITS 2.0, but ITS 2.0 has a privileged perspective with MQM. In other words, from the ITS perspective, MQM-compliant metrics are just one of many possible sources that can generate ITS markup. However, from the MQM perspective, ITS 2.0 is the mechanism for supporting blind interchange of quality data and for making MQM data accessible to processes that may not be MQM-aware.
* MQM does many things that ITS does not, such as providing ways to define metrics, selection criteria, extension points, specific MQM additive inline markup, and additional quality methods (e.g., holistic approaches). These things should NOT be added to ITS 2.0, because they would make it more complex and work against the goal of providing a simple and robust method for declaring high-level quality issue types.

While this may sound like I am advocating a complete divorce of MQM and ITS, I am not.
These things stand in a relationship to each other, and to the extent that they can be cleanly mapped between each other and share common names and other items, they should. If ITS 2.0 is updated in the future, MQM should make sure that it reflects those changes. Similarly, an update to ITS 2.0 might want to add some items, since many MQM issues currently map to "other" in ITS: if those issues are deemed to be important for broader interchange purposes, ITS 2.0 might grow a bit. So, even though I now do not see MQM and ITS becoming one thing, I would expect them to continue to inform each other and for there to be collaboration in the future if ITS is revised.

I've tried to make some of this relationship clear here:
http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#relationship-its

That link also provides normative mappings between ITS and MQM. (They would be normative only for MQM tools, not for ITS tools, so there is no assumption that an ITS-compliant tool would need to add support for MQM.) Using those mappings would ensure that MQM-compliant tools support ITS consistently and predictably.

So where does that leave tools like Yves' and Phil's? Should they support MQM as well as ITS? I would argue that they might want to, because MQM provides a different (and important) functionality from ITS. I think Ocelot would be a prime candidate for MQM, actually, while the benefit for CheckMate is less clear. I'd like to discuss that topic, but not in this email (which is already too long). So I'll save that for later.

I look forward to feedback. If Felix wants to jump in with anything else from our discussion that he thinks should be added, I'd encourage him to do so.

Best,
Arle
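The information loss described in the thread, where several fine-grained categories collapse into one ITS issue type and unmapped ones fall back to "other", can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fine-grained category names, the `TO_ITS` table, the `annotate` helper and the profile URL are all hypothetical and are not the normative MQM-to-ITS mapping. The ITS namespace URI and the `locQualityIssueType`, `locQualityIssueSeverity` and `locQualityIssueProfileRef` attribute names are those of ITS 2.0.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ET.register_namespace("its", ITS_NS)

# Hypothetical fine-grained categories of some metric -> ITS 2.0 issue type.
# This table is an assumption for illustration, not the normative mapping.
TO_ITS = {
    "false-friend": "mistranslation",
    "overly-literal": "mistranslation",
    "wrong-number": "mistranslation",
    "missing-sentence": "omission",
}


def to_its_type(category):
    """Collapse a fine-grained category; unmapped ones become 'other'."""
    return TO_ITS.get(category, "other")


def annotate(text, category, severity, profile_ref):
    """Wrap text in a span carrying ITS 2.0 local quality issue markup."""
    span = ET.Element("span")
    span.text = text
    span.set(f"{{{ITS_NS}}}locQualityIssueType", to_its_type(category))
    span.set(f"{{{ITS_NS}}}locQualityIssueSeverity", str(severity))
    # The profile reference tells a consumer which metric produced the data.
    span.set(f"{{{ITS_NS}}}locQualityIssueProfileRef", profile_ref)
    return span


PROFILE = "http://example.com/metrics/client-a"  # hypothetical profile URL

a = annotate("actually", "false-friend", 70, PROFILE)
b = annotate("word for word", "overly-literal", 70, PROFILE)
c = annotate("Acme", "client-specific-branding", 30, PROFILE)

markup = ET.tostring(a, encoding="unicode")
print(markup)
```

After export, `a` and `b` carry the same ITS issue type, so a consumer that sees only the ITS markup cannot tell the two original categories apart. The profile reference is the one hook left for a metric-aware tool to recover the context.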
Received on Wednesday, 11 June 2014 08:19:40 UTC