Relation between MQM and ITS localization quality - rethought

Hi all,

I had an exchange with Felix in the past 24 hours in light of the discussion about MQM and the ITS localization quality issue types. Please note that the views expressed here are my personal ones and are subject to discussion and debate, so please don’t take these statements as unalterable dicta from on high (as if any of you who know me would ever do that…).

The exchange with Felix has led me to reevaluate some statements I made about a year ago, here:

http://lists.w3.org/Archives/Public/public-i18n-its-ig/2013Jul/0001.html

At that time what we said was that MQM would be a superset of the ITS 2.0 issue types and that a future version of ITS might want to adopt MQM as a replacement for/extension of the current ITS list of issue types. In other words, MQM could be seen as the ITS 3.x (or something like that) set of localization quality issue types. There was at least some degree of interest from Yves and Phil, among others, in possibly implementing MQM as an extension to ITS.

However, I now realize that not only is MQM not a simple replacement of the ITS issue types, it would be a mistake to try to make it into one. They stand in a clear (and close) relationship to one another, but there are some important differences. I’ll try to outline some of these below:

ITS is really a “blind” interchange mechanism with a fairly low level of granularity. Most quality assessment metrics will need to abstract their categories somewhat to work with ITS. For example, ITS has the category mistranslation, but many systems distinguish between three or four kinds of mistranslations, so when they export to ITS they give up some information in order to make their information exchangeable with other systems. ITS is thus a single set of issues to be used as a whole for blind interchange purposes. If you work with ITS 2.0 you know what issues you can encounter and you won’t see different sets each time.
MQM, by contrast, is a declarative mechanism (with support for custom extensions) that relies on a shared vocabulary. It is intended to be used in subsets rather than as a monolithic set of issue types. As a result MQM does not support blind interchange. When two systems implement different MQM-based metrics, their results are comparable (in the sense that they can be compared with each other and the similarities and differences are clear), but not interchangeable (i.e., you cannot interpret the relevance of an atomic MQM category without knowing what metric it is used in).
Another way of putting the two points is that MQM provides a way to define an arbitrary translation quality assessment metric, including various levels of abstraction, while ITS is a mechanism that abstracts away from arbitrary metrics.
While, in principle, MQM could be used for blind interchange by using the full MQM issue set, such blind interchange would not be useful since it would still leave the problem of what to do with different metrics that are not isomorphous. It is thus a very poor mechanism for blind interchange. ITS does not have this problem because it is an invariant set at a higher level of abstraction.
MQM has no theoretically privileged position from the perspective of ITS 2.0, but ITS 2.0 has a privileged position from the perspective of MQM. In other words, from the ITS perspective, MQM-compliant metrics are just one of many possible sources that can generate ITS markup. However, from the MQM perspective, ITS 2.0 is the mechanism for supporting blind interchange of quality data and for making MQM data accessible to processes that may not be MQM-aware.
MQM does many things that ITS does not, such as providing ways to define metrics, selection criteria, extension points, specific MQM additive inline markup, and additional quality methods (e.g., holistic approaches). These things should NOT be added to ITS 2.0, because they would make it more complex and work against the goal of providing a simple and robust method for declaring high-level quality issue types.
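
To make the "giving up information on export" point concrete, here is a minimal sketch in Python. The fine-grained issue names on the left and the mapping itself are my own illustrative assumptions, not the normative MQM-to-ITS mapping; only the values on the right are actual ITS 2.0 localization quality issue types.

```python
# Hypothetical fine-grained issue names from some MQM-style metric, mapped to
# ITS 2.0 localization quality issue types. The keys and the mapping are
# illustrative assumptions, not the normative mapping.
TO_ITS = {
    "overly-literal": "mistranslation",
    "false-friend": "mistranslation",
    "unit-conversion": "mistranslation",
    "spelling": "misspelling",
    "company-style": "style",
}


def to_its(issue: str) -> str:
    """Return the ITS 2.0 issue type for a fine-grained issue name.

    Anything without a mapping falls back to the ITS catch-all "other".
    """
    return TO_ITS.get(issue, "other")


def export(issues):
    """Convert a list of fine-grained issue names to ITS 2.0 issue types."""
    return [to_its(issue) for issue in issues]
```

Three different kinds of mistranslation all come out as mistranslation, so a receiving tool can process the result without knowing the metric, but it cannot recover which kind was meant. That is the trade-off of blind interchange.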

While this may sound like I am advocating a complete divorce of MQM and ITS, I am not. These things stand in a relationship to each other, and to the extent that they can be cleanly mapped to each other and share common names and other items, they should. If ITS 2.0 is updated in the future, MQM should make sure that it reflects those changes. Similarly, an update to ITS 2.0 might want to add some items, since many MQM issues currently map to the catch-all type “other” in ITS: if those issues are deemed to be important for broader interchange purposes, ITS 2.0 might grow a bit. So, even though I no longer see MQM and ITS becoming one thing, I would expect them to continue to inform each other and for there to be collaboration in the future if ITS is revised.

I’ve tried to make some of this relationship clear here:

http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#relationship-its

That link also provides normative mappings between ITS and MQM. (They would be normative only for MQM tools, not for ITS tools, so there is no assumption that an ITS-compliant tool would need to add support for MQM.) Using those mappings would ensure that MQM-compliant tools support ITS consistently and predictably.
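
As a rough sketch of what an MQM-aware tool might emit for ITS-only consumers, the Python below writes ITS 2.0 HTML markup for one issue. The attribute names are the ITS 2.0 HTML5 ones; the function name, the idea of carrying the fine-grained name in the comment, and any profile URL you pass in are my assumptions for illustration, not something the MQM definition prescribes.

```python
from html import escape


def annotate(text, its_type, comment=None, profile_ref=None):
    """Wrap text in a span carrying ITS 2.0 localization quality issue markup.

    its_type    -- an ITS 2.0 issue type, e.g. "mistranslation"
    comment     -- optional free text (assumption: used here to keep the
                   original fine-grained issue name readable by humans)
    profile_ref -- optional IRI of the document describing the metric
                   (hypothetical; supply your own)
    """
    attrs = [("its-loc-quality-issue-type", its_type)]
    if comment:
        attrs.append(("its-loc-quality-issue-comment", comment))
    if profile_ref:
        attrs.append(("its-loc-quality-issue-profile-ref", profile_ref))
    attr_text = " ".join(f'{name}="{escape(value, quote=True)}"' for name, value in attrs)
    return f"<span {attr_text}>{escape(text)}</span>"
```

An ITS-only tool would act on the issue type and could ignore the rest, while an MQM-aware tool could follow the profile reference to find out which metric produced the annotation.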

So where does that leave tools like Yves’ and Phil’s? Should they support MQM as well as ITS? I would argue that they might want to because MQM provides a different (and important) functionality from ITS. I think Ocelot would be a prime candidate for MQM, actually, while the benefit for CheckMate is less clear. I’d like to discuss that topic, but not in this email (which is already too long). So I’ll save that for later.

I look forward to feedback. If Felix wants to jump in with anything else from our discussion that he thinks should be added, I’d encourage him to do so.

Best,

Arle

Received on Thursday, 5 June 2014 07:37:21 UTC