Re: Relation between MQM and ITS localization quality - rethought from Dave Lewis on 2014-06-10 (public-i18n-its-ig@w3.org from June 2014)

From: Dave Lewis <dave.lewis@cs.tcd.ie>
Date: Tue, 10 Jun 2014 13:14:57 +0100
To: Felix Sasaki <fsasaki@w3.org>, Arle Lommel <arle.lommel@dfki.de>
CC: public-i18n-its-ig@w3.org, Hans Uszkoreit <hans.uszkoreit@dfki.de>, Aljoscha Burchardt <aljoscha.burchardt@dfki.de>, Kim Harris <kim_harris@textform.com>, Alan Melby <alan.melby@gmail.com>, "Dr. David Filip" <David.Filip@ul.ie>, Yves Savourel <ysavourel@enlaso.com>, Phil Ritchie <philr@vistatec.ie>
Message-ID: <5396F6C1.6060102@cs.tcd.ie>
Hi Arle, all,
a few points on this very interesting thread:

1) +1 on Felix's question about where this work continues. The loss of 
work and resources from the web when EC projects wrap up, people move on, 
domain names lapse, etc. was a big issue at LREC the week before last. 
Linking this to ongoing work at OASIS or W3C helps both with consensus 
building and with persistent archiving of output.

2) It's great that ITS 2.0 is recommended as the transport mechanism for MQM:
http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#mqm_markup
and you have a mapping defined:
http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#relationship-its
but I'm still unclear about the role of the mqm namespace tags for 
annotation. In the example you have in section 7.2, the mqm data 
attributes seem to convey the same info as the ITS LQI ones, especially 
if you used its:locQualityIssueProfileRef to identify the MQM types as you 
suggest and if the severity mapping is defined. While I can see the 
duplication is helpful from a flexibility point of view, it is a barrier 
to interoperability.  Perhaps the ITS IG could help with further ITS 
examples that could also be used for testing?
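
To make the duplication point concrete, here is a minimal sketch of carrying an MQM issue using ITS LQI attributes alone. The type table and the severity numbers are illustrative assumptions, not the normative mapping from the MQM definition, and the `span` host element, the `annotate` helper and the profile URL are hypothetical. Only the ITS namespace URI and the three attribute names come from ITS 2.0.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI
ET.register_namespace("its", ITS_NS)

# Illustrative only: NOT the normative MQM-to-ITS mapping.
MQM_TO_ITS_TYPE = {
    "mistranslation": "mistranslation",
    "omission": "omission",
    "terminology": "terminology",
}

# Hypothetical severity scale; ITS severity is a number from 0 to 100.
MQM_SEVERITY_TO_ITS = {"minor": 25, "major": 50, "critical": 100}


def annotate(text, mqm_type, mqm_severity, profile_ref):
    """Wrap text in an element carrying only ITS LQI attributes."""
    span = ET.Element("span")  # hypothetical host element
    span.text = text
    # MQM types without an entry in the table fall back to "other".
    its_type = MQM_TO_ITS_TYPE.get(mqm_type, "other")
    span.set(f"{{{ITS_NS}}}locQualityIssueType", its_type)
    span.set(
        f"{{{ITS_NS}}}locQualityIssueSeverity",
        str(MQM_SEVERITY_TO_ITS[mqm_severity]),
    )
    # The profile reference tells a consumer which metric the issue came from.
    span.set(f"{{{ITS_NS}}}locQualityIssueProfileRef", profile_ref)
    return span


issue = annotate("Haus", "mistranslation", "major",
                 "http://example.org/mqm-metric")  # hypothetical URL
print(ET.tostring(issue, encoding="unicode"))
```

If the profile reference and the severity mapping are both fixed, a consumer that only understands ITS gets the issue type, severity and originating metric without any mqm attributes.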

3) Have you considered using RDF for the MQM declaration language, rather 
than the XML format you propose? Some advantages I see are:

  * OWL subclassing gives you a more formal method for defining new metrics
    in terms of existing ones, including defining composite metrics
    through multiple superclasses (though you might want to avoid that
    complexity).
  * It can handle display labels nicely, including in multiple languages
    (as already used in vocabularies like DCAT).
  * There's existing support for versioning elements of definitions - so
    you don't have to build that from the ground up.
  * There's good support for documenting mapping, starting with
    owl:sameAs to handle the mappings to SAE J2450, DQF etc.
  * We are working with lots of linked data experts in the LD4LT and
    BPMLOD community groups who could advise on best practice.
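
As a rough illustration of the first, second and fourth points, the sketch below models a few triples as plain Python tuples instead of using an RDF library. Every mqm:, ex: and j2450: name is a hypothetical placeholder, not a published vocabulary term, and the helpers `ancestors` and `label` are made up for this example. The subclass relation is written as rdfs:subClassOf, which is the property OWL class hierarchies use.

```python
# Triples as (subject, predicate, object) tuples. All mqm:, ex: and j2450:
# names are hypothetical placeholders, not published vocabulary terms.
TRIPLES = {
    ("mqm:Accuracy", "rdfs:subClassOf", "mqm:Issue"),
    ("mqm:Mistranslation", "rdfs:subClassOf", "mqm:Accuracy"),
    # A custom metric extends the shared vocabulary by subclassing it.
    ("ex:FalseFriend", "rdfs:subClassOf", "mqm:Mistranslation"),
    # Display labels in more than one language.
    ("mqm:Mistranslation", "rdfs:label", '"Mistranslation"@en'),
    ("mqm:Mistranslation", "rdfs:label", '"Fehlübersetzung"@de'),
    # A mapping to another metric, using the property named in the message.
    ("mqm:Terminology", "owl:sameAs", "j2450:WrongTerm"),
}


def ancestors(cls, triples=TRIPLES):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def label(cls, lang, triples=TRIPLES):
    """Return the rdfs:label of cls in the given language, or None."""
    suffix = f"@{lang}"
    for s, p, o in triples:
        if s == cls and p == "rdfs:label" and o.endswith(suffix):
            return o[: -len(suffix)].strip('"')
    return None


print(sorted(ancestors("ex:FalseFriend")))
print(label("mqm:Mistranslation", "de"))
```

A tool that has never seen ex:FalseFriend can still walk up the hierarchy to a class it does know. For class-level mappings, owl:equivalentClass or the SKOS mapping properties are alternatives to owl:sameAs.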

Let me know what you think.

cheers,
Dave


On 06/06/2014 08:25, Felix Sasaki wrote:
> Hi Arle and all,
>
> thanks a lot for the clarification, Arle, that helps me and others a 
> lot I’m sure. One concrete proposal about two points you make:
>
> “ITS 2.0 is the mechanism for supporting blind interchange of quality 
> data and for making MQM data accessible to processes that may not be 
> MQM-aware.”
> “I think Ocelot would be a prime candidate for MQM, actually, while 
> the benefit for CheckMate is less clear.”
>
> Ocelot currently is not MQM-aware. There is a workflow to use Okapi 
> Rainbow (not CheckMate) to generate ITS2 localization quality issue 
> metadata and then review it in Ocelot. For example, the XLIFF that is 
> generated via Rainbow on slide 41 of
> http://www.w3.org/International/its/wiki/images/7/76/Icu37-sasaki-lieske.pdf
> can be used out of the box for quality issue review and XLIFF editing 
> in Ocelot. It would now be great to implement the same workflow 
> (generating in Okapi (Rainbow) and using in Ocelot) for MQM to see how 
> that relates to ITS.
>
> That is an implementation-driven approach and it could help a lot to 
> get things clear (e.g. Ocelot currently visualizes ITS2 LQI metadata - 
> if MQM information is available, would it visualize both, just one, 
> etc.?) both for users and implementers. If you think this makes sense, 
> let me know; I’m happy to help.
>
> A side note: at the FEISGILTT Localization World event this week the 
> XLIFF TC discussed future features of XLIFF. 2.0 is done, so what comes 
> next? ITS2 is a hot candidate, but the XLIFF TC discussed in public at 
> the FEISGILTT event the following strategy: only feature 
> proposals that prove interoperable implementation support will be 
> adopted in a new version of XLIFF. That is another argument to try to 
> make these things clear by working on implementations.
>
> Finally, a question which is probably relevant for many on this list: 
> is there a way to follow the MQM development, e.g. a public list 
> somewhere to subscribe to or a wiki to contribute to? In that way 
> there’d be no question about what the latest state of the relation 
> to ITS2 etc. is; people could just have a look and comment. (I saw 
> that the MQM draft says “send feedback to info@qt21.eu”, 
> so my question is really about a way to 
> subscribe to a list / be always up to date / follow previous 
> discussions etc.)
>
> Best,
>
> Felix
>
> On 05.06.2014 at 09:36, Arle Lommel <arle.lommel@dfki.de> wrote:
>
>> Hi all,
>>
>> I had an exchange with Felix in the past 24 hours in light of 
>> discussion about MQM and the ITS localization quality issue types. 
>> Please note that the views expressed here are my personal ones and 
>> are subject to discussion and debate, so please don’t take these 
>> statements as unalterable dicta from on high (as if any of you who 
>> know me would ever do that…).
>>
>> The exchange with Felix has led me to reevaluate some statements I 
>> made about a year ago, here:
>>
>> http://lists.w3.org/Archives/Public/public-i18n-its-ig/2013Jul/0001.html
>>
>> At that time what we said was that MQM would be a superset of the ITS 
>> 2.0 issue types and that a future version of ITS might want to adopt 
>> MQM as a replacement for/extension of the current ITS list of issue 
>> types. In other words, MQM could be seen as the ITS 3.x (or something 
>> like that) set of localization quality issue types. There was at 
>> least some degree of interest from Yves and Phil, among others, in 
>> possibly implementing MQM as an extension to ITS.
>>
>> However, I now realize that not only is MQM not a simple replacement 
>> of the ITS issue types, it would be a mistake to try to make it into 
>> one. They stand in a clear (and close) relationship to one another, 
>> but there are some important differences. I’ll try to outline some of 
>> these below:
>>
>>   * ITS is really a “blind” interchange mechanism with a fairly low
>>     level of granularity. Most quality assessment metrics will need
>>     to abstract their categories somewhat to work with ITS. For
>>     example, ITS has the category /mistranslation/, but many systems
>>     distinguish between three or four kinds of mistranslations, so
>>     when they export to ITS they give up some information in order to
>>     make their information exchangeable with other systems. ITS is
>>     thus a single set of issues to be used as a whole for blind
>>     interchange purposes. If you work with ITS 2.0 you know what
>>     issues you can encounter and you won’t see different sets each time.
>>   * MQM, by contrast, is a declarative mechanism (with support for
>>     custom extensions) that relies on a shared vocabulary. It is
>>     intended to be used in subsets rather than as a monolithic set of
>>     issue types. As a result MQM does /not/ support blind
>>     interchange. When two systems implement different MQM-based
>>     metrics, their results are /comparable/ (in the sense that they
>>     can be compared with each other and the similarities and
>>     differences are clear), but not /interchangeable/ (i.e., you
>>     cannot interpret the relevance of an atomic MQM category without
>>     knowing what metric it is used in).
>>   * Another way of putting the two points is that MQM provides a way
>>     to define an arbitrary translation quality assessment metric,
>>     including various levels of abstraction, while ITS is a mechanism
>>     that abstracts away from arbitrary metrics.
>>   * While, in principle, MQM could be used for blind interchange by
>>     using the full MQM issue set, such blind interchange would not be
>>     useful since it would still leave the problem of what to do with
>>     /different/ metrics that are not isomorphous. It is thus a very
>>     poor mechanism for blind interchange. ITS does not have this
>>     problem because it is an invariant set at a higher level of
>>     abstraction.
>>   * MQM actually has no theoretically privileged perspective with ITS
>>     2.0, but ITS 2.0 has a privileged perspective with MQM. In other
>>     words, from the ITS perspective, MQM-compliant metrics are just
>>     one of many possible sources that can generate ITS markup.
>>     However, from the MQM perspective, ITS 2.0 is /the/ mechanism for
>>     supporting blind interchange of quality data and for making MQM
>>     data accessible to processes that may not be MQM-aware.
>>   * MQM does many things that ITS does not, such as providing ways to
>>     define metrics, selection criteria, extension points, specific
>>     MQM additive inline markup, and additional quality methods (e.g.,
>>     holistic approaches). These things should NOT be added to ITS
>>     2.0, because they would make it more complex and work against the
>>     goal of providing a simple and robust method for declaring
>>     high-level quality issue types.
>>
>>
>> While this may sound like I am advocating a complete divorce of MQM 
>> and ITS, I am not. These things stand in a relationship to each 
>> other, and to the extent that they can be cleanly mapped between each 
>> other and share common names and other items, they should. If ITS 2.0 
>> is updated in the future, MQM should make sure that it reflects those 
>> changes. Similarly, an update to ITS 2.0 might want to add some items 
>> since many MQM issues currently map to /other/ in ITS: if those 
>> issues are deemed to be important for broader interchange purposes, 
>> ITS 2.0 might grow a bit. So, even though I now do not see MQM and 
>> ITS becoming one thing, I would expect them to continue to inform 
>> each other and for there to be collaboration in the future if ITS is 
>> revised.
>>
>> I’ve tried to make some of this relationship clear here:
>>
>> http://www.qt21.eu/mqm-definition/definition-2014-06-04.html#relationship-its
>>
>> That link also provides normative mappings between ITS and MQM. (They 
>> would be normative only for MQM tools, not for ITS tools, so there is 
>> no assumption that an ITS-compliant tool would need to add support 
>> for MQM.) Using those mappings would ensure that MQM-compliant tools 
>> support ITS consistently and predictably.
>>
>> So where does that leave tools like Yves’ and Phil’s? Should they 
>> support MQM as well as ITS? I would argue that they might want to 
>> because MQM provides a /different/ (and important) functionality from 
>> ITS. I think Ocelot would be a prime candidate for MQM, actually, 
>> while the benefit for CheckMate is less clear. I’d like to discuss 
>> that topic, but not in this email (which is already too long). So 
>> I’ll save that for later.
>>
>> I look forward to feedback. If Felix wants to jump in with anything 
>> else from our discussion that he thinks should be added, I’d 
>> encourage him to do so.
>>
>> Best,
>>
>> Arle
>
Received on Tuesday, 10 June 2014 12:10:48 UTC