Re: Potential for synergies on the implementation and "data" level: MQM and ITS 2.0

Felix, All,

This convergence is very interesting and exciting and FEISGILTT certainly 
seems to be an appropriate rendezvous for such discussions. Yes, I will be 
at Localization World Tuesday to Thursday and would be happy to talk with 
any interested parties.

Phil.





From:   Felix Sasaki <fsasaki@w3.org>
To:     "Arle Lommel (Arle.Lommel@dfki.de)" <Arle.Lommel@dfki.de>, 
Aljoscha Burchardt <aljoscha.burchardt@dfki.de>, kim_harris@textform.com, 
Cc:     public-i18n-its-ig@w3.org
Date:   07/06/2013 20:35
Subject:        Potential for synergies on the implementation and "data" 
level: MQM  and ITS 2.0



Hi Arle, Aljoscha, Kim, with CC to the W3C i18n ITS Interest Group,

there is now a great opportunity to build synergies between the 
QTLaunchPad Multidimensional Quality Metrics (MQM) and ITS 2.0.

Some background for the ITS IG members who don't know MQM: The EC-funded 
QTLaunchPad project is developing a unified and customizable, 
multidimensional framework for translation quality assessment built around 
metrics of fluency, accuracy, and end-user adequacy.

Some background for all in this thread: so far the relation between the 
MQM model and ITS 2.0 is rather general, see e.g. the description of 
next week's Localization World FEISGILTT event
http://www.localizationworld.com/lwlon2013/feisgiltt/accepted.html

"A further point of contact (of MQM) is with the ITS 2.0 specification, 
which provides a mechanism to refer to the quality expectations outlined 
in an STS and to integrate them into a standard, QTLaunchPad-compatible 
mechanism that enables quality to be addressed in any tool that 
implements ITS 2.0’s quality markup. "

By "rather general" I mean that the integration of MQM into ITS 2.0 
on a detailed, "implementation" level hasn't happened yet. Instead, some 
activities have happened in parallel, like:


1) specifying MQM types and ITS 2.0 types
http://www.w3.org/TR/its20/#lqissue-typevalues

As I understand Arle, there is an informal mapping (which is in flux), 
but no formal relation has been defined, that is, something implementable 
as an automatic conversion. Since MQM is more expressive than ITS 2.0, 
such a formal mapping would certainly involve information loss, but 
having an exact description of what is lost would be very valuable.

2) specifying a serialization of MQM and of ITS 2.0 quality issue 
markup. ITS 2.0 has a mechanism to serialize one or more localization 
quality issues for the same span of text, see
http://www.w3.org/TR/its20/#EX-locQualityIssue-local-2

As I understand Arle, for MQM there is the requirement of annotating 
potentially overlapping quality issues - this couldn't be done with ITS 
2.0 markup, that is: one cannot reuse the ITS 2.0 markup for all of MQM 
markup.

3) The ITS 2.0 specification links to an informal mapping of existing 
tools to ITS 2.0 types
http://www.w3.org/International/its/wiki/Tool_specific_mappings

As I understand Arle, MQM is working on a similar mapping, taking 
detailed feedback from LSPs into account.
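
To make point 1 a bit more concrete, here is a minimal sketch of what a 
formal, lossy mapping with an explicit "what is lost" report could look 
like. The MQM issue names on the left are hypothetical placeholders 
invented for illustration (the real MQM list is in flux); only the 
ITS 2.0 values on the right are meant to match the type values in the 
ITS 2.0 specification.

```python
# Hypothetical MQM issue names (left) mapped to ITS 2.0 type values (right).
# The MQM names are illustrative assumptions, not taken from the MQM model.
MQM_TO_ITS = {
    "accuracy/mistranslation": "mistranslation",
    "accuracy/mistranslation/false-friend": "mistranslation",
    "accuracy/omission": "omission",
    "fluency/grammar": "grammar",
    "fluency/grammar/agreement": "grammar",
}


def to_its(mqm_type):
    """Convert one MQM type; unmapped types fall back to 'uncategorized'."""
    return MQM_TO_ITS.get(mqm_type, "uncategorized")


def loss_report(mapping):
    """List ITS types that merge several MQM types.

    Each such group marks a distinction that cannot be recovered after
    converting from MQM to ITS 2.0.
    """
    groups = {}
    for mqm_type, its_type in mapping.items():
        groups.setdefault(its_type, []).append(mqm_type)
    return {its: sorted(mqm) for its, mqm in groups.items() if len(mqm) > 1}


if __name__ == "__main__":
    for its_type, merged in loss_report(MQM_TO_ITS).items():
        print(its_type, "<-", ", ".join(merged))
```

A report like this could accompany the mapping itself, so that 
implementers see exactly which MQM distinctions collapse into one 
ITS 2.0 type.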


There might be other areas; if you see them, please let me know.

Now, if we resolve 1-3 or at least describe for implementers how MQM and 
ITS 2.0 relate, we can
- avoid confusing implementers about why there are two ways to express 
localization issue information, and instead explain the differences in 
detail;
- get implementers actually to implement both MQM and ITS 2.0. ITS 2.0 
localization quality issue is currently being implemented in three tools:
http://www.w3.org/TR/2013/WD-mlw-metadata-us-impl-20130307/#Quality_Check

http://www.w3.org/TR/2013/WD-mlw-metadata-us-impl-20130307/#Harnessing_ITS_2.0_Metadata_to_Improve_the_Human_Review_Process


http://www.languagetool.org/


We may be able to convince the ITS 2.0 implementers to integrate tooling 
for MQM in their tools as well. This would be a big success for both 
efforts.


So I have written this mail to start a conversation, so that we get 
feedback from all stakeholders. In addition, and to move this forward, I 
have a concrete suggestion, based on discussions I have already had with 
Arle and Aljoscha: AFAIK both MQM and ITS 2.0 will be presented at 
TCWorld this year. We could take this as a milestone for setting the 
relation in stone on an implementation level, and integrate examples 
into each other's presentations. What do you think?

Btw., two ITS 2.0 localization quality issue implementers, Yves Savourel 
and Phil Ritchie, will be at LocWorld next week too. So perhaps you can 
already touch base there?


As input from the ITS 2.0 side, there is the following, summarized 
below:

- Localization Quality Issue definition 
http://www.w3.org/TR/its20/#lqissue

- Normative type values http://www.w3.org/TR/its20/#lqissue-typevalues

- Non normative mappings to tools 
http://www.w3.org/International/its/wiki/Tool_specific_mappings

- ITS 2.0 Localization Quality issue in the ITS 2.0 test suite
** input files 
https://github.com/finnle/ITS-2.0-Testsuite/tree/master/its2.0/inputdata/locqualityissue


** output files 
https://github.com/finnle/ITS-2.0-Testsuite/tree/master/its2.0/expected/locqualityissue


** output files in XLIFF (just informative, not set in stone) 
https://github.com/finnle/ITS-2.0-Testsuite/tree/master/its2.0/xliffsamples/inputdata/locqualityissue

** an output of the XML input files in RDF, using the RDF "NIF" 
vocabulary
https://github.com/finnle/ITS-2.0-Testsuite/tree/master/its2.0/nif-conversion/expected 


I am mentioning NIF here since it provides a solution to the overlapping 
representation issue that I had mentioned above.
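
As a rough illustration of why overlap is a problem for inline markup 
and not for an offset-based representation: inline elements can nest 
but cannot cross, whereas annotations that point at character offsets 
are independent of each other. The sketch below is an assumption-laden 
example, not the NIF conversion itself: the Issue class, the sample 
offsets, and the example.com base URI are made up, and the 
"#char=begin,end" fragment form is only meant to mirror the 
offset-based style used in the NIF output linked above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Issue:
    """A quality issue anchored by character offsets (hypothetical model)."""
    start: int      # inclusive offset into the text
    end: int        # exclusive offset into the text
    its_type: str   # an ITS 2.0 localization quality issue type value


def crosses(a, b):
    """True if the spans partially overlap (neither disjoint nor nested)."""
    return a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end


def crossing_pairs(issues):
    """Pairs of issues that inline, element-based markup cannot express."""
    return [(a, b) for a, b in combinations(issues, 2) if crosses(a, b)]


def fragment(base, issue):
    """Offset-based identifier for the annotated span (assumed URI form)."""
    return f"{base}#char={issue.start},{issue.end}"


if __name__ == "__main__":
    # Offsets refer to the text "The quick brown fox".
    issues = [
        Issue(0, 9, "style"),         # "The quick"
        Issue(4, 15, "grammar"),      # "quick brown", crosses the first span
        Issue(4, 9, "misspelling"),   # "quick", nested in both, so fine inline
    ]
    for a, b in crossing_pairs(issues):
        print("crossing:", fragment("http://example.com/doc.xml", a),
              fragment("http://example.com/doc.xml", b))
```

In this sample only the first two issues cross; the nested one could 
still be written as inline markup, which is where the two 
representations differ.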

It would now be interesting to see the latest MQM model here and example 
files.

Finally, let me mention that this mailing list is not for the 
development of ITS 2.0 - this is an open list to discuss issues like 
the ones in this mail. And in the next months we will use regular phone 
calls to discuss topics like this.

Best,

Felix




Received on Friday, 7 June 2013 21:11:32 UTC