Re: Update to MQM documentation and one question

From: Arle Lommel <arle.lommel@dfki.de>
Date: Fri, 28 Jun 2013 16:11:55 +0200
Cc: Christian Lieske <christian.lieske@sap.com>, public-i18n-its-ig@w3.org, Multilingual Web LT Public List Public List <public-multilingualweb-lt@w3.org>, Aljoscha Burchardt <aljoscha.burchardt@dfki.de>, Felix Sasaki <felix.sasaki@googlemail.com>
Message-Id: <9911D991-7609-45E7-B5F8-3CFBE52AF546@dfki.de>
To: Dave Lewis <dave.lewis@cs.tcd.ie>
Sorry for not responding sooner. I was out for a few days.

Actually, in thinking about your solution #3, since MQM doesn't define one metric, but rather a method for defining metrics, we could do something like this:

<span its:locQualityIssueType="terminology"
      its:locQualityIssueComment="should be &quot;blech&quot;"
      its:locQualityIssueProfileRef="xxx.xx/bobs-metric.mqm#Monolingual_terminology"
>blah</span>

Here we know that the ITS 2.0 type is "terminology" and the MQM type is "Monolingual terminology", so the mapping is contained in the markup itself. It doesn't define the normative mapping, but rather the actual mapping used for this issue, which I think is more useful.

In this case the ProfileRef would contain something describing the metric used for the task, with an anchored pointer to the specific issue in that description. If we take this approach we could dispense with having any proprietary inline markup at all for MQM and just use ITS markup, in which case MQM becomes an implementation of ITS 2.0 and the problematic relationship Felix was worried about goes away. But by pushing the complexity to the ProfileRef, MQM can still do everything it needs to.

In this case the URL would not point to the mapping (which is of limited value since we've already declared that "terminology" is the ITS 2.0 mapping in this markup), but rather to the description of the actual metric in use.
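To make the idea concrete, here is a rough sketch in Python of how a consumer might read such a span and split the ProfileRef into the metric description and the metric-specific issue. It is only an illustration under some assumptions: the markup is treated as well-formed XML with the ITS namespace declared (the snippet above omits the declaration), read_issue is a made-up helper name, and xxx.xx/bobs-metric.mqm is the same placeholder as above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI


def read_issue(markup):
    """Hypothetical helper: return the ITS type, comment, metric URL and metric issue id."""
    span = ET.fromstring(markup)
    profile_ref = span.get(f"{{{ITS_NS}}}locQualityIssueProfileRef", "")
    # The part before '#' names the metric description; the fragment names the issue in it.
    metric_url, issue_id = urldefrag(profile_ref)
    return {
        "its_type": span.get(f"{{{ITS_NS}}}locQualityIssueType"),
        "comment": span.get(f"{{{ITS_NS}}}locQualityIssueComment"),
        "metric_url": metric_url,
        "metric_issue": issue_id or None,
    }


# Same span as above, with the namespace declaration added so an XML parser accepts it.
SAMPLE = (
    '<span xmlns:its="http://www.w3.org/2005/11/its" '
    'its:locQualityIssueType="terminology" '
    'its:locQualityIssueComment="should be &quot;blech&quot;" '
    'its:locQualityIssueProfileRef="xxx.xx/bobs-metric.mqm#Monolingual_terminology"'
    '>blah</span>'
)

issue = read_issue(SAMPLE)
```

A tool that only knows ITS 2.0 can stop at its_type; a tool that knows the metric can use metric_issue for the finer-grained category. What the document behind metric_url has to contain is still left open here.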

This doesn't address the need for overlapping or noncontiguous spans, but taking this approach separates that need from the need to point to MQM issue types. It also doesn't address how a system is supposed to interpret the description it finds at xxx.xx/bobs-metric.mqm, but that is a problem for MQM, not ITS.

Are you around next week? It might be good to have a conversation about this with those who are interested. There are enough implications that I want to ensure that we do The Right Thing™.

-Arle

On 2013 Jun 29, at 15:19 , Dave Lewis <dave.lewis@cs.tcd.ie> wrote:

> On 26/06/2013 07:38, Lieske, Christian wrote:
>> Furthermore, a comment about granularity and losslessness might be cool (for all columns, not just MQM). Example: the MQM categories for terminology are more granular than those of ITS, so going from MQM to ITS always results in a loss of information. This can be mitigated to a certain degree, for example by using its:locQualityIssueComment as follows: its:locQualityIssueComment="MQM original value was 'X'".
> 
> Hi Christian, Arle, others,
> Using the its:locQualityIssueComment to help process the mapping is I think a good practical idea. 
> 
> But could we do this in a more structured way to support the automated processing by tools that implement (i.e. generate and consume) the mapping in ITS?
> 
> For example, we could offer best practice for referencing a URL that identifies the type of the non-ITS QA issue (MQM specifically, in this case). We could manage the mapping identifiers as URLs on the ITS IG wiki, e.g.:
> http://www.w3.org/International/its/wiki/LQItoMQM#terminology-Accuracy.Terminology
> I've put an example page up (we could have similar ones for other mappings).
> 
> We would need then to specify best practice in how to reference this from some LQI annotation. Possibilities I could think of were:
> 
> 1) Reference the type level mapping URL from locQualityIssueProfileRef
> 
> e.g. <span its:locQualityIssueType="terminology"
>                   its:locQualityIssueComment="bad term"
>                   its:locQualityIssueProfileRef=
>                     "http://www.w3.org/International/its/wiki/LQItoMQM#terminologyAccuracy.Terminology">blah</span>
> 
> pros: a (sort of) natural use of locQualityIssueProfileRef
> cons: the dereferenced document (in this case a fragment) would need to include a reference to the actual QA model, i.e. MQM, though that makes sense anyway. Arle, this would need MQM document fragment URLs for each type, again a good idea anyway.
> 
> 2) Reference the mapping page from locQualityIssueProfileRef and prefix the value of locQualityIssueComment with the mapping identifier, with a space separating it from the start of any actual comment text.
> 
> e.g. <span its:locQualityIssueType="terminology"
>                   its:locQualityIssueComment="terminologyAccuracy.Terminology bad term"
>                   its:locQualityIssueProfileRef=
>                     "http://www.w3.org/International/its/wiki/LQItoMQM#">blah</span>
> 
> cons: need some intelligence to parse comment to understand it contains a mapping reference
> 
> 3) put the whole URL to the type mapping as a prefix of the value of locQualityIssueComment
> 
> e.g. <span its:locQualityIssueType="terminology"
>                   its:locQualityIssueComment="http://www.w3.org/International/its/wiki/LQItoMQM#terminologyAccuracy.Terminology bad term">blah</span>
> 
> pros: doesn't require use of locQualityIssueProfileRef, which may be superfluous anyway if only one profile is used in the document, or allows the MQM doc to be referenced directly. Also, parsing the reference from the comment is a bit more straightforward, though the processor still needs to be 'mapping aware'.
> 
> So my own preference amongst these would be for (3), but there may be other better ways to do this.
> 
> any thoughts?
> cheers,
> Dave
> 
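
For comparison, option (3) in the quoted message puts the mapping URL in front of the comment text, so a consumer has to split the comment before it can use either part. A minimal sketch of that split in Python follows; split_comment is a made-up name, and the rule "the first space-separated token is a mapping reference if it starts with http:// or https://" is an assumption for illustration, not something ITS 2.0 defines.

```python
from urllib.parse import urldefrag


def split_comment(comment):
    """Hypothetical helper: split 'URL free text' into (mapping_url, mapping_id, text)."""
    head, _, rest = comment.partition(" ")
    # Assumption: a leading http(s) token is the mapping reference, the rest is the comment.
    if head.startswith(("http://", "https://")):
        url, fragment = urldefrag(head)
        return url, fragment or None, rest
    # No recognizable prefix: treat the whole value as an ordinary comment.
    return None, None, comment


prefixed = split_comment(
    "http://www.w3.org/International/its/wiki/LQItoMQM"
    "#terminologyAccuracy.Terminology bad term"
)
plain = split_comment("bad term")
```

This is the "mapping aware" step: a processor that does not perform it will simply show the URL as part of the comment.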
Received on Friday, 28 June 2013 14:12:24 UTC
