[ISSUE-34] Draft of quality section from Arle Lommel on 2012-08-06 (public-multilingualweb-lt@w3.org from August 2012)

From: Arle Lommel <arle.lommel@dfki.de>
Date: Mon, 6 Aug 2012 19:01:00 +0200
To: Multilingual Web LT Public List <public-multilingualweb-lt@w3.org>
Message-Id: <D5579732-0C83-40B5-95E0-45A921B0BB39@dfki.de>
Hi all,

One question, Felix. I note you don't like the idea of “subcategories,” but we need a way to refer to these pieces of the whole. If the whole is locQuality (and it is the data category), then what are the individual bits (locQualityNote, locQualityType, etc.) called? If they are the data categories, then we run into trouble: we say we want to make some parts mandatory for implementation of the whole, but we also state that it is sufficient to implement one data category. So if I implemented locQualitySeverity alone I would be compliant, but that would be a pointless implementation and a meaningless conformance claim. So I guess the question is what to call these things. Were ITS a localization quality specification I would not hesitate to call these bits its data categories, but that won't work here. Any ideas?

See below for a few other responses:

> Is the reformulation below correct?
> "It is intended to provide information about quality issues for human reviewers as well as information for automatic consumption that may be used in a localization workflow to provide input to workflow decisions."

Seems reasonable.

> If so, could you provide an example of a human review quality issue and automatic consumption? One for each would be sufficient.

Yes. That should be simple. As examples I could use (1) a traditional localization review cycle, where a human reviewer uses a tool to mark up issues that go back to the translator for fixing (human process), and (2) a workflow that aggregates the data generated by an automatic checking tool and decides automatically whether to pass the translation on or send it for human review (automatic consumption).
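
The automatic case could be sketched roughly as follows. This is only an illustration under assumptions: the QualityIssue record, the 0 to 100 severity scale, the threshold value, and the issue type names are all hypothetical and are not taken from the draft.

```python
from dataclasses import dataclass


@dataclass
class QualityIssue:
    # Hypothetical record for one issue reported by an automatic checker.
    issue_type: str    # illustrative loc-quality-type value, e.g. "terminology"
    severity: float    # assumed scale: 0 (trivial) to 100 (critical)
    comment: str = ""


def route_translation(issues, max_total_severity=50.0):
    """Return "pass" or "human-review" based on aggregated severity."""
    total = sum(issue.severity for issue in issues)
    return "human-review" if total > max_total_severity else "pass"


issues = [
    QualityIssue("terminology", 20.0, "term not in glossary"),
    QualityIssue("whitespace", 5.0),
]
print(route_translation(issues))  # total severity 25.0 is under the threshold
```

A real workflow would likely weight issue types differently or use a score per segment; summing severities is just the simplest aggregation that shows the decision step.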

>  Because of the relative complexity of localization quality assurance tasks, this data category consists of a number of subcategories, as described in the table presented below. These are designed to work in tandem with each other, but not all categories will be used in all situations.
> 
> I would suggest not to introduce new sub categories. That is a concept that we don't have for any other data category. And it has also implications on conformance that I disagree with.
> 
> Currently, implementations of ITS can claim conformance very easily: "I am implementing data categories x, y, z; x locally, y globally, z both." I would like to keep it this simple. One reason is also the scenario Des mentioned in Dublin: in complex workflows, we want to be able to express ITS metadata capabilities. With the current conformance model that is easy to do: basically have a machine-readable version of columns two and three of this table:
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#datacategories-defaults-etc  
> With the conformance proposal below, we get a lot of ifs and if-nots. I want to avoid that.
> 
> One other comment on loc-quality-type: I think it would be great to have the mapping of various tool outputs to these types available, or at least an assessment of what can be mapped and what cannot. Arle and I will meet with one tool developer about this soon; Arle, can you or somebody else contact the others we had discussed too?

Yes, now that we can point them to the draft I posted, we can do this. I'll work on this tomorrow, reaching out to Kilgray, Yamagata, ApSIC, beo, and some others.

> Having this assessment and if possible mapping is really crucial before we can finalize this data category IMO. It will also help us a lot with producing test cases.
> 
> Best,
> 
> Felix
>  
>  
> Conformance NOTE: Tools implementing this data category MUST support loc-quality-profile AND either loc-quality-type OR loc-quality-comment. Only in the case of manual human annotation for quality issues may loc-quality-profile be omitted. In addition, if a tool produces internal codes, it is strongly recommended that it support loc-quality-code as well. The loc-quality-score and loc-quality-severity subcategories are optional and are used by tools that support these features.
> 
> 
> I disagree with the above conformance note, see above.
> 
> I think what you may want to say is that if the data category is used, the following is mandatory: ..., and the following is optional: ..., or the following ... are exclusive. We do that e.g. for the terminology data category, see 
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#terminology-implementation
> "GLOBAL" is talking about "exactly one of the following: ...", and "LOCAL" about an optional termInfoRef attribute. 
> 
> Nevertheless, an implementation of terminology has to be able to produce or consume either all global or all local information.
> 
> Can you re-work your proposal along these lines?

Let's discuss tomorrow, as I'm not entirely sure how your proposal differs from what I did. But I'm a bit tired at the moment, and maybe upon rereading it will be clear. So best to look at it fresh in the morning.
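
To make the comparison concrete, the conformance note quoted above can be read as a simple boolean check. This is a minimal sketch of one reading of that note, not an agreed rule; the function name and the manual_annotation flag are hypothetical, and the recommended and optional parts (loc-quality-code, loc-quality-score, loc-quality-severity) are deliberately left out because they do not affect the MUST condition.

```python
def conforms(supported, manual_annotation=False):
    """Check the MUST part of the draft conformance note.

    supported: set of subcategory names a tool implements.
    manual_annotation: True when issues are annotated manually by a human,
    the only case in which loc-quality-profile may be omitted.
    """
    supported = set(supported)
    has_profile = "loc-quality-profile" in supported or manual_annotation
    has_type_or_comment = bool(
        supported & {"loc-quality-type", "loc-quality-comment"}
    )
    return has_profile and has_type_or_comment


# Implementing only severity does not satisfy the note.
print(conforms({"loc-quality-severity"}))
```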

I think that covers your points.

-Arle
Received on Monday, 6 August 2012 17:01:44 UTC