Re: Draft of quality section

Hi Arle, all,

2012/8/6 Arle Lommel <arle.lommel@dfki.de>

> There is a lot to be done still, but I did want to circulate the current
> draft of the loc-quality section of the document (Yves, you'll notice I
> accepted that suggestion). I'm sure a lot of "scaffolding" still shows in
> this document, but it's getting closer to a working model, so I invite
> feedback on this text. The next step will be to implement suggestions and
> then add it to the actual spec.
> Note to Yves: I have not yet implemented the suggestion for a resolution
> status aspect and I still need to work on the global markup. Felix and I
> also discussed the XLIFF 2.0-style markup you had suggested, and that needs
> more discussion. But at this point, do we think we more or less have the
> appropriate categories and information? Also, if you would check the
> mapping I did from CheckMate to the quality types list and let me know if
> it looks right to you, I would appreciate it.
>
> Thanks,
>
> Arle
>
> 6.x ITS 2.0 Localization Quality
>
> The Localization Quality (loc-quality) data category expresses information
> related to localization quality assessment tasks. It is intended to provide
> guidance about quality issues for human reviewers as well as
> quantitative information that may be used in a localization workflow to
> make appropriate workflow decisions.
>

Is the reformulation below correct?
"It is intended to provide information about quality issues for human
reviewers as well as information for automatic consumption that may be used
in a localization workflow to provide input to workflow decisions."

If so, could you provide one example of a quality issue aimed at human
review and one example of automatic consumption? One for each would be
sufficient.


> Because of the relative complexity of localization quality assurance
> tasks, this data category consists of a number of subcategories, as
> described in the table presented below. These are designed to work in
> tandem with each other, but not all categories will be used in all
> situations.
>

I would suggest not introducing subcategories. We don't have that concept
for any other data category, and it also has implications for conformance
that I disagree with.

Currently, implementations of ITS can claim conformance very easily: "I am
implementing data categories x, y, z; x locally, y globally, z both." I
would like to keep it this simple. One reason is the scenario Des
mentioned in Dublin: in complex workflows, we want to be able to express
ITS metadata capabilities. With the current conformance model that is easy
to do (basically, have a machine-readable version of columns two and three
of this table:
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#datacategories-defaults-etc
).

With the conformance proposal below, we get a lot of ifs and if-nots. I
want to avoid that.
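
To make the "machine-readable version of columns two and three" idea a bit
more concrete, here is a minimal Python sketch. The Claim structure, the
missing_support helper and the category names x, y, z are purely
illustrative assumptions; nothing like this is defined in the draft.

```python
from dataclasses import dataclass

# Hypothetical machine-readable conformance claim: for each data category,
# record whether the tool supports local and/or global selection.
@dataclass(frozen=True)
class Claim:
    local: bool = False
    global_: bool = False

def missing_support(producer: dict, consumer: dict) -> list:
    """List the "category/mode" pairs the producer emits but the consumer lacks."""
    gaps = []
    for category, claim in sorted(producer.items()):
        other = consumer.get(category, Claim())
        if claim.local and not other.local:
            gaps.append(f"{category}/local")
        if claim.global_ and not other.global_:
            gaps.append(f"{category}/global")
    return gaps

# "I am implementing x, y, z; x locally, y globally, z both."
tool_a = {"x": Claim(local=True), "y": Claim(global_=True), "z": Claim(True, True)}
# A second, hypothetical tool that only handles local markup for x and z.
tool_b = {"x": Claim(local=True), "z": Claim(local=True)}

print(missing_support(tool_a, tool_b))
```

With a flat claim like this, checking whether two tools in a workflow fit
together is a simple lookup, with no conditional rules to evaluate.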

One other comment on loc-quality-type: I think it would be great to have
the mapping of various tool outputs to these types available, or at least
an assessment of what can be mapped and what cannot. Arle and I will meet
with one tool developer about this soon; Arle, can you or somebody else
contact the others we had discussed too?

Having this assessment and, if possible, the mapping is really crucial
before we can finalize this data category, IMO. It will also help us a lot
with producing test cases.
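
As a rough idea of what such an assessment could look like in code, here
is a small Python sketch. The upper-case codes are copied from the draft
annex quoted further down, where I read them as the CheckMate column; the
pairings are the draft's proposal, not a confirmed mapping. The
assess_mapping helper and the SOME_NEW_CHECK code are made up for
illustration.

```python
# Illustrative subset of a tool-to-type mapping, taken from the draft annex.
CHECKMATE_TO_TYPE = {
    "TERMINOLOGY": "terminology",
    "TARGET_SAME_AS_SOURCE": "untranslated",
    "EXTRA_TARGETSEG": "addition",
    "ALLOWED_CHARACTERS": "characters",
    "MISSING_CODE": "markup",
    "EXTRA_CODE": "markup",
    "UNEXPECTED_PATTERN": "pattern-problem",
    "TARGET_LENGTH": "length",
    "LANGUAGETOOL_ERROR": "uncategorized",
}

def assess_mapping(tool_codes, mapping):
    """Split a tool's issue codes into those that can be mapped and those that cannot."""
    mapped = {code: mapping[code] for code in tool_codes if code in mapping}
    unmapped = [code for code in tool_codes if code not in mapping]
    return mapped, unmapped

# SOME_NEW_CHECK is a hypothetical code with no mapping yet.
mapped, unmapped = assess_mapping(
    ["MISSING_CODE", "TARGET_LENGTH", "SOME_NEW_CHECK"], CHECKMATE_TO_TYPE
)
print(mapped)
print(unmapped)
```

The unmapped list is the part that matters for the assessment: it shows
which tool outputs have no home in the proposed type list.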

Best,

Felix



> Conformance NOTE: Tools implementing this data category MUST support
> loc-quality-profile AND either loc-quality-type OR loc-quality-comment.
> Only in the case of manual human annotation for quality issues may
> loc-quality-profile be omitted. In addition, if a tool produces internal
> codes, it is *strongly* recommended that they support loc-quality-code as
> well. The loc-quality-score and loc-quality-severity subcategories are
> optional and are used by tools that support these features.
>

I disagree with the conformance note above; see my earlier comments.

I think what you may want to say is that, if the data category is used, the
following is mandatory: ..., the following is optional: ..., or the
following ... are mutually exclusive. We do that, e.g., for the terminology
data category; see
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#terminology-implementation
There, "GLOBAL" talks about "exactly one of the following: ...", and "LOCAL"
about an optional termInfoRef attribute.

Nevertheless, an implementation of terminology has to be able to produce or
consume either all global or all local information.

Can you re-work your proposal along these lines?
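
Purely as an illustration of the shape I have in mind, here is a Python
sketch of a "mandatory / at least one of / optional" rule set. The
attribute names come from the draft; which of them end up mandatory is an
assumption lifted from the quoted conformance note, not an agreed rule,
and the check_usage helper is hypothetical.

```python
# Hypothetical rule set for one use of the data category.
MANDATORY = {"loc-quality-profile"}
AT_LEAST_ONE_OF = {"loc-quality-type", "loc-quality-comment"}
OPTIONAL = {"loc-quality-code", "loc-quality-score", "loc-quality-severity"}

def check_usage(attributes):
    """Return a list of problems found in one use of the data category."""
    present = set(attributes)
    problems = []
    for name in sorted(MANDATORY - present):
        problems.append(f"missing mandatory {name}")
    if not (AT_LEAST_ONE_OF & present):
        problems.append("need at least one of " + ", ".join(sorted(AT_LEAST_ONE_OF)))
    for name in sorted(present - MANDATORY - AT_LEAST_ONE_OF - OPTIONAL):
        problems.append(f"unknown attribute {name}")
    return problems

print(check_usage({"loc-quality-profile": "dfki:http://www.dfki.de",
                   "loc-quality-type": "untranslated"}))
print(check_usage({"loc-quality-code": "dfki:target_equals_source"}))
```

The point is that the rules describe how the data category is used, while
the conformance claim itself stays at the level of "data category x,
locally and/or globally".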


> 6.x.1 Subcategory Descriptions
>
> *loc-quality-profile*
> Description: Pointer to a description of the quality assessment model in
> use, with a description of the categories.
> Permissible values: A tool-specific prefix, optionally followed by a colon
> (:) and a URI where the description of the tool's quality categories is
> available.
> Notes: The use of a URI is *strongly* recommended as it provides a way for
> human evaluators to learn more about the quality categories in use.
>
> *loc-quality-score*
> Description: The score value generated by a quality assessment process.
> Permissible values: Integer value from 0 to 100. Higher values equal
> better scores.
> Notes: Users would need to normalize internal scores to match this system
> upon generation and convert these scores to match their own internal
> system upon consumption.
>
> *loc-quality-type*
> Description: Top-level quality type, as defined in the specification.
> Permissible values: Picklist with values drawn from the list of
> loc-quality-type values.
> Notes: ITS 2.0-compliant tools that use these categories would need to map
> their internal values to these types.
>
> *loc-quality-code*
> Description: An internal classification code for a quality issue as
> produced by the generating tool.
> Permissible values: qname + text.
> Notes: The prefix used in the qname MUST correspond to the tool-specific
> prefix declared in loc-quality-profile. In cases where other ITS
> loc-quality subcategories apply to the same document node, the prefix used
> in this value is assumed to apply to all other subcategories. Therefore, if
> this value does not apply to them, they MUST be applied to a separate
> element (e.g., a separate HTML <span> element). (NB: need an example of
> this.)
>
> *loc-quality-comment*
> Description: A human-readable description of the quality issue.
> Permissible values: Text.
> Notes: Use of loc-quality-comment would be *strongly* recommended in any
> cases where the value of other is used for loc-quality-type.
>
> *loc-quality-severity*
> Description: A numerical value representing the severity of the issue, as
> defined by the model generating the metadata.
> Permissible values: Number from 0 to 1 with up to two decimal places, with
> higher values equaling greater severity.
> Notes: It is up to tools to map the numerical values of this to their own
> system. We can provide some informative guidelines for how this is to be
> done based on internal severity systems.
>
> Possible attributes (we are less certain on these)
>
> *loc-quality-stage*
> Description: A value to indicate the status of a particular issue in a
> review workflow.
> Permissible values: Plain text.
> Notes: The values here are particular to specific workflows and may not be
> interoperable with other workflows.
>
> *loc-quality-threshold*
> Description: A value which defines a passing score for loc-quality-score.
> Permissible values: Integer value from 0 to 100.
> Notes: A value of loc-quality-score greater than or equal to the value of
> loc-quality-threshold is deemed to have passed the quality assessment
> process. It may make sense to leave this as part of the description
> referred to in loc-quality-profile, but having it here would allow
> processes to automate actions based on whether the file passes or not.
>
> *loc-quality-agent*
> Description: An identifier for the agent that produced the quality results.
> Permissible values: ??? Perhaps a picklist with human and machine as
> values?
> Notes: Needs better definition.
>
> Table: Localization Quality Subcategories, Descriptions, and Permissible
> Values
>
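
The score normalization and threshold comparison described in the table
could be sketched in Python as follows. The function names, the linear
mapping and the clamping of out-of-range input are my assumptions; the
draft only says that scores run from 0 to 100, that higher is better, and
that a score greater than or equal to the threshold passes.

```python
def normalize_score(raw, worst, best):
    """Linearly map an internal score onto the 0-100 integer scale.

    `worst` and `best` are the internal values that correspond to 0 and 100;
    `worst` may be larger than `best` for tools where lower numbers are better.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    fraction = (raw - worst) / (best - worst)
    fraction = min(1.0, max(0.0, fraction))  # clamp out-of-range input
    return round(fraction * 100)

def passes(score, threshold):
    """A score greater than or equal to the threshold passes."""
    return score >= threshold

# An internal 0-10 scale where higher is better.
print(normalize_score(7, worst=0, best=10))
# An internal error count where 0 is best and 20 or more is worst.
print(normalize_score(5, worst=20, best=0))
# The score of 56 from the HTML example against a hypothetical threshold of 60.
print(passes(56, 60))
```
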
> 6.x.1.1 Values of loc-quality-type
>
> The loc-quality-type subcategory is intended to provide a *basic* level
> of interoperability between differing localization quality assurance
> systems. It provides a list of 26 high-level quality issue types common in
> automatic and human localization quality assessment. Localization quality
> assessment tools can map their internal categories to these categories in
> order to exchange information about the kinds of issues they identify and
> take intelligent and appropriate action even if another tool does not know
> the *specific* issues identified by the generating tool.
>
> NOTE: Tools implementing the loc-quality data category that use the
> loc-quality-code subcategory SHOULD also use loc-quality-type to
> provide this level of interoperability.
>
> The values listed in the following table are allowed for loc-quality-type.
> Note that tools implementing loc-quality-type are *not* required to check
> for or flag issues for all or any of the types. However, if they implement
> loc-quality-type, the values they produce for the attribute MUST match
> one of the values provided in this table and MUST be semantically accurate.
> If a tool can map its internal values to these categories it MUST do so and
> may not use the value of other, which is reserved strictly for values
> that cannot be mapped to these values.
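
The "MUST match one of the values" rule is easy to check mechanically. A
minimal Python sketch, using the 26 values listed in the draft table that
follows; the names ALLOWED_TYPES and is_allowed_type are mine, not the
draft's.

```python
# The 26 loc-quality-type values from the draft table.
ALLOWED_TYPES = frozenset({
    "terminology", "mistranslation", "omission", "untranslated", "addition",
    "duplication", "inconsistency", "grammar", "legal", "register",
    "locale-specific-content", "locale-violation", "style", "characters",
    "misspelling", "typographical", "formatting", "inconsistent-entities",
    "numbers", "markup", "pattern-problem", "whitespace",
    "internationalization", "length", "uncategorized", "other",
})

def is_allowed_type(value):
    """True if the value is one of the allowed loc-quality-type values."""
    return value in ALLOWED_TYPES

print(len(ALLOWED_TYPES))
print(is_allowed_type("misspelling"))
print(is_allowed_type("spelling"))
```

A check like this only covers the first half of the rule; whether a value
is semantically accurate cannot be verified this way.
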
> Allowable values for loc-quality-type
>
> *terminology*
> Description: An incorrect term or a term from the wrong domain was used, or
> terms are used inconsistently.
> Examples:
>
>    - The localization had Pen Drive when corporate terminology specified
>    that USB Stick was to be used.
>    - The localization inconsistently used Start and Begin.
>
> Notes: Should not be confused with the ITS terminology data category.
>
> *mistranslation*
> Description: The content of the target mistranslates the content of the
> source.
> Examples:
>
>    - The English source reads “An ape succeeded in grasping a banana
>    lying outside its cage with the help of a stick” but the Italian
>    translation reads “l’ape riuscì a prendere la banana posta fuori dalla
>    sua gabbia aiutandosi con un bastone” (“A *bee* succeeded…”)
>
> Notes: Issues related to translation of specific terms related to the
> domain or task-specific language should be categorized as terminology
> issues.
>
> *omission*
> Description: Necessary text has been omitted from the localization or
> source.
> Examples:
>
>    - One or more segments found in the source that should have been
>    translated are missing in the target
>
> Notes: This category should not be used for missing whitespace or
> formatting codes, but instead should be reserved for linguistic content.
>
> *untranslated*
> Description: Content that should have been translated was left
> untranslated.
> Examples:
>
>    - The source segment reads “The Professor said to Smith that he would
>    hear from his lawyer” but the Hungarian localization reads “A professzor
>    azt mondta Smithnek, hogy he would hear from his lawyer.”
>
> Notes: omission takes precedence over untranslated. Omissions are distinct
> in that they address cases where text is not present, while untranslated
> addresses cases where text has been carried from the source untranslated.
>
> *addition*
> Description: The translated text contains inappropriate additions.
> Examples:
>
>    - The translated text contains a note from the translator to himself
>    to look up a term; the note should have been deleted but was not.
>
> *duplication*
> Description: Content has been duplicated improperly.
> Examples:
>
>    - A section of the target text was inadvertently copied twice in a
>    copy and paste operation.
>
> *inconsistency*
> Description: The text is inconsistent with itself (NB: not for use with
> terminology inconsistency).
> Examples:
>
>    - The text states that an event happened in 1912 in one location but
>    in another states that it happened in 1812.
>
> *grammar*
> Description: The text contains a grammatical error (including errors of
> syntax and morphology).
> Examples:
>
>    - The text reads “The guidelines says that users should use a static
>    grounding strap.”
>
> *legal*
> Description: The text is legally problematic (e.g., it is specific to the
> wrong legal system).
> Examples:
>
>    - The localized text is intended for use in Thailand but includes U.S.
>    regulatory notices.
>    - A text translated into German contains comparative advertising
>    claims that are not allowed by German law
>
> *register*
> Description: The text is written in the wrong linguistic register or uses
> slang or other language variants inappropriate to the text.
> Examples:
>
>    - A financial text translated into U.S. English refers to dollars as
>    “bucks”.
>
> *locale-specific-content*
> Description: The localization contains content that does not apply to the
> locale for which it was prepared.
> Examples:
>
>    - A text translated for the Japanese market contains call center
>    numbers in Texas and refers to special offers available only in the U.S.
>
> Notes: Legally inappropriate material should be classified as legal.
>
> *locale-violation*
> Description: Text violates norms for the intended locale.
> Examples:
>
>    - A text localized into German has dates in YYYY-MM-DD format instead
>    of in DD.MM.YYYY
>    - A translated text uses American-style foot and inch measurements
>    instead of centimeters.
>
> *style*
> Description: The text contains stylistic errors.
> Examples:
>
>    - Company style dictates that all individuals be referred to as Mr. or
>    Ms. with a family name, but the text refers to “Jack Smith”.
>
> *characters*
> Description: The text contains characters that are garbled or incorrect or
> that are not used in the language in which the content appears.
> Examples:
>
>    - the text should have a
>    - but instead has a ¥ sign
>    - A text translated into German omits the umlauts over ü, ö, and ä
>    - A Japanese localization contains characters like మ and ఊ (from
>    Telugu)
>
> *misspelling*
> Description: The text contains a misspelling.
> Examples:
>
>    - A German text misspells the word *Zustellung* as *Zustellüng*
>
> *typographical*
> Description: The text has typographical errors such as omitted/incorrect
> punctuation, incorrect capitalization, etc.
> Examples:
>
>    - An English localization has the following sentence: *The man whom,
>    we saw, was in the Military and carried it’s insignias*
>
> *formatting*
> Description: The text is formatted incorrectly.
> Examples:
>
>    - Warnings in the target text are supposed to be set in italic face,
>    but instead appear in bold face
>    - Margins of the text are narrower than specified
>
> *inconsistent-entities*
> Description: The source and target text contain different named entities
> (dates, times, place names, individual names, etc.).
> Examples:
>
>    - The name *Thaddeus Cahill* appears in an English source but is
>    rendered as *Tamaš Cahill* in the Czech version
>    - The date February 9, 2007 appears in the source but the translated
>    text has “2. September 2007.”
>
> *numbers*
> Description: Numbers are inconsistent between source and target.
> Examples:
>
>    - The source text states that an object is 120 cm long, but the target
>    text says it is 129 cm. long.
>
> Notes: Some tools may correct for differences in units of measurement to
> reduce false positives.
>
> *markup*
> Description: There is an issue related to markup or a mismatch in markup
> between source and target.
> Examples:
>
>    - The source segment has five markup tags but the target has only two
>    - An opening tag in the localization is missing a closing tag
>
> *pattern-problem*
> Description: The text fails to match a pattern that defines allowable
> content (or matches one that defines non-allowable content).
> Examples:
>
>    - The quality checking tool disallows the regular expression pattern
>    ['"”’][\.,] but the translated text contains A leading “expert”, a
>    political hack, claimed otherwise.
>
> *whitespace*
> Description: There is a mismatch in whitespace between source and target
> content.
> Examples:
>
>    - A source segment starts with six space characters but the
>    corresponding target segment has two non-breaking spaces at the start.
>
> *internationalization*
> Description: There is an error related to the internationalization of
> content.
> Examples:
>
>    - A line of programming code has embedded language-specific strings
>    - A user interface element leaves no room for text expansion
>    - A form allows only for U.S.-style postal addresses and expects five
>    digit U.S. ZIP codes
>
> Notes: There are many kinds of internationalization errors of various
> types. This category is therefore very heterogeneous in what it can refer
> to.
>
> *length*
> Description: There is a significant difference in source and target
> length.
> Examples:
>
>    - The translation of a segment is five times as long as the source
>
> Notes: What constitutes a “significant” difference in length is determined
> by the model referred to in the loc-quality-profile.
>
> *uncategorized*
> Description: The issue has not been categorized.
> Examples:
>
>    - A new version of a tool returns information on an issue that has not
>    been previously checked and that is not yet classified
>
> Notes: This category has two uses: (1) a tool can use it to pass through
> quality data from another tool in cases where the issues from the other
> tool are not classified (for example, a localization quality assurance
> tool interfaces with a third-party grammar checker); (2) a tool’s issues
> are not yet assigned to categories, and, until an updated assignment is
> made, they may be listed as uncategorized. In the latter case it is
> recommended that issues be assigned to appropriate categories as soon as
> possible since uncategorized does not foster interoperability.
>
> *other*
> Description: Any issue that cannot be assigned to any values listed above.
> Notes: This category allows for the inclusion of any issues not included
> in the previously listed values. This value MUST NOT be used for any tool-
> or model-specific issues that can be mapped to the values listed above. In
> addition, this value is not synonymous with uncategorized in that
> uncategorized issues may be assigned to another precise value, while other
> issues cannot.
>
> If a metric has a “miscellaneous” or “other” category, it should be
> mapped to this value even if the *specific* instance of the issue might
> be mapped to another category.
>
> Example of localization-quality local markup in HTML5
>
> The following example uses local HTML5 markup with CSS to highlight
> quality issues in browser rendition of the document. It uses a fictional
> DFKI tool in the markup and the markup should not be interpreted as
> referring to an actual quality assurance system.
>
> <!DOCTYPE html>
> <html lang="en">
>   <head>
>     <title>Telharmonium 1897</title>
>     <meta name="its-localization-quality-profile"
>         content="dfki:http://www.dfki.de" />
>     <meta name="its-localization-quality-score" content="56" />
>     <style type="text/css">
>       [its-localization-quality-type]{
>         border:1px solid green;
>         margin:2px;
>       }
>       [its-localization-quality-type = untranslated] {
>         background-color:red;
>       }
>       [its-localization-quality-type = whitespace] {
>         background-color:yellow;
>       }
>       [its-localization-quality-type = inconsistent-entities]{
>         background-color:#9DFFE1;
>       }
>       [its-localization-quality-type = misspelling]{
>         background-color:#FFE2F7;
>       }
>       [its-localization-quality-severity = "1.0"]{
>         border: 6px solid red;
>       }
>     </style>
>
>   </head>
>   <body>
>     <h1 id="h0001" its-localization-quality-type="untranslated"
>       its-localization-quality-code="dfki:target_equals_source">
>       Telharmonium (1897)</h1>
>     <p id="p0001">
>       <span class="segment" id="s0001">
>         <span its-localization-quality-type="inconsistent-entities"
>           its-localization-quality-code="dfki:named_entity_not_found"
>           its-localization-quality-comment="Should be Thaddeus Cahill. Why
>             is Batman in the picture?"
>           its-localization-quality-severity="1.0">Christian Bale</span>
>         <span its-localization-quality-type="whitespace"
>           its-localization-quality-code="dfki:extra_space_around_punctuation"
>           its-localization-quality-severity="0.1">(1867 – 1934)</span>
>         conceived of an instrument that could transmit its sound from a power
>         plant for hundreds of miles to listeners over telegraph wiring.</span>
>       <span class="segment" id="s0002">Beginning in 1889 the sound quality of
>         regular telephone concerts was very poor on account of the buzzing
>         generated by carbon-granule microphones. As a result Cahill decided to
>         set a new standard in perfection of sound
>         <span its-localization-quality-type="misspelling"
>           its-localization-quality-code="dfki:spelling_error"
>           its-localization-quality-severity="0.5"
>           its-localization-quality-comment="should be 'quality'">qulaity</span>
>         with his instrument, a standard that would not only satisfy listeners
>         but that would overcome all the flaws of traditional instruments.</span>
>     </p>
>   </body>
> </html>
>
> And here is the rendered example that shows how the CSS is used to
> highlight issues:
> Telharmonium (1897)
>
> Christian Bale (1867 – 1934) conceived of an instrument that could
> transmit its sound from a power plant for hundreds of miles to listeners
> over telegraph wiring. Beginning in 1889 the sound quality of regular
> telephone concerts was very poor on account of the buzzing generated by
> carbon-granule microphones. As a result Cahill decided to set a new
> standard in perfection of sound qulaity with his instrument, a standard
> that would not only satisfy listeners but that would overcome all the flaws
> of traditional instruments.
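
For the "automatic consumption" side, a consumer could pull these local
annotations out of the markup with very little code. A minimal Python
sketch using the standard library html.parser module; the QualityCollector
class is hypothetical, and the attribute prefix is simply the one used in
the example above, which may still change.

```python
from html.parser import HTMLParser

PREFIX = "its-localization-quality-"

class QualityCollector(HTMLParser):
    """Collect the its-localization-quality-* attributes of each start tag."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        # Keep only the quality attributes, with the common prefix removed.
        found = {name[len(PREFIX):]: value
                 for name, value in attrs if name.startswith(PREFIX)}
        if found:
            self.issues.append((tag, found))

collector = QualityCollector()
collector.feed('<p><span its-localization-quality-type="whitespace" '
               'its-localization-quality-severity="0.1">(1867 - 1934)</span></p>')
print(collector.issues)
```

Note that this only sees local attributes. The profile and score in the
example are carried in meta elements, where the its-localization-quality
name is an attribute value rather than an attribute name, so they would
need separate handling.
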
> ------------------------------
> Annex X: Mapping of Tool-Specific Quality Codes to loc-quality-type Values
> (Non-Normative)
>
> This Annex is informative.
>
> The following table provides mappings of native quality assurance issue
> codes for a number of common localization quality tools to
> loc-quality-type values. Tool developers are free to map their own issue
> codes to the loc-quality-type values and are encouraged to make their
> mappings publicly available. Tools that produce ITS 2.0 loc-quality markup
> should ensure that the output of their tools matches any publicly available
> mappings they may produce.
>
> Note: These mappings are provided as examples only. In the event of a
> discrepancy between the mapping published by a developer and this annex,
> the statements from the developer take precedence over this annex.
>
> Columns, in order: loc-quality-type value, followed by the
> tool/metric-specific values for CheckMate, xliff:doc, QA Distiller,
> SAE J2450, LISA QA Model (UI), and LISA QA Model (doc) - language only*
>
> *terminology*
>
>    - TERMINOLOGY
>
>
>    - terminology
>
>
>    - Consistency
>    - Tag-aware
>    - ID-aware
>    - Untranslatables
>
>
>    - wrong term
>
>
>    - Terminology
>
>
>    - Glossary adherence
>    - Abbreviations
>    - Context
>
> *mistranslation*
>
>    - Mistranslation
>    - Accuracy
>
>
>    - Semantics
>    - Accuracy
>
> *omission*
>
>    - MISSING_TARGETTU
>    - MISSING_TARGETSET
>    - EMPTY_TARGETSEG
>    - EMPTY_SOURCESEG
>
>
>    - omission
>
>
>    - Empty translations
>
>
>    - omission
>
>
>    - Omissions
>
> *untranslated*
>
>    - TARGET_SAME_AS_SOURCE
>
>
>    - Forgotten translations
>    - Skipped translations
>    - Partial translations
>    - Incomplete translation
>
> *addition*
>
>    - EXTRA_TARGETSEG
>
>
>    - Additions
>
> *duplication*
>
> Not addressed in any of these metrics. It may be possible to treat this as
> a case of addition.
>
> *inconsistency*
>
>    - inconsistency
>
>
>    - Source
>    - Target
>
>
>    - Consistency
>
> *grammar*
>
>    - syntactic error
>    - word structure or agreement error
>
>
>    - Grammar
>
> *legal*
>
> Not addressed in any of these metrics. However, legal compliance checking
> is a big deal for regulated industries and forms a core part of their
> metrics.
>
> *register*
>
>    - Register/tone
>    - Language variants/slang
>
> *locale-specific-content*
>
>    - Local suitability
>
> *locale-violation*
>
>    - Country
>
>
>    - Country standards
>
> *style*
>
>    - Style
>
>
>    - General style
>    - Company standards
>
> *characters*
>
>    - ALLOWED_CHARACTERS
>
>
>    - Corrupt characters, source
>    - Corrupt characters, target
>
>
>    - Double/Single Size
>    - Character formatting
>
> *misspelling*
>
>    - misspelling
>
>
>    - Spelling
>
> *typographical*
>
>    - punctuation
>
>
>    - Consecutive punctuation marks
>    - End of segment punctuation
>    - Non-matching pairs (brackets)
>    - Leading bracket outside of TU
>    - Different count(brackets)
>    - Initial capitalization
>    - Entire capitalization
>    - Non-matching pairs (quotation marks)
>    - Incorrect type (quotation marks)
>    - Different count (quotation marks)
>
>
>    - punctuation error
>
>
>    - Punctuation marks
>
> *formatting*
>
>    - TOC
>    - Index
>    - Layout
>    - Typography
>    - Graphics
>    - Call Outs and Captions
>    - Alignment
>    - Sizing
>    - Truncation/overlap
>
> (Numerous)
>
> *inconsistent-entities*
>
>    - date
>    - time
>
> *numbers*
>
>    - number
>
>
>    - Number values
>    - Incorrect type (measurements)
>    - Check conversions (measurements)
>
> *markup*
>
>    - MISSING_CODE
>    - EXTRA_CODE
>    - SUSPECT_CODE
>
>
>    - tags
>
> *pattern-problem*
>
>    - UNEXPECTED_PATTERN
>    - SUSPECT_PATTERN
>
>
>    - pattern
>
> *whitespace*
>
>    - MISSING_LEADINGWS
>    - MISSINGORDIFF_LEADINGWS
>    - EXTRA_LEADINGWS
>    - EXTRAORDIFF_LEADINGWS
>    - MISSING_TRAILINGWS
>    - MISSINGORDIFF_TRAILINGWS
>    - EXTRA_TRAILINGWS
>    - EXTRAORDIFF_TRAILINGWS
>
>
>    - Consecutive spaces
>    - Inconsistent leading and trailing spaces
>    - Required/forbidden spaces
>    - Different count (tabs)
>    - Required/forbidden spaces
>
> *internationalization*
>
>    - internationalization*
>
> (*The examples for this code are broader than the type category here.
> Nevertheless, it makes sense to maintain this mapping.)
>
>    - Number formatting
>
> *length*
>
>    - TARGET_LENGTH
>
> *uncategorized*
>
>    - LANGUAGETOOL_ERROR
>
> *other*
>
>    - other
>
>
>    - miscellaneous error
>
>
>    - Hyper text functionality, jumps, popups
>    - Localizable text
>    - Dialogue functionality
>    - Menu functionality
>    - Hotkeys/accelerators
>    - Jumps/links
>
> (* The LISA QA Model documentation addresses numerous issues related to
> software formatting that are outside the scope of the ITS 2.0 loc-quality
> model. For the sake of conciseness and clarity, these are not listed in
> this document.)
>
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Monday, 6 August 2012 15:40:55 UTC