RE: ISSUE 34: Proposal for quality attributes

Hi Arle

Just one point of clarification on this:
>Before I move on, if we really move into the HTML5 world (per Felix's mail), we can no longer call these things like "its:qualityscore"

I interpreted Felix's email regarding the HTML WG to mean that any attributes we define must be ratified by that group, not that we cannot use the 'its-' prefix. As you illustrate with your suggestions, not having a 'namespace' prefix like 'its-' would force attribute names into cryptic, conflated abbreviations, which would definitely hurt the readability of the attributes and thus, I imagine, generate some pushback from the HTML folks.

Perhaps you or Felix could clarify.

Thanks
Des


From: Arle Lommel [mailto:arle.lommel@dfki.de]
Sent: 02 August 2012 11:03
To: Multilingual Web LT Public List
Subject: ISSUE 34: Proposal for quality attributes

Hi all,

In order to move things more to the public list and get something more like a real proposal around which to build consensus, I am sending a new proposal for the quality attributes. This has not been reviewed by Phil or Yves (who have worked on earlier internal versions), so all errors are mine, not theirs. Phil and Yves, feel free to jump in with any corrections if I have misrepresented any of our discussion.

Before I move on, if we really move into the HTML5 world (per Felix's mail), we can no longer call these things like "its:qualityscore" because translation has no particular claim to the term quality and it will confuse potential users in HTML5 if we use quality as the primary category part of the name. As a result I propose that we use names like "transqualscore" or "tqscore" for the data categories. Otherwise I anticipate some real, and justifiable, complaints from the HTML5 working group.

Note that these complaints would not have arisen with something like "its-qualityscore" because the ITS portion made the scope clear, but if we lose that, things change.

As a result I will use "tq" in the names, but this, like everything else, is open to discussion.



INVENTORY AND DESCRIPTION OF ATTRIBUTES

So here is the inventory of attributes in the quality model. Note that I split tqtype and tqcode apart as per my earlier message about making selections easier based on the attribute value.

Each entry below gives the attribute name, followed by its description, permissible values, and notes.

Definite attributes (these are well established)

tqprofile
  Description: Pointer to a description of the quality assessment model in use, with a description of the categories
  Permissible values: URI
  Notes: Potentially we might need a way to map a qname used elsewhere to a specific profile, but discussion between Arle, Yves, and Phil felt that a single tqprofile per document was probably sufficient.

tqscore
  Description: The score value generated by a quality assessment process
  Permissible values: Integer value from 0 to 100. Higher values indicate better scores.
  Notes: Users would need to normalize internal scores to match this system upon generation and convert these scores to match their own internal system upon consumption.

tqtype
  Description: Top-level quality type, as defined in the specification
  Permissible values: Picklist value (see previous mails)
  Notes: ITS 2.0-compliant tools that use these categories would need to map their internal values to these types.

tqcode
  Description: An internal classification code for a quality issue as produced by the generating tool
  Permissible values: qname + text
  Notes: If we only allow one profile per document, we might be able to drop the qname portion and infer the data based on the tqprofile attribute.

tqcomment
  Description: A human-readable description of the quality issue
  Permissible values: Text
  Notes: Use of tqcomment would be strongly recommended in any case where the value "other" is used for tqtype.

tqseverity
  Description: A numerical value representing the severity of the issue, as defined by the model generating the metadata
  Permissible values: Number from 0 to 1 with up to two decimal places, with higher values indicating greater severity
  Notes: It is up to tools to map these numerical values to their own system. We can provide some informative guidelines for how this is to be done based on internal severity systems.

Potential attributes (we are less certain about these)

tqstage
  Description: A value to indicate the status of a particular issue in a review workflow
  Permissible values: Picklist, consisting of: translated | reviewed | rebuttal | agreed
  Notes: The precise meaning of these values remains to be defined.

tqthreshold
  Description: A value that defines a "passing" score for tqscore
  Permissible values: Integer value from 0 to 100
  Notes: A value of tqscore greater than or equal to the value of tqthreshold is deemed to have passed the quality assessment process. It may make sense to leave this as part of the description referred to in tqprofile, but having it here would allow processes to automate actions based on whether the file passes or not.

tqagent
  Description: An identifier for the agent that produced the quality results
  Permissible values: ??? Perhaps a picklist with human and machine as values
  Notes: Needs better definition
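
As a rough illustration of the normalization described in the tqscore notes and the pass rule described for tqthreshold, a minimal JavaScript sketch might look like the following. The function names and the linear rescaling are assumptions made for illustration and are not part of the proposal; the sketch also assumes that higher internal scores are better, so a tool that counts penalty points would need to invert its scale first.

```javascript
// Sketch only: names and the linear rescaling are assumptions,
// not something defined by the proposal.

// Rescale a tool-internal score onto the proposed 0-100 integer range.
function toTqScore(internal, internalMin, internalMax) {
  if (!(internalMax > internalMin)) {
    throw new RangeError("internalMax must be greater than internalMin");
  }
  // Clamp out-of-range inputs so the result always stays within 0-100.
  const clamped = Math.min(Math.max(internal, internalMin), internalMax);
  return Math.round(((clamped - internalMin) / (internalMax - internalMin)) * 100);
}

// Convert a tqscore back to the tool's own scale upon consumption.
function fromTqScore(tqscore, internalMin, internalMax) {
  return internalMin + (tqscore / 100) * (internalMax - internalMin);
}

// A score passes when tqscore is greater than or equal to tqthreshold.
function passes(tqscore, tqthreshold) {
  return tqscore >= tqthreshold;
}
```

Because tqscore is an integer, a round trip through toTqScore and fromTqScore can lose precision relative to the internal score, which may matter for tools with fine-grained scales.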

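The tqseverity value space (a number from 0 to 1 with up to two decimal places) could be checked and produced along the following lines. This is a sketch under stated assumptions: the pattern is one possible reading of the permissible values, and the label-to-number mapping is entirely hypothetical, standing in for whatever informative guidelines are eventually written.

```javascript
// One possible reading of "0 to 1 with up to two decimal places".
// Accepts "0", "1", "0.5", "0.25", "1.0", "1.00"; rejects "1.5", "0.123", ".5".
const TQ_SEVERITY_PATTERN = /^(?:0(?:\.\d{1,2})?|1(?:\.0{1,2})?)$/;

function isValidTqSeverity(value) {
  return TQ_SEVERITY_PATTERN.test(value);
}

// Hypothetical mapping from a tool's internal severity labels.
// The labels and numbers are invented for illustration only.
const EXAMPLE_SEVERITY_MAP = { minor: 0.1, major: 0.5, critical: 1.0 };

// Return the attribute string for a label, always with one decimal place.
function severityFromLabel(label) {
  if (!Object.prototype.hasOwnProperty.call(EXAMPLE_SEVERITY_MAP, label)) {
    throw new RangeError("unknown severity label: " + label);
  }
  return EXAMPLE_SEVERITY_MAP[label].toFixed(1);
}
```

Emitting a fixed number of decimal places is a deliberate choice here: a CSS attribute selector compares strings, so a rule written for the value "1.0" would not match an element carrying "1" or "1.00".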

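To make the "qname + text" form of tqcode concrete, here is a small sketch that splits a value such as "dfki:target_equals_source" into its two parts. It assumes the qname is a prefix separated from the code by the first colon, which is an interpretation of the proposal rather than something it states; the function name is hypothetical. A value without a colon is treated as a bare code, which corresponds to the idea of dropping the qname portion when there is a single profile per document.

```javascript
// Assumption: the qname is the part before the first colon.
function parseTqCode(tqcode) {
  const idx = tqcode.indexOf(":");
  if (idx === -1) {
    // No prefix: the profile would be inferred from tqprofile instead.
    return { prefix: null, code: tqcode };
  }
  return { prefix: tqcode.slice(0, idx), code: tqcode.slice(idx + 1) };
}
```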

SAMPLE FILE

And here is a sample HTML5 file that uses the "definite" attributes as local attributes, along with a rendering that uses CSS to show what a simple tool could do:

The values of tqprofile, tqscore, and tqcode are simply made up for the purpose of illustration. There is no "real" metric behind these.

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Telharmonium 1897</title>
    <meta name="tqprofile" content="http://www.dfki.de" />
    <meta name="tqscore" content="56" />
    <style type="text/css">
      [tqtype]{
        border:1px solid green;
        margin:2px;
      }
      [tqtype = untranslated]{
        background-color:red;
      }
      [tqtype = whitespace]{
        background-color:yellow;
      }
      [tqtype = inconsistent-entities]{
        background-color:#9DFFE1;
      }
      [tqtype = spelling]{
        background-color:#FFE2F7;
      }
      [tqseverity = "1.0"]{
        border: 6px solid red;
      }
    </style>
  </head>
  <body>
    <h1 id="h0001" tqtype="untranslated" tqcode="dfki:target_equals_source"> Telharmonium (1897)</h1>
    <p id="p0001">
      <span class="segment" id="s0001"><span tqtype="inconsistent-entities" tqcode="dfki:named_entity_not_found"
          tqnote="Should be Thomas Cahill. Why is Batman in the picture?" tqseverity="1.0">Christian
          Bale</span>
        <span tqtype="whitespace" tqcode="dfki:extra_space_around_punctuation" tqseverity="0.1">(1867 -
          1934)</span> conceived of an instrument that could transmit its sound from a power plant for
        hundreds of miles to listeners over telegraph wiring.</span>
      <span class="segment" id="s0002">Beginning in 1889 the sound quality of regular telephone concerts was very
        poor on account of the buzzing generated by carbon-granule microphones. As a result Cahill decided to
        set a new standard in perfection of sound <span tqtype="spelling" tqcode="dfki:spelling_error"
          tqseverity="0.5" tqnote="should be 'quality'">qulaity</span> with his instrument, a standard that
        would not only satisfy listeners but that would overcome all the flaws of traditional
        instruments.</span>
    </p>
  </body>
</html>

Telharmonium (1897)

Christian Bale (1867 - 1934) conceived of an instrument that could transmit its sound from a power plant for hundreds of miles to listeners over telegraph wiring. Beginning in 1889 the sound quality of regular telephone concerts was very poor on account of the buzzing generated by carbon-granule microphones. As a result Cahill decided to set a new standard in perfection of sound qulaity with his instrument, a standard that would not only satisfy listeners but that would overcome all the flaws of traditional instruments.

In this example, styling is applied to flag errors, with a heavy border around the most severe level. I'm sure a simple JavaScript snippet could also display the value of the tqnote attribute on hover, so this could easily be made more sophisticated using just basic browser technology.
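
One way the hover behaviour could be approached is to copy each tqnote value into the element's title attribute, which browsers display as a tooltip. The following is only a sketch: the function name is made up for illustration, and it relies on nothing beyond the standard querySelectorAll, getAttribute, and setAttribute DOM methods.

```javascript
// Copy each element's tqnote into its title attribute so that the
// browser shows it as a tooltip on hover. Returns the number of
// elements updated. The function name is hypothetical.
function showNotesOnHover(root) {
  const flagged = root.querySelectorAll("[tqnote]");
  let count = 0;
  for (const el of flagged) {
    el.setAttribute("title", el.getAttribute("tqnote"));
    count += 1;
  }
  return count;
}

// In a browser, run once the document has been parsed.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => showNotesOnHover(document));
}
```

Placed in a script element in the sample file, this would show "should be 'quality'" when hovering over the misspelled word, without any change to the markup itself.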

Any comments?

If there are no objections, I will start writing this all up in the spec format and ask for review of that, at which point I am sure there will be some thorny issues.

-Arle

Received on Thursday, 2 August 2012 10:35:46 UTC