Re: [All implementors question] Re: [ISSUE-34] "Pieces of information" for quality for agreement from Phil Ritchie on 2012-08-12 (public-multilingualweb-lt@w3.org from August 2012)

From: Phil Ritchie <philr@vistatec.ie>
Date: Sun, 12 Aug 2012 21:49:30 +0100
To: Felix Sasaki <fsasaki@w3.org>
Cc: public-multilingualweb-lt@w3.org
Message-ID: <OFD7FE89EB.3BC77108-ON80257A58.00706313-80257A58.00726612@vistatec.ie>
All

I plan to implement both data categories: locQualityProfile and 
locQualityIssue. I'm also hoping to be able to involve McAfee but their 
commitment will probably happen at Call for Consensus.

For locQualityIssue the important attributes to me are 
locQualityIssueProfileRef, locQualityIssueType, locQualityIssueSeverity 
and locQualityIssueComment.

I'm still considering many of the green and red questions below but at the 
moment I'm leaning toward the most interoperable implementations.

Phil.





From:   Felix Sasaki <fsasaki@w3.org>
To:     public-multilingualweb-lt@w3.org, 
Date:   10/08/2012 16:17
Subject:        [All implementors question] Re: [ISSUE-34] "Pieces of 
information"  for quality for agreement



Hi Yves, all,

co-chair hat on, see question below, which is relevant to all implementors 
of quality.

2012/8/10 Yves Savourel <ysavourel@enlaso.com>
Hi Arle,
 
Comments inline.
 
 
From: Arle Lommel [mailto:arle.lommel@dfki.de] 
Sent: Friday, August 10, 2012 4:46 AM
To: Multilingual Web LT Public List
Subject: [ISSUE-34] "Pieces of information" for quality for agreement
 
Based on the discussion in yesterday's meeting, I am sending the following 
list of "pieces of information" to see if we have agreement on them. When 
we have agreement on which pieces we are implementing, then we can return 
to the actual structure of how they are represented.
 
It seems that we will need two separate data categories, locQualityProfile 
and locQualityIssue. (Issues in green are ones where we do not seem to have 
consensus on adopting them.)
 
YS> It seems your note about locQualityIssueProfileRef and 
locQualityProfileDescrip, at the bottom, implies that if we have two data 
categories, the implementers must implement both. That would be a first in 
ITS where normally data categories can work alone.
 
 
The structure for locQualityProfile is:
locQualityProfileDescrip: A QName that provides a prefix for the profile 
(which can be used to refer to the profile) and a URI where more 
information about the tool/profile can be found. (Default: human:human)
locQualityProfileScore (optional): A score as generated by the tool or 
model referenced in locQualityProfileDescrip. No default value 
defined.
locQualityProfileThreshold (optional): Defines what score constitutes a 
"passing" score according to the model/tool used.
 
Open question: can the above be treated as provenance or otherwise unified 
with text analytics, which has a similar need as this category (see 
Issue-42)?
 
The structure for locQualityIssue is at least one of the following:
 
locQualityIssueProfileRef: Contains a text pointer to a 
locQualityProfileDescrip-defined prefix to bind locQualityIssue to a 
specific profile. Default is human. Normal inheritance applies. For 
example, if the code <body its-loc-quality-issue-profile="something"> 
appears in an HTML file, then all locQualityIssue instances within the 
body would inherit the value of "something" unless it is specifically 
overridden. (I realize this is already implementation-specific, but it 
illustrates the point.)
locQualityIssueType: A value from the picklist that identifies the generic 
issue type. (Default: unclassified)
locQualityIssueCode: A tool-specific code that corresponds to the value of 
locQualityIssueType. (Note: Yves now thinks this is unnecessary because 
the values are not constrained. Arle thinks this is needed even if the 
values are not constrained…)
YS> Note that if we drop locQualityIssueType we don’t need QNames, 
locQualityIssueProfileRef can be just a URI and can truly separate profile 
from issues.
locQualityIssueComment: A human-readable note about the issue
locQualityIssueSeverity: A value corresponding to the severity of the 
error. (The initial proposal was for this to be a numeric value, but Des 
and David both argue that this should be a free value. If this is the 
case, there is no guarantee of interoperability at all between values. 
E.g., what would a tool make of a value such as "severe" if there is no 
correlate to know what severe means in its own system. It is conceivable 
that the document pointed to in the profile could define values, but we 
are not defining what the profile itself looks like.)
YS> IMO the values for locQualityIssueSeverity should be 0-100 or some 
similar numeric range. I think most forms of severity can be mapped to 
that. For example, CheckMate uses “high”, “medium” and “low” (actually we 
use colors, but internally it’s a 3-value system), and I think I can map 
those display levels to ranges of values. Sure, the implementation will 
require some tune-up to store the original ITS values to make sure they are 
preserved, but that’s the price we’ll happily pay for interoperability.
As Arle points out, using a free value would break most severity-related 
operations on issues coming from different tools. For example how to sort 
them? 
locQualityIssueSuggestion: A machine-readable suggestion for how to 
resolve the issue. (Felix is concerned that the complexity of a 
machine-readable solution might be too high)
locQualityIssueStatus: An indicator of whether an issue is active or 
resolved. Possible values: active|resolved|rejected
YS> I don’t agree with the current locQualityIssueStatus. IMO a simple 
enabled/disabled flag is a better way to go. It allows the necessary means 
to handle false-positives when doing recurring checks. I wouldn’t know 
what to do with a workflow-type status.
locQualityIssueStage: An indication of where in a workflow the issue is. 
(Des notes that we do not want fixed values for this. Arle questions 
whether it is needed if we have the issue stage since open values are not 
interoperable.)
YS> I wouldn’t know what to do with that one.
locQualityIssueAgent: An identifier for the agent that produced the issue. 
Possible values: human|machine (Arle: if we have the 
locQualityIssueProfileRef, I think we don't need this since that is a more 
robust solution.)
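
The inheritance behaviour described for locQualityIssueProfileRef can be sketched as follows. This is only an illustration, not a proposed implementation: the attribute name is the one from the example above (which is itself not final), the sample markup and the helper resolve_profiles are made up for the sketch, and the fallback of "human" follows the default stated in the list.

```python
import xml.etree.ElementTree as ET

# Attribute name taken from the example in this thread; illustrative only,
# not a finalized ITS attribute.
PROFILE_ATTR = "its-loc-quality-issue-profile"
DEFAULT_PROFILE = "human"  # default stated in the proposal


def resolve_profiles(markup):
    """Return {element id: effective profile} for every element with an id."""
    root = ET.fromstring(markup)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}

    def effective(elem):
        # Walk up until some ancestor (or the element itself) sets the value.
        while elem is not None:
            value = elem.get(PROFILE_ATTR)
            if value is not None:
                return value
            elem = parents.get(elem)
        return DEFAULT_PROFILE

    return {e.get("id"): effective(e) for e in root.iter() if e.get("id")}


# Hypothetical, well-formed sample so that an XML parser can read it.
SAMPLE = """<html>
  <head><title id="t">Demo</title></head>
  <body its-loc-quality-issue-profile="something">
    <p id="a">Inherits from body.</p>
    <p id="b" its-loc-quality-issue-profile="other">Overrides it.</p>
  </body>
</html>"""

profiles = resolve_profiles(SAMPLE)
print(profiles)
```

Here the first paragraph picks up "something" from the body, the second overrides it locally, and the title (outside the body) falls back to the default.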
 
To move forward with this, if you are considering implementing these data 
categories, which pieces do you consider essential enough to implement? As 
long as we have the two parts (a profile and an issue), it seems that the 
locQualityIssueProfileRef (what a horrible name!) and the 
locQualityProfileDescrip are required since the structure falls apart 
without them. But beyond those, will we have commitments to implement any 
of these particular pieces?
 
YS> So as far as implementation, here is my best guess:
 
CheckMate does not have a use for the locQualityProfile data category, so 
we might implement it if time/resources permit, but we would limit that to 
the ITS engine library and not use it ourselves.
 
We would certainly be very keen on implementing the locQualityIssue data 
category.
 
The minimal attributes IMO would be: locQualityIssueComment and 
locQualityIssueType.
 
locQualityIssueSeverity would be a big plus (as long as the values are 
interoperable)
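
As a rough sketch of what interoperable severity values could look like in practice, assuming the 0-100 range suggested earlier in the thread: a tool with three internal levels maps them onto the numeric scale and maps incoming numbers back to its own levels. The representative values (20, 55, 90), the range boundaries (34, 67), and the sample issues are assumptions chosen purely for illustration.

```python
# Hypothetical mapping from a three-level scheme to a 0-100 scale. The
# representative values and range boundaries are assumptions.
LEVEL_TO_SCORE = {"low": 20, "medium": 55, "high": 90}


def to_numeric(level):
    """Map a tool-internal level name to a 0-100 severity."""
    return LEVEL_TO_SCORE[level]


def to_level(score):
    """Map any 0-100 severity back to the tool's three display levels."""
    if not 0 <= score <= 100:
        raise ValueError("severity must be within 0-100")
    if score < 34:
        return "low"
    if score < 67:
        return "medium"
    return "high"


# Issues from two hypothetical tools: A uses three levels, B is numeric.
issues = [
    {"tool": "A", "comment": "double space", "severity": to_numeric("low")},
    {"tool": "B", "comment": "missing number", "severity": 80},
    {"tool": "A", "comment": "untranslated segment",
     "severity": to_numeric("high")},
]

# A shared numeric scale makes sorting across tools straightforward.
by_severity = sorted(issues, key=lambda i: i["severity"], reverse=True)
print([i["comment"] for i in by_severity])
```

Note that mapping a foreign value such as 80 to "high" loses the original number, which is why a tool would also need to store the ITS value it received if it wants to write it back unchanged.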
 
locQualitySuggestion would be nice too.
 
A locQualityIssueEnabled=’yes/no’ instead of the locQualityIssueStatus 
would be nice as well.
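
To illustrate how such an enabled/disabled flag could handle false positives across recurring checks, here is a minimal sketch. Everything in it is hypothetical: the function recheck, the dictionary layout, and especially the use of (segment, type) as the key that identifies "the same issue" between runs, which nothing in this thread defines.

```python
def recheck(detected, disabled_keys):
    """Flag re-detected issues as disabled if they were dismissed before."""
    result = []
    for issue in detected:
        key = (issue["segment"], issue["type"])  # hypothetical identity key
        result.append({**issue, "enabled": key not in disabled_keys})
    return result


# Keys of issues a reviewer marked as false positives in an earlier run.
dismissed = {("s2", "terminology")}

# Output of a fresh run of the same checks (made-up data).
detected = [
    {"segment": "s1", "type": "whitespace"},
    {"segment": "s2", "type": "terminology"},
]

checked = recheck(detected, dismissed)
active = [i for i in checked if i["enabled"]]
print(active)
```

The dismissed issue is still recorded, only with the flag turned off, so a later run does not report it again as new.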
 
And last locQualityIssueProfileRef (ugly name indeed).
 
Any other information we would handle because it’s part of the data 
category, but we would not use them.


Who would implement - in addition to Enlaso - locQualityIssueComment, 
locQualityIssueType, locQualityIssueSeverity, locQualitySuggestion, 
locQualityIssueEnabled=’yes/no’, locQualityIssueProfileRef?
A subset of these is fine too. We basically need to know: for which of 
these items would we have at least two implementations?

Another question: who would need and implement additional items? Which 
ones?

Best,

Felix

 
 
I hope this helps,
-yves
 



-- 
Felix Sasaki
DFKI / W3C Fellow



Received on Sunday, 12 August 2012 20:50:03 UTC