- From: Felix Sasaki <fsasaki@w3.org>
- Date: Fri, 10 Aug 2012 17:16:49 +0200
- To: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czrH+cvWc-rqV0=P-AV7Gjs7D7YNrb6aY187wO0ge3FQFg@mail.gmail.com>
Hi Yves, all,

co-chair hat on: see the question below, which is relevant to all implementors of quality.

2012/8/10 Yves Savourel <ysavourel@enlaso.com>

> Hi Arle,
>
> Comments inline.
>
> *From:* Arle Lommel [mailto:arle.lommel@dfki.de]
> *Sent:* Friday, August 10, 2012 4:46 AM
> *To:* Multilingual Web LT Public List
> *Subject:* [ISSUE-34] "Pieces of information" for quality for agreement
>
> Based on the discussion in yesterday's meeting, I am sending the following list of "pieces of information" to see if we have agreement on them. When we have agreement on which pieces we are implementing, then we can return to the actual structure of how they are represented.
>
> It seems that we will need two separate data categories, locQualityProfile and locQualityIssue. (Issues in green are ones where we do not seem to have consensus on adopting them.)
>
> YS> It seems your note about locQualityIssueProfileRef and locQualityProfileDescrip, at the bottom, implies that if we have two data categories, the implementers must implement both. That would be a first in ITS where normally data categories can work alone.
>
> The structure for *locQualityProfile* is:
>
> - *locQualityProfileDescrip*: A QName that provides a prefix for the profile (which can be used to refer to the profile) and a URI where more information about the tool/profile can be found. (Default: human:human)
> - *locQualityProfileScore* (optional): A score as generated by the tool or model referenced in locQualityProfileDescrip.
>   No default value defined.
> - *locQualityProfileThreshold* (optional): Defines what score constitutes a "passing" score according to the model/tool used.
>
> *Open question: can the above be treated as provenance or otherwise unified with text analytics, which has a similar need as this category (see Issue-42)?*
>
> The structure for *locQualityIssue* is *at least* one of the following:
>
> - *locQualityIssueProfileRef*: Contains a text pointer to a locQualityProfileDescrip-defined prefix to bind locQualityIssue to a specific profile. Default is human. Normal inheritance applies. For example, if the code <body its-loc-quality-issue-profile="something"> appears in an HTML file, then all locQualityIssue instances within the body would inherit the value of "something" unless it is specifically overridden. (I realize this is already implementation-specific, but it illustrates the point.)
> - *locQualityIssueType*: A value from the picklist that identifies the generic issue type. (Default: unclassified)
> - *locQualityIssueCode*: A tool-specific code that corresponds to the value of locQualityIssueType. *(Note: Yves now thinks this is unnecessary because the values are not constrained. Arle thinks this is needed even if the values are not constrained…)*
>
> YS> Note that if we drop locQualityIssueType we don’t need QNames, locQualityIssueProfileRef can be just a URI and can truly separate profile from issues.
>
> - *locQualityIssueComment*: A human-readable note about the issue
> - *locQualityIssueSeverity*: A value corresponding to the severity of the error. *(The initial proposal was for this to be a numeric value, but Des and David both argue that this should be a free value. If this is the case, there is no guarantee of interoperability at all between values.
>   E.g., what would a tool make of a value such as "severe" if there is no correlate to know what severe means in its own system? It is conceivable that the document pointed to in the profile could define values, but we are not defining what the profile itself looks like.)*
>
> YS> IMO the values for locQualityIssueSeverity should be 0-100 or some similar numeric range. I think most forms of severity can be mapped to that. For example CheckMate uses “high”, “medium” and “low” (actually we use colors, but internally it’s a 3-values system), and I think I can map those display values to ranges of values. Sure the implementation will require some tune-up to store the ITS original values to make sure they are preserved, but that’s the price we’ll happily pay for interoperability.
>
> As Arle points out, using a free value would break most severity-related operations on issues coming from different tools. For example, how to sort them?
>
> - *locQualityIssueSuggestion*: A machine-readable suggestion for how to resolve the issue. *(Felix is concerned that the complexity of a machine-readable solution might be too high)*
> - *locQualityIssueStatus*: An indicator of whether an issue is active or resolved. Possible values: active|resolved|rejected
>
> YS> I don’t agree with the current locQualityIssueStatus. IMO a simple enabled/disabled flag is a better way to go. It allows the necessary means to handle false-positives when doing recurring checks. I wouldn’t know what to do with a workflow-type status.
>
> - *locQualityIssueStage*: An indication of where in a workflow the issue is. *(Des notes that we do not want fixed values for this. Arle questions whether it is needed if we have the issue stage since open values are not interoperable.)*
>
> YS> I wouldn’t know what to do with that one.
>
> - *locQualityIssueAgent*: An identifier for the agent that produced the issue.
>   Possible values: human|machine *(Arle: if we have the locQualityIssueProfileRef, I think we don't need this since that is a more robust solution.)*
>
> To move forward with this, if you are considering implementing these data categories, which pieces do you consider essential enough to implement? As long as we have the two parts (a profile and an issue), it seems that the *locQualityIssueProfileRef* (what a horrible name!) and the *locQualityProfileDescrip* are required since the structure falls apart without them. But beyond those, will we have commitments to implement any of these particular pieces?
>
> YS> So as far as implementation, here is my best guess:
>
> CheckMate does not have a use for the locQualityProfile data category, so we might implement it if time/resources permit, but we would limit that to the ITS engine library and not use it ourselves.
>
> We would certainly be very keen on implementing the locQualityIssue data category.
>
> The minimal attributes IMO would be: locQualityIssueComment and locQualityIssueType.
>
> locQualityIssueSeverity would be a big plus (as long as the values are interoperable)
>
> locQualitySuggestion would be nice too.
>
> A locQualityIssueEnabled=’yes/no’ instead of the locQualityIssueStatus would be nice as well.
>
> And last locQualityIssueProfileRef (ugly name indeed).
>
> Any other information we would handle because it’s part of the data category, but we would not use them.

Who would implement - in addition to Enlaso - locQualityIssueComment, locQualityIssueType, locQualityIssueSeverity, locQualitySuggestion, locQualityIssueEnabled=’yes/no’, locQualityIssueProfileRef? A subset of these is fine too. We basically need to know: for which of these items would we have at least two implementations?
Another question: who would need and implement additional items? Which ones?

Best,

Felix

> I hope this helps,
>
> -yves

--
Felix Sasaki
DFKI / W3C Fellow
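For readers weighing the severity question in the thread, here is a rough sketch of what Yves describes: mapping a three-level display scale ("high", "medium", "low") onto a 0-100 numeric range so that issues from different tools can be sorted together. Nothing here comes from an ITS draft or from CheckMate itself. The range boundaries (34 and 67), the representative values (17, 50, 83), and all names (`Issue`, `label_to_severity`, `severity_to_label`, `sort_by_severity`) are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical representative values: the thread proposes a 0-100 range
# but does not fix where "low", "medium" and "high" fall within it.
LABEL_TO_SEVERITY = {"low": 17, "medium": 50, "high": 83}


def label_to_severity(label: str) -> int:
    """Map a tool-internal label to a numeric severity (assumed values)."""
    return LABEL_TO_SEVERITY[label.lower()]


def severity_to_label(value: float) -> str:
    """Map a 0-100 severity back to a three-level label (assumed boundaries)."""
    if not 0 <= value <= 100:
        raise ValueError(f"severity out of range: {value}")
    if value < 34:
        return "low"
    if value < 67:
        return "medium"
    return "high"


@dataclass
class Issue:
    tool: str        # which tool or person reported the issue
    comment: str     # human-readable note about the issue
    severity: float  # numeric severity on the assumed 0-100 scale


def sort_by_severity(issues):
    """Return issues ordered from most to least severe."""
    return sorted(issues, key=lambda issue: issue.severity, reverse=True)
```

With a shared numeric scale, issues reported by different tools can be merged and sorted without either tool knowing the other's labels. With free-text values such as "severe", neither the round trip nor the sort would be well defined, which is the interoperability concern raised above.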
Received on Friday, 10 August 2012 15:17:15 UTC