RE: Call for consensus - Localization Quality Précis (related to [ISSUE-34])

Hi Des, all,

> ...Can I provide my own scoring range if I supply my 
> own locQualityPrecisProfileRef?
> ...
> In summary, I don't think it makes sense to constrain 
> the values of locQualityPreciseScore* if a user 
> provides a locQualityPrecisProfile* that can provide 
> semantic meaning to the score values that lie 
> outside the [0-100] range.

This looks like the ever-problematic choice between perfect interoperability for a few and partial interoperability for all.

I would agree with you if locQualityPrecisProfile* pointed to a standardized resource that a user agent, *without other knowledge*, could parse to discover what value/range to expect.

But without a standard profile, if we allow different types of values based on just a URI, only the tools with specific knowledge of what that URI means for the values will be able to interact with them. The others will have no clue.

The system with a simple 0.0 to 100.0 range is obviously also flawed, because one has to map the original values to a given range. That may be tricky.

How would you map negative/positive N to 0-100?
I'm guessing there has to be some high and low limit, even if it's MAXINT.

In that case you could use a function such as the one we use in Okapi:

/**
 * Given a range and a value, normalize that value to an integer on a scale between 0 and 100.
 * @param low - lowest value of the range
 * @param high - highest value of the range
 * @param value - the value that needs to be mapped to 0-100
 * @return mapped value, a number between 0 and 100
 */ 
public static int normalizeRange(float low, float high, float value) {
 float m = 0.0f; // low value  of map to range
 float n = 100.0f; // high value of map to range
 return (int) (m +((value-low)/(high-low) * (n-m)));
}

Obviously, we lose precision if the original range is wider than 100. For example, for an original range of -100 to 100 you get:

normalizeRange(-100, 100, -15) == 42
normalizeRange(-100, 100, -16) == 42

Going to a decimal value would allow more precision.
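
To illustrate the decimal option, here is a minimal sketch of the same mapping returning a double instead of an int. This is not the Okapi code: the class name ScoreRange and the argument check are my own assumptions, added only for the example.

```java
public final class ScoreRange {

    private ScoreRange() {
    }

    /**
     * Maps a value from [low, high] onto [0.0, 100.0] without truncating
     * the result to an integer. Hypothetical variant of normalizeRange.
     */
    public static double normalizeRange(double low, double high, double value) {
        if (high <= low) {
            // Assumption: an empty or inverted range is treated as an error.
            throw new IllegalArgumentException("high must be greater than low");
        }
        return (value - low) / (high - low) * 100.0;
    }

    public static void main(String[] args) {
        // The two inputs that collapsed to the same integer now stay distinct.
        System.out.println(normalizeRange(-100, 100, -15));
        System.out.println(normalizeRange(-100, 100, -16));
    }
}
```

With this variant -15 and -16 no longer map to the same score, so a decimal score keeps the distinction that the integer version drops.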


The other issue is that mapping back from 0-100 to negative/positive N is not going to work perfectly.
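
A small sketch of why the round trip is lossy, assuming the integer mapping above and a straightforward inverse. The denormalizeRange function and the RoundTrip class are hypothetical, written just for this example.

```java
public final class RoundTrip {

    private RoundTrip() {
    }

    /** Same arithmetic as the integer normalizeRange shown earlier. */
    public static int normalizeRange(float low, float high, float value) {
        return (int) ((value - low) / (high - low) * 100.0f);
    }

    /** Hypothetical inverse: maps a 0-100 score back onto [low, high]. */
    public static float denormalizeRange(float low, float high, int score) {
        return low + (score / 100.0f) * (high - low);
    }

    public static void main(String[] args) {
        int score = normalizeRange(-100, 100, -15);
        float back = denormalizeRange(-100, 100, score);
        // The truncation in normalizeRange means 'back' lands near -16,
        // not on the original -15.
        System.out.println(score + " -> " + back);
    }
}
```

Any native value whose mapped score was truncated comes back as a neighboring value, which is the imperfection mentioned above.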

But the idea is that all tools will get a meaningful value. Yes, with a loss of precision in some cases, but I would expect that for Score this is acceptable.

This said, having the profile declared should help tools knowledgeable of that profile to 

My suggestion would be to define your own attribute in addition to locQualityPrecisScore, with the native value.


a) If you have your own value in the ITS score attribute, with your profile declared:

- tools knowing about your system can use the value directly.
- tools not knowing about your system cannot use the value safely.


b) If you have only the standardized ITS score and your profile declared:

- tools knowing about your system can use the ITS value.
- tools knowing about your system can (in some cases) map the ITS value back to the native one.
- tools not knowing about your system can use an ITS value that is meaningful.
- tools not knowing about your system can modify the ITS value in a way that can be (for some systems) mapped back to your system.


c) If you have both the standardized ITS score and your own value and your profile declared:

- tools knowing about your system can use the ITS value.
- tools knowing about your system can use the real native value.
- tools knowing about your system can update both values properly.
- tools not knowing about your system can use an ITS value that is meaningful.
- tools not knowing about your system can modify the ITS value in a way that can be (for some systems) mapped back to your system.
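
To make case (c) a bit more concrete, here is a rough sketch of how a tool might choose between the two values. Everything in it is an assumption for illustration: the ScoreAnnotation class, its field names, and the idea that a tool keeps a set of profile URIs it understands are not part of ITS or of any existing library.

```java
import java.util.Set;

public final class ScoreAnnotation {

    private final String profileRef;  // declared profile reference (a URI)
    private final double itsScore;    // standardized score, 0.0 to 100.0
    private final Double nativeScore; // hypothetical extra attribute; null if absent

    public ScoreAnnotation(String profileRef, double itsScore, Double nativeScore) {
        this.profileRef = profileRef;
        this.itsScore = itsScore;
        this.nativeScore = nativeScore;
    }

    /**
     * Returns the native value when the tool knows the declared profile
     * and a native value is present; otherwise falls back to the ITS score.
     */
    public double scoreFor(Set<String> knownProfiles) {
        boolean knowsProfile = profileRef != null && knownProfiles.contains(profileRef);
        if (knowsProfile && nativeScore != null) {
            return nativeScore.doubleValue();
        }
        return itsScore;
    }
}
```

A tool that does not know the profile simply passes an empty set and always gets the standardized value, which matches the last two bullets of each case.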

Writing this email made me think: maybe there is a range other than 0.0-100.0 that is more suitable for mapping back and forth to other ranges? Maybe -100 to 100? Does anyone good at math have a suggestion?

I don't necessarily want 0-100, but I think having a standardized range is important.

Cheers,
-yves

Received on Thursday, 23 August 2012 17:35:59 UTC