- From: Felix Sasaki <fsasaki@w3.org>
- Date: Fri, 24 Aug 2012 10:50:15 +0200
- To: Yves Savourel <ysavourel@enlaso.com>, Des Oates <doates@adobe.com>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czrUXfhQZmsKaj3nmNLXQDPY0wj09mMw53Zoa4XtGviNLQ@mail.gmail.com>
Hi Yves, all,

we had similar issues during the development of ITS 1.0: what if existing information related to translation is not expressed via "yes" or "no", but via other, and maybe even more, values? With the global rules mechanism, you can do something like this:

<its:translateRule selector="//*[@translation-need &lt;= 0.5]" translate="no"/>
<its:translateRule selector="//*[@translation-need &gt; 0.5]" translate="yes"/>

This assumes that in the "translation-need" attribute, values below or equal to 0.5 carry the same semantics as ITS translate "no", and values above 0.5 the same semantics as translate "yes".

Now, I assume that in community-based workflows the situation would be similar: you would have a small set of values (e.g. 0-5). These could be mapped to the score we envisage, with six global rules:

<its:locQualityScoreRule selector="/doc[@my-own-score=5]" locQualityPrecisScore="100"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=4]" locQualityPrecisScore="80"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=3]" locQualityPrecisScore="60"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=2]" locQualityPrecisScore="40"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=1]" locQualityPrecisScore="20"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=0]" locQualityPrecisScore="0"/>

So if the main types of workflows you have in mind use values like the above, I think both requirements (a fixed set of values in ITS, and different values in the input data) could be fulfilled.

Best,

Felix

2012/8/23 Yves Savourel <ysavourel@enlaso.com>

> Hi Des, all,
>
> > ...Can I provide my own scoring range if I supply my
> > own locQualityPrecisProfileRef?
> > ...
> > In summary, I don't think it makes sense to constrain
> > the values of locQualityPrecisScore* if a user
> > provides a locQualityPrecisProfile* that can provide
> > semantic meaning to the score values that lie
> > outside the [0-100] range.
> This looks like the ever-problematic issue of providing either perfect
> interoperability for a few or partial interoperability for all.
>
> I would agree with you if locQualityPrecisProfile* pointed to a
> standardized resource that the user agent, *without other knowledge*,
> could parse to discover what to expect as values/range.
>
> But without a standard profile, if we allow different types of values
> based on just a URI, only the tools with specific knowledge of what that
> URI means for the values will be able to interact with them. The others
> will have no clue.
>
> The system with a simple 0.0 to 100.0 range is obviously also flawed,
> because one has to map the original values to a given range. That may be
> tricky.
>
> How would you map negative/positive N to 0-100?
> I'm guessing there has to be some high and low limit, even if it's MAXINT.
>
> In that case you could use a function such as the one we use in Okapi:
>
> /**
>  * Given a numeric range, normalize a value in that range onto a scale
>  * between 0 and 100.
>  * @param low lowest value of the range
>  * @param high highest value of the range
>  * @param value the value that needs to be mapped to 0-100
>  * @return the mapped value, a number between 0 and 100
>  */
> public static int normalizeRange(float low, float high, float value) {
>     float m = 0.0f;   // low end of the target range
>     float n = 100.0f; // high end of the target range
>     return (int) (m + ((value - low) / (high - low) * (n - m)));
> }
>
> Obviously we lose precision if the original range is wider than 100. For
> example, for an original range of -100 to 100 you get:
>
> normalizeRange(-100, 100, -15) == 42
> normalizeRange(-100, 100, -16) == 42
>
> Going to a decimal value would allow more precision.
>
> The other issue is that mapping back from 0-100 to negative/positive N is
> not going to work perfectly.
>
> But the idea is that all tools will get a meaningful value.
> Yes, with a loss of precision in some cases, but I would expect that for
> Score this is acceptable.
>
> This said, having the profile declared should help tools knowledgeable of
> that profile to interpret the values.
>
> My suggestion would be to define your own attribute in addition to
> locQualityPrecisScore, with the native value.
>
> a) If you have your own value in the ITS score attribute, with your
> profile declared:
>
> - tools knowing about your system can use the value directly.
> - tools not knowing about your system cannot use the value safely.
>
> b) If you have only the standardized ITS score and your profile declared:
>
> - tools knowing about your system can use the ITS value.
> - tools knowing about your system can (in some cases) map the ITS value
>   back to the native one.
> - tools not knowing about your system can use an ITS value that is
>   meaningful.
> - tools not knowing about your system can modify the ITS value in a way
>   that can be (for some systems) mapped back to your system.
>
> c) If you have both the standardized ITS score and your own value, and
> your profile declared:
>
> - tools knowing about your system can use the ITS value.
> - tools knowing about your system can use the real native value.
> - tools knowing about your system can update both values properly.
> - tools not knowing about your system can use an ITS value that is
>   meaningful.
> - tools not knowing about your system can modify the ITS value in a way
>   that can be (for some systems) mapped back to your system.
>
> Writing this email made me think: maybe there is a range other than
> 0.0-100.0 that is more suitable for mapping back and forth to other
> ranges? Maybe -100 to 100? Anyone good at math have a suggestion?
>
> I don't necessarily want 0-100, but I think having a standardized range is
> important.
>
> Cheers,
> -yves

--
Felix Sasaki
DFKI / W3C Fellow
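[Editor's note: the mappings discussed in this thread can be sketched in Python. `normalize_range` is a port of the Okapi Java function Yves quotes; `denormalize_range` is a hypothetical inverse, not part of Okapi, added only to illustrate why the round trip is lossy. The final loop shows that Felix's six global rules amount to the same linear mapping applied to the 0-5 scale.]

```python
def normalize_range(low: float, high: float, value: float) -> int:
    """Map `value` from [low, high] onto an integer in [0, 100],
    mirroring the Okapi normalizeRange function quoted above."""
    m, n = 0.0, 100.0  # target range
    return int(m + (value - low) / (high - low) * (n - m))

def denormalize_range(low: float, high: float, score: int) -> float:
    """Illustrative inverse (not part of Okapi): map a 0-100 score
    back onto [low, high]."""
    return low + score / 100.0 * (high - low)

# Precision loss: two distinct native values collapse to one score...
print(normalize_range(-100, 100, -15))   # 42
print(normalize_range(-100, 100, -16))   # 42

# ...so mapping back cannot recover both originals.
print(denormalize_range(-100, 100, 42))  # -16.0

# Felix's six locQualityScoreRule entries are this same linear mapping
# applied to a 0-5 community score: 0->0, 1->20, ..., 5->100.
for my_own_score in range(6):
    print(my_own_score, "->", normalize_range(0, 5, my_own_score))
```

This makes the trade-off in the thread concrete: any fixed target range gives every tool a usable value, but once two native values map to the same score, no choice of range (0-100, -100 to 100, or otherwise) can make the back-mapping exact.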
Received on Friday, 24 August 2012 08:50:49 UTC