
Re: [ACTION-17] Add section on confidence score on NLP (e.g. TA and MT) component output

From: Arle Lommel <arle.lommel@gmail.com>
Date: Tue, 3 Apr 2012 14:11:18 +0200
Message-Id: <A239506C-5F2A-45CD-89E4-8A3E23F814A0@gmail.com>
Cc: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
To: "dave.lewis@cs.tcd.ie" <dave.lewis@cs.tcd.ie>
Dave, your proposal sounds good. I've proposed elsewhere, in a few cases, using the same numerical data model, since multiple tiered data models are often used for these sorts of things. Only by using a numerical equivalence can we hope to support blind mapping between them, although how to resolve the complexities of such mapping (e.g., how to deal with a score of 0.5 if your relevant score points are 0.33... and 0.66...) is not obvious.

But in general, for anything where there is not an obvious set of allowable values, I think it makes sense to consistently use that numerical scale.
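
A minimal sketch of the mapping problem described above, assuming each tiered model places its levels at evenly spaced points on the 0.0 to 1.0 scale. The names `tier_points` and `nearest_tiers` are hypothetical, chosen only for illustration, and are not part of any proposal. Exact fractions are used so that a genuine tie is not hidden by floating-point rounding.

```python
from fractions import Fraction


def tier_points(n_tiers):
    """Evenly spaced points on [0, 1] for a scale with n_tiers levels (assumed layout)."""
    if n_tiers < 2:
        raise ValueError("a tiered scale needs at least two levels")
    return [Fraction(i, n_tiers - 1) for i in range(n_tiers)]


def nearest_tiers(score, n_tiers):
    """Return the index of every tier closest to score.

    More than one index means the score falls exactly between two tiers,
    so a blind mapping has no unique answer.
    """
    score = Fraction(score)
    if not 0 <= score <= 1:
        raise ValueError("score must lie between 0.0 and 1.0")
    points = tier_points(n_tiers)
    best = min(abs(score - p) for p in points)
    return [i for i, p in enumerate(points) if abs(score - p) == best]


# A four-level scale has points 0, 1/3, 2/3 and 1, so 0.5 is equidistant
# from the second and third levels.
print(nearest_tiers(Fraction(1, 2), 4))
# A three-level scale has a point at exactly 1/2, so the mapping is unique.
print(nearest_tiers(Fraction(1, 2), 3))
```

Any rule for breaking the tie (round up, round down, or refuse to map) would be a policy decision layered on top of the shared numerical scale, not something the scale itself settles.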

Arle

--
Arle Lommel
Phone (U.S.): +1.707.709.8650
Skype: arle_lommel
LinkedIn: http://www.linkedin.com/in/arlelommel

On Apr 3, 2012, at 13:49, Dave Lewis <dave.lewis@cs.tcd.ie> wrote:

> Hi all,
> 
> To try and keep this issue specific to the technologies concerned, I've
> added a separate comment to the 'text analysis annotation' requirement
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#text_analysis_annotation
> as this would seem to be the relevant data category currently.
> 
> There is already an explicit data category requirement related to
> confidence score for MT
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#mt_confidence_score
> 
> I added to this to include a discussion of the data model, suggesting it
> be a numeric value between 0.0 and 1.0.
> 
> I will close this action.
> Dave
> 
> 
Received on Tuesday, 3 April 2012 12:11:53 UTC
