Re: Atb.: [action 265] data category specific confidence scores from Felix Sasaki on 2012-11-21 (public-multilingualweb-lt@w3.org from November 2012)

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 21 Nov 2012 10:51:11 +0100
To: Mārcis Pinnis <marcis.pinnis@Tilde.lv>
CC: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, dave lewis <dave.lewis@cs.tcd.ie>, "tadej.stajner@ijs.si" <tadej.stajner@ijs.si>
Message-ID: <50ACA40F.9070501@w3.org>
Hi Marcis, all,

On 21.11.12 09:23, Mārcis Pinnis wrote:
>
> Hi Felix,
>
> If the base is xs:decimal, then - yes.
>
> The xs:decimal requires that the decimal separator is a point and not 
> a comma as used in some (including Latvian) locales... If that is not 
> explicitly required, users could (mis)use locale specific formats, but 
> that would cause compatibility problems. So ... (in my opinion) it is 
> important that the format is a restricted (inclusive interval of 
> [0;1]) variant of “xs:decimal”.
>


Thanks - I added this sentence to the definition of disambigConfidence, 
mtConfidence and termConfidence:

"The value follows the XML Schema decimal data type with the 
constraining facets minInclusive set to 0 and maxInclusive set to 1."
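
For implementers, here is a minimal sketch of what that sentence implies for a tool reading one of these attributes. It assumes Python; the helper name is_valid_confidence is hypothetical, and the regular expression follows the lexical pattern that XML Schema 1.1 Datatypes gives for xs:decimal. It is an illustration only, not text from the ITS 2.0 draft:

```python
import re
from decimal import Decimal

# Lexical form of xs:decimal: optional sign, decimal digits, and "." as the
# only permitted decimal separator (no comma, no exponent, no fractions).
_XS_DECIMAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def is_valid_confidence(text: str) -> bool:
    """Return True if text is an xs:decimal literal in the interval [0, 1]."""
    # xs:decimal collapses whitespace, so surrounding XML whitespace is ignored.
    literal = text.strip(" \t\r\n")
    if _XS_DECIMAL.fullmatch(literal) is None:
        return False
    # minInclusive = 0 and maxInclusive = 1, compared exactly (no float rounding).
    return Decimal(0) <= Decimal(literal) <= Decimal(1)
```

With this check, "0.5" and "1" are accepted, while the locale-specific "0,5", the fraction "1/3" and the repeating form "0.(3)" are rejected.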

Hope that this will be OK,

Felix

> Best regards,
>
> Mārcis ;o)
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Wednesday, November 21, 2012 10:15 AM
> *To:* Mārcis Pinnis
> *Cc:* public-multilingualweb-lt@w3.org; dave lewis; tadej.stajner@ijs.si
> *Subject:* Re: Atb.: [action 265] data category specific confidence scores
>
> Thanks, Marcis. Just to be sure: you are saying that the mtConfidence, 
> disambigConfidence and termConfidence should follow a definition like 
> this:
>
> <simpleType name='confidence'>
>    <restriction base='decimal'>
>      <minInclusive value='0'/>
>      <maxInclusive value='1'/>
>    </restriction>
> </simpleType>
>
>
> Best,
>
> Felix
>
> On 21.11.12 08:55, Mārcis Pinnis wrote:
>
>     Hi Felix,
>
>       
>
>     I went over the wording; there is only one thing that I find misleading and not well disambiguated.
>
>       
>
>     The following are all valid mathematical representations of rational numbers: 0.3; 1/3; 0.(3) ... and with locale specifics: 0,3; 0,(3)
>
>       
>
>     If the format specifies "a rational number in the interval 0 to 1" then I would assume all of these to be valid (as a mathematician). Maybe the wording should be changed to: "a finite decimal number" (this would also hardcode the number base...). Maybe it is irrelevant, maybe not, but if there is already a limitation of 0 to 1, then I think there should also be a clear indication of the base and format (maybe it is possible to borrow the definition of xs:decimal, http://www.w3.org/TR/xmlschema-2/#decimal).
>
>       
>
>     Best regards,
>
>     Mārcis ;o)
>
>       
>
>       
>
>     -----Original Message-----
>
>     From: Felix Sasaki [mailto:fsasaki@w3.org]
>
>     Sent: Tuesday, November 20, 2012 11:07 PM
>
>     To: public-multilingualweb-lt@w3.org; dave lewis; tadej.stajner@ijs.si; Mārcis Pinnis
>
>     Subject: Re: Atb.: [action 265] data category specific confidence scores
>
>       
>
>     Hi Dave, Marcis, Tadej, all,
>
>       
>
>     On 14.11.12 19:27, Tadej Stajner wrote:
>
>         Hi, Dave, Marcis,
>
>         (see below)
>
>           
>
>         On 11/14/2012 5:47 PM, Dave Lewis wrote:
>
>             thanks for the feedback, comment inline.
>
>               
>
>             On 13/11/2012 19:56, Mārcis Pinnis wrote:
>
>                 Hi Dave,
>
>                   
>
>                 1) I support your suggestion as drafted in the attachment.
>
>                 2) Although I believe there is a typing mistake:
>
>                   
>
>                 <p>And he said: you need a new <quote its:term="yes"
>
>                 its-info-term-ref="http://www.directron.com/motherboards1.html"
>
>                 its-term-confidence="0.5">motherboard</quote></p>
>
>                   
>
>                 I believe its-info-term-ref should actually be its-term-info-ref?!
>
>               
>
>             thanks for spotting that, we'll fix it.
>
>       
>
>     This should be fixed now at
>
>       
>
>     http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-terms-selector-4
>
>       
>
>               
>
>                   
>
>                 3) Also, just a comment (our systems won't be affected, but...) -
>
>                 why do you want to restrict the values to be from 0 to 1? In
>
>                 statistics it is quite common to use also LOG-scale probabilities
>
>                 (because of otherwise small numbers in some cases). Is it necessary
>
>                 to restrict users to a 0 to 1 interval? I would suggest leaving the
>
>                 decision up to the users. Also - the tools will have to be
>
>                 identified anyway. This means that the users will be able to
>
>                 identify (if needed) from the systems how to parse (understand) the
>
>                 confidence scores. This is a general question that applies to other
>
>                 confidence scores as well.
>
>               
>
>             In general, we do not attach inter-tool significance to the
>
>             confidence scores, hence the requirement to specify the tool using
>
>             its-tools-ref. Normalising the score 0-1 is therefore not intended
>
>             to support inter-tool comparisons, but rather to give the presenting
>
>             software a stable range/value to display.
>
>           
>
>         On that note, I'd suggest explicitly adding a sentence that the scores
>
>         are comparable only in the context of the same tool. It might be
>
>         obvious to us, but it's an important point.
>
>       
>
>       
>
>     I tried to add that to the 1st paragraph at http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation
>
>     for terminology, mt confidence and disambiguation.
>
>       
>
>     Best,
>
>       
>
>     Felix
>
>       
>
>           
>
>         -- Tadej
>
>           
>
>           
>
>               
>
>             For MT confidence scores the concerned implementers suggested 0..1,
>
>             the use of log-scale didn't come up. So for no deeper reason than
>
>             consistency I'd then suggest we keep the same for term and disambig
>
>             confidence scores, unless there is a pressing reason to do otherwise.
>
>               
>
>             cheers,
>
>             Dave
>
>               
>
>                 4) I agree that in the current proposal it would not be reasonable
>
>                 to add a confidence score as in multiple domain scenario it would be
>
>                 misleading/wrong and it would require a different solution (For
>
>                 instance, similar to how domains can be marked).
>
>               
>
>                 Best regards,
>
>                 Mārcis ;o)
>
>                   
>
>                 ________________________________________
>
>                 From: Dave Lewis [dave.lewis@cs.tcd.ie]
>
>                 Sent: Tuesday, 13 November 2012 19:30
>
>                 To: Multilingual Web LT Public List
>
>                 Subject: Fwd: [action 265] data category specific confidence scores
>
>                   
>
>                 Hi all,
>
>                 To try and wrap up this point:
>
>                   
>
>                 Summary of Discussion so far:
>
>                 1) text analytics annotation was proposed as a way of offering a
>
>                 confidence score for text analytics results. As with mtconfidence
>
>                 score, the tools annotation is now covered by the itsTool feature,
>
>                 but the proposal for confidence scores remains.
>
>                   
>
>                 2) Marcis pointed out, using real world terminology use cases, that
>
>                 we may have several annotations operating on the same fragment, so
>
>                 applying a confidence score to different text analytics annotations
>
>                 with a single data category won't work in these cases because of
>
>                 complete override.
>
>                   
>
>                 Also, if we used text analytics annotation with annotation from
>
>                 other data categories we are breaking our 'no dependencies between
>
>                 data categories' rule.
>
>                   
>
>                 3) We could overcome the complete override problems using standoff
>
>                 mark up as in loc quality issue and provenance. But as confidence
>
>                 score would be different for each annotated fragment, that would
>
>                 result in very big stand-off records, and we would still be breaking
>
>                 the data cat dependencies rule. So this doesn't seem a realistic
>
>                 option.
>
>                   
>
>                 4) So the suggestion discussed in Lyon was to drop text analytics
>
>                 annotation altogether as a separate data category and focus on
>
>                 adding confidence attributes to the existing data categories that
>
>                 would benefit from it.
>
>                   
>
>                 so.....
>
>                   
>
>                 Proposal:
>
>                 I therefore suggest the following and we need your feedback by
>
>                 Friday 16th Nov so we can wrap this up on the Monday call!
>
>                   
>
>                 For those extended with confidence score (terminology,
>
>                 disambiguation) please express your support and any comments by
>
>                 Friday - if we don't receive any we will definitely drop these
>
>                 suggestions. Marcis, Tadej in particular, please consider reviewing these.
>
>                   
>
>                 For exclusions (domain, localizationQualityIssue), this is your last
>
>                 chance to counter-argue in favour of including, otherwise assume
>
>                 these are dropped also.
>
>                   
>
>                 i) confidence for terminology: as suggested by Marcis
>
>                 (http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0028.html),
>
>                 revised data category as word revisions attached
>
>                 (addition to local definition, note on its-tools and example 38)
>
>                   
>
>                 ii) confidence for disambiguation: revised data category as word
>
>                 revisions attached (addition to local definition, note on its-tools
>
>                 and ex 52)
>
>                   
>
>                 iii) domain: I suggest excluding this as an annotation to which we
>
>                 attach a confidence score. It's not clear that the use of text
>
>                 analytics to identify domain, while feasible, actually represents a
>
>                 real use case for interoperability mark-up. If used, it would probably
>
>                 be internalized by the MT engine. Also, since there are multiple
>
>                 domain values the semantics of a single confidence score is unclear.
>
>                   
>
>                 iv) localizationQualityIssue: I suggest also excluding this as an
>
>                 annotation to which we attach confidence scores. The use of
>
>                 statistical text analytics doesn't seem common for QA tasks. One
>
>                 exception is the recent innovation by Digital Linguistics, whose
>
>                 Review Sentinel product ranks translations by a TA assessment for QA
>
>                 purposes - but this is innovative and not current practice, so it's
>
>                 probably not yet a concrete use case.
>
>                   
>
>                 cheers,
>
>                 Dave
>
>                   
>
>                   
>
>                   
>
>                   
>
>                   
>
>               
>
>               
>
>           
>
>           
>
>           
>
>       
>
>       
>
Received on Wednesday, 21 November 2012 09:51:34 UTC