Re: Question on toolsRef for disambigConfidence and termConfidence (Re: Atb.: [action 265] data category specific confidence scores) from Felix Sasaki on 2012-11-21 (public-multilingualweb-lt@w3.org from November 2012)

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 21 Nov 2012 11:04:48 +0100
To: public-multilingualweb-lt@w3.org
Message-ID: <50ACA740.4050706@w3.org>
Thanks, Tadej and Marcis - I updated the examples.

- Felix
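
For anyone checking their own markup against the rule quoted further down (a node carrying termConfidence or disambigConfidence has to be contained in an element with toolsRef), here is a minimal Python sketch. It is not taken from the spec or from any ITS tool. The function name, the sample documents and the toolsRef value are made up for illustration. It only handles the XML attribute forms (not the HTML5 its-* ones), it assumes a toolsRef on the annotated element itself is good enough, and it only tests that toolsRef is present, not which data category it is specified for.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TOOLS_REF = f"{{{ITS_NS}}}toolsRef"
# Confidence attributes discussed in this thread (XML forms only).
SCORE_ATTRS = (
    f"{{{ITS_NS}}}termConfidence",
    f"{{{ITS_NS}}}disambigConfidence",
)


def elements_missing_tools_ref(xml_text):
    """Return tag names of elements that carry a confidence score
    but have no toolsRef on themselves or on any ancestor."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    missing = []
    for elem in root.iter():
        if not any(attr in elem.attrib for attr in SCORE_ATTRS):
            continue
        node = elem
        # Walk upwards until a toolsRef is found or the root is passed.
        while node is not None and TOOLS_REF not in node.attrib:
            node = parent_of.get(node)
        if node is None:
            missing.append(elem.tag)
    return missing


# Hypothetical sample documents; the toolsRef value is a placeholder,
# its real syntax is defined in Section 5.8 of the draft.
WITHOUT_TOOLS_REF = (
    '<doc xmlns:its="http://www.w3.org/2005/11/its">'
    '<p>You need a new <quote its:term="yes" '
    'its:termConfidence="0.5">motherboard</quote></p></doc>'
)
WITH_TOOLS_REF = WITHOUT_TOOLS_REF.replace(
    "<doc ", '<doc its:toolsRef="http://example.com/term-tool" ', 1
)

print(elements_missing_tools_ref(WITHOUT_TOOLS_REF))
print(elements_missing_tools_ref(WITH_TOOLS_REF))
```

The first call reports the quote element, the second reports nothing, since the root element now carries toolsRef.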

On 21.11.12 10:55, Tadej Stajner wrote:
> Hi, Marcis, all,
> I can confirm that for the disambiguation example - they pre-date the 
> inclusion of the toolsRef mechanism. The rule itself is fine - a score 
> needs to be in the scope of a tool.
> -- Tadej
>
> On 11/21/2012 8:21 AM, Mārcis Pinnis wrote:
>>
>> Hi Felix,
>>
>> I assume (following the definition) that the examples (40 and 54) are 
>> not correct. It probably has happened because the term confidence and 
>> also the disambiguation examples have been created before agreeing on 
>> the toolsRef attribute.
>>
>> Best regards,
>>
>> Mārcis ;o)
>>
>> *From:*Felix Sasaki [mailto:fsasaki@w3.org]
>> *Sent:* Wednesday, November 21, 2012 8:03 AM
>> *To:* public-multilingualweb-lt@w3.org; dave lewis; 
>> tadej.stajner@ijs.si; Mārcis Pinnis
>> *Subject:* Question on toolsRef for disambigConfidence and 
>> termConfidence (Re: Atb.: [action 265] data category specific 
>> confidence scores)
>>
>> Hi all again,
>>
>> looking into the "confidence score" attributes again, I saw these 
>> three paragraphs:
>>
>> " Any node selected by the MT Confidence 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence> 
>> data category MUST 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#rfc2119> 
>> be contained in an element with the |toolsRef| (or in HTML5, 
>> |its-tools-ref|) attribute specified for the MT Confidence 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence> 
>> data category. For more information, see Section 5.8: ITS Tools 
>> Annotation 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation>."
>>
>> " Any node selected by the terminology data category with the 
>> |termConfidence| attribute specified MUST 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#rfc2119> 
>> be contained in an element with the |toolsRef| (or in HTML5 
>> |its-tools-ref|) attribute specified for the Terminology 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#terminology> 
>> data category. See Section 5.8: ITS Tools Annotation 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation> 
>> for more information."
>>
>> " Any node selected by the disambiguation 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation> 
>> data category with the |disambigConfidence| attribute specified MUST 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#rfc2119> 
>> be contained in an element with the |toolsRef| (or in HTML5 
>> |its-tools-ref|) attribute specified for the disambiguation 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation> 
>> data category. For more information, see Section 5.8: ITS Tools 
>> Annotation 
>> <http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation>."
>>
>> However, the examples for  termConfidence (ex. 40) and 
>> disambigConfidence (ex. 54) show no toolsRef (or its-tools-ref) 
>> attribute. Are the examples wrong or is toolsRef / its-tools-ref 
>> optional for termConfidence and disambigConfidence?
>>
>> Thanks,
>>
>> Felix
>>
>> On 20.11.12 22:07, Felix Sasaki wrote:
>>
>>     Hi Dave, Marcis, Tadej, all,
>>
>>     On 14.11.12 19:27, Tadej Stajner wrote:
>>
>>     Hi, Dave, Marcis,
>>     (see below)
>>
>>     On 11/14/2012 5:47 PM, Dave Lewis wrote:
>>
>>     thanks for the feedback, comment inline.
>>
>>     On 13/11/2012 19:56, Mārcis Pinnis wrote:
>>
>>     Hi Dave,
>>
>>     1) I support your suggestion as drafted in the attachment.
>>     2) Although I believe there is a typing mistake:
>>
>>     <p>And he said: you need a new <quote its:term="yes"
>>     its-info-term-ref="http://www.directron.com/motherboards1.html"
>>     its-term-confidence="0.5">motherboard</quote></p>
>>
>>     I believe its-info-term-ref should actually be its-term-info-ref?!
>>
>>
>>     thanks for spotting that, we'll fix it.
>>
>>
>>     This should be fixed now at
>>
>>     http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-terms-selector-4
>>
>>
>>
>>
>>
>>
>>         3) Also, just a comment (our systems won't be affected,
>>         but...) - why do you want to restrict the values to be from 0
>>         to 1? In statistics it is quite common to use also LOG-scale
>>         probabilities (because of otherwise small numbers in some
>>         cases). Is it necessary to restrict users to a 0 to 1
>>         interval? I would suggest leaving the decision up to the
>>         users. Also - the tools will have to be identified anyway.
>>         This means that the users will be able to identify (if
>>         needed) from the systems how to parse (understand) the
>>         confidence scores. This is a general question that applies to
>>         other confidence scores as well.
>>
>>
>>         In general, we do not attach inter-tool significance to the
>>         confidence scores, hence the requirement to specify the tool
>>         using its-tools-ref. Normalising the score 0-1 is therefore
>>         not intended to support inter-tool comparisons, but rather to
>>         give the presenting software a stable range/value to display.
>>
>>
>>     On that note, I'd suggest explicitly adding a sentence that the
>>     scores are comparable only in the context of the same tool. It
>>     might be obvious to us, but it's an important point.
>>
>>
>>
>>     I tried to add that to the 1st paragraph at
>>     http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation
>>
>>     for terminology, mt confidence and disambiguation.
>>
>>     Best,
>>
>>     Felix
>>
>>
>>
>>     -- Tadej
>>
>>
>>
>>
>>     For the MT confidence score the concerned implementers suggested
>>     0..1; the use of log-scale didn't come up. So for no deeper
>>     reason than consistency I'd suggest we keep the same for the
>>     term and disambig confidence scores, unless there is a pressing
>>     reason to do otherwise.
>>
>>     cheers,
>>     Dave
>>
>>
>>     4) I agree that in the current proposal it would not be
>>     reasonable to add a confidence score, as in a multiple-domain
>>     scenario it would be misleading/wrong and it would require a
>>     different solution (for instance, similar to how domains can be
>>     marked).
>>
>>
>>
>>     Best regards,
>>     Mārcis ;o)
>>
>>     ________________________________________
>>     From: Dave Lewis [dave.lewis@cs.tcd.ie <mailto:dave.lewis@cs.tcd.ie>]
>>     Sent: Tuesday, 13 November 2012 19:30
>>     To: Multilingual Web LT Public List
>>     Subject: Fwd: [action 265] data category specific confidence scores
>>
>>     Hi all,
>>     To try and wrap up this point:
>>
>>     Summary of Discussion so far:
>>     1) text analytics annotation was proposed as a way of offering a
>>     confidence score for text analytics results. As with the
>>     mtconfidence score, the tools annotation is now covered by the
>>     itsTool feature, but the proposal for confidence scores remains.
>>
>>     2) Marcis pointed out, using real world terminology use cases,
>>     that we may have several annotations operating on the same
>>     fragment, so applying a confidence score to different text
>>     analytics annotations with a single data category won't work in
>>     these cases because of complete override.
>>
>>     Also, if we used text analytics annotation with annotation from
>>     other data categories we would be breaking our 'no dependencies
>>     between data categories' rule.
>>
>>     3) We could overcome the complete override problems using
>>     standoff mark up as in loc quality issue and provenance. But as
>>     confidence score would be different for each annotated fragment,
>>     that would result in very big stand-off records, and we would
>>     still be breaking the data cat dependencies rule. So this doesn't
>>     seem a realistic option
>>
>>     4) so the suggestion discussed in Lyon was to drop  text
>>     analytics annotation altogether as a separate data category and
>>     focus on adding confidence attributes to the existing data
>>     categories that would benefit from it.
>>
>>     so.....
>>
>>     Proposal:
>>     I therefore suggest the following and we need your feedback by
>>     Friday 16th Nov so we can wrap this up on the Monday call!
>>
>>     For those extended with confidence score (terminology,
>>     disambiguation) please express your support and any comments by
>>     Friday - if we don't receive any we will definitely drop these
>>     suggestions. Marcis, Tadej in particular, please consider
>>     reviewing these.
>>
>>     For exclusions (domain, localizationQualityissue), this is your
>>     last chance to counter-argue in favour of including, otherwise
>>     assume these are dropped also.
>>
>>     i) confidence for terminology: as suggested by Marcis
>>     (http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0028.html),
>>     revised data category as word revisions attached (addition to
>>     local definition, note on its-tools and example  38)
>>
>>     ii) confidence for disambiguation: revised data category as word
>>     revisions attached (addition to local definition, note on
>>     its-tools and ex 52)
>>
>>     iii) domain: I suggest excluding this as an annotation to which
>>     we attach a confidence score. It's not clear that the use of text
>>     analytics to identify domain, while feasible, actually represents
>>     a real use case for interoperability mark-up. If used, it would
>>     probably be internalized by the MT engine. Also, since there are
>>     multiple domain values the semantics of a single confidence score
>>     is unclear.
>>
>>     iv) localizationQualityIssue: I suggest also excluding this as an
>>     annotation to which we attach confidence scores. The use of
>>     statistical text analytics doesn't seem common for QA tasks. One
>>     exception is the recent innovation by Digital Linguistics, whose
>>     Review Sentinel product ranks translations by a TA assessment for
>>     QA purposes - but this is innovative and not current practice, so
>>     it's probably not yet a concrete use case.
>>
>>     cheers,
>>     Dave
>>
>>
>>
>>
>>
>>
>
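
On point 3 in the thread above (log-scale probabilities versus the 0 to 1 range): a tool that works with log probabilities internally can still emit a value in the 0 to 1 range by exponentiating before it writes the attribute. The Python sketch below is only an illustration of that idea. It assumes natural-log probabilities, and the function name is made up; the thread itself does not prescribe any conversion. As noted above, the resulting scores remain comparable only within the same tool.

```python
import math


def log_prob_to_confidence(log_prob):
    """Map a natural-log probability (<= 0) to a confidence in [0, 1].

    Assumption for this sketch: the tool's internal score is ln(p).
    A tool using log base 10 or 2 would need a different conversion.
    """
    if log_prob > 0:
        raise ValueError("a log probability cannot be greater than 0")
    return math.exp(log_prob)


# ln(0.5) is roughly -0.693; converting it back gives roughly 0.5.
print(log_prob_to_confidence(math.log(0.5)))
# A log probability of 0 corresponds to full confidence.
print(log_prob_to_confidence(0.0))
```

Very small probabilities lose readability in the 0 to 1 form, which is the trade-off raised in the thread; the sketch does not try to address that.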
Received on Wednesday, 21 November 2012 10:05:09 UTC