RE: Question on toolsRef for disambigConfidence and termConfidence (Re: Atb.: [action 265] data category specific confidence scores)

Hi Felix,

I assume (following the definition) that the examples (40 and 54) are not correct. This probably happened because the term confidence and disambiguation examples were created before we agreed on the toolsRef attribute.

Best regards,
Mārcis ;o)

From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, November 21, 2012 8:03 AM
To: public-multilingualweb-lt@w3.org; dave lewis; tadej.stajner@ijs.si; Mārcis Pinnis
Subject: Question on toolsRef for disambigConfidence and termConfidence (Re: Atb.: [action 265] data category specific confidence scores)

Hi all again,

looking into the "confidence score" attributes again, I saw these three paragraphs:

" Any node selected by the MT Confidence<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence> data category MUST<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#rfc2119> be contained in an element with the toolsRef (or in HTML5, its-tools-ref) attribute specified for the MT Confidence<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#mtconfidence> data category. For more information, see Section 5.8: ITS Tools Annotation<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation>."

" Any node selected by the terminology data category with the termConfidence attribute specified MUST<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#rfc2119> be contained in an element with the toolsRef (or in HTML5 its-tools-ref) attribute specified for the Terminology<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#terminology> data category. See Section 5.8: ITS Tools Annotation<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation> for more information."

" Any node selected by the disambiguation<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation> data category with the disambigConfidence attribute specified MUST<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#rfc2119> be contained in an element with the toolsRef (or in HTML5 its-tools-ref) attribute specified for the disambiguation<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation> data category. For more information, see Section 5.8: ITS Tools Annotation<http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation>."

However, the examples for termConfidence (ex. 40) and disambigConfidence (ex. 54) show no toolsRef (or its-tools-ref) attribute. Are the examples wrong, or is toolsRef / its-tools-ref optional for termConfidence and disambigConfidence?
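The constraint in the three quoted paragraphs can be sketched as a small Python check. This is only an illustration under stated assumptions: it looks at the HTML5 attribute names, assumes disambigConfidence is serialized as its-disambig-confidence, treats an its-tools-ref on the element itself as sufficient, and tests only for the presence of its-tools-ref, not whether its value names the right data category. The function name and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed HTML5 attribute names; only its-term-confidence and its-tools-ref
# appear in this thread, its-disambig-confidence is an assumption.
CONFIDENCE_ATTRS = ("its-term-confidence", "its-disambig-confidence")
TOOLS_ATTR = "its-tools-ref"


def missing_tools_ref(markup):
    """Return the tags of elements that carry a confidence attribute but have
    no its-tools-ref on themselves or on any ancestor."""
    root = ET.fromstring(markup)  # expects well-formed markup
    problems = []

    def walk(elem, covered):
        # An element is covered once it or any ancestor has its-tools-ref.
        covered = covered or TOOLS_ATTR in elem.attrib
        if not covered and any(a in elem.attrib for a in CONFIDENCE_ATTRS):
            problems.append(elem.tag)
        for child in elem:
            walk(child, covered)

    walk(root, False)
    return problems
```

Run against markup shaped like examples 40 and 54, a check of this kind would flag the annotated element as long as no enclosing element carries its-tools-ref.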

Thanks,

Felix

On 20.11.12 22:07, Felix Sasaki wrote:
Hi Dave, Marcis, Tadej, all,

On 14.11.12 19:27, Tadej Stajner wrote:

Hi, Dave, Marcis,
(see below)

On 11/14/2012 5:47 PM, Dave Lewis wrote:

thanks for the feedback, comment inline.

On 13/11/2012 19:56, Mārcis Pinnis wrote:

Hi Dave,

1) I support your suggestion as drafted in the attachment.
2) Although I believe there is a typing mistake:

<p>And he said: you need a new <quote its:term="yes" its-info-term-ref="http://www.directron.com/motherboards1.html" its-term-confidence="0.5">motherboard</quote></p>

I believe its-info-term-ref should actually be its-term-info-ref?!

thanks for spotting that, we'll fix it.

This should be fixed now at

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-terms-selector-4





3) Also, just a comment (our systems won't be affected, but...) - why do you want to restrict the values to be from 0 to 1? In statistics it is quite common to also use log-scale probabilities (because the numbers are otherwise very small in some cases). Is it necessary to restrict users to a 0 to 1 interval? I would suggest leaving the decision up to the users. Also - the tools will have to be identified anyway. This means that the users will be able to find out (if needed) from the systems how to parse (understand) the confidence scores. This is a general question that applies to other confidence scores as well.

In general, we do not attach inter-tool significance to the confidence scores, hence the requirement to specify the tool using its-tools-ref. Normalising the score to 0-1 is therefore not intended to support inter-tool comparisons, but rather to give the presenting software a stable range/value to display.
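As an illustration of how a tool that works with log-scale probabilities internally could still emit a 0-1 value, here is a minimal Python sketch. It assumes the tool's score is a natural-log probability (so it is never positive); the function name is hypothetical, and a tool using another base or an unnormalised score would need a different mapping.

```python
import math


def to_unit_interval(log_prob):
    """Map a natural-log probability (<= 0) to a score in [0, 1]."""
    if log_prob > 0:
        raise ValueError("a log probability cannot be positive")
    # exp() inverts the natural log, giving back the plain probability.
    return math.exp(log_prob)
```

The resulting value is still only meaningful relative to the tool that produced it.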

On that note, I'd suggest explicitly adding a sentence that the scores are comparable only in the context of the same tool. It might be obvious to us, but it's an important point.


I tried to add that to the 1st paragraph at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#its-tool-annotation
for terminology, mt confidence and disambiguation.

Best,

Felix



-- Tadej




For the MT confidence score the concerned implementers suggested 0..1; the use of log-scale didn't come up. So for no deeper reason than consistency I'd suggest we keep the same for term and disambig confidence scores, unless there is a pressing reason to do otherwise.

cheers,
Dave


4) I agree that in the current proposal it would not be reasonable to add a confidence score to domain, as in a multiple-domain scenario it would be misleading/wrong and it would require a different solution (for instance, similar to how domains can be marked).


Best regards,
Mārcis ;o)

________________________________________
From: Dave Lewis [dave.lewis@cs.tcd.ie]
Sent: Tuesday, November 13, 2012 19:30
To: Multilingual Web LT Public List
Subject: Fwd: [action 265] data category specific confidence scores

Hi all,
To try and wrap up this point:

Summary of discussion so far:
1) Text analytics annotation was proposed as a way of offering a confidence score for text analytics results. As with the mtconfidence score, the tools annotation is now covered by the itsTool feature, but the proposal for confidence scores remains.

2) Marcis pointed out, using real world terminology use cases, that we may have several annotations operating on the same fragment, so applying a confidence score to different text analytics annotations with a single data category won't work in these cases because of complete override.

Also, if we used text analytics annotation with annotation from other data categories we would be breaking our 'no dependencies between data categories' rule.

3) We could overcome the complete override problems using standoff mark-up as in loc quality issue and provenance. But as the confidence score would be different for each annotated fragment, that would result in very big stand-off records, and we would still be breaking the data category dependencies rule. So this doesn't seem a realistic option.

4) So the suggestion discussed in Lyon was to drop text analytics annotation altogether as a separate data category and focus on adding confidence attributes to the existing data categories that would benefit from it.

so.....

Proposal:
I therefore suggest the following, and we need your feedback by Friday 16th Nov so we can wrap this up on the Monday call!

For those extended with a confidence score (terminology, disambiguation), please express your support and any comments by Friday - if we don't receive any we will definitely drop these suggestions. Marcis, Tadej in particular, please consider reviewing these.

For exclusions (domain, localizationQualityissue), this is your last chance to counter-argue in favour of including, otherwise assume these are dropped also.

i) confidence for terminology: as suggested by Marcis (http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0028.html), revised data category as Word revisions attached (addition to local definition, note on its-tools and example 38)

ii) confidence for disambiguation: revised data category as Word revisions attached (addition to local definition, note on its-tools and ex 52)

iii) domain: I suggest excluding this as an annotation to which we attach a confidence score. It's not clear that the use of text analytics to identify domain, while feasible, actually represents a real use case for interoperability mark-up. If used, it would probably be internalized by the MT engine. Also, since there are multiple domain values, the semantics of a single confidence score are unclear.

iv) localizationQualityIssue: I suggest also excluding this as an annotation to which we attach confidence scores. The use of statistical text analytics doesn't seem common for QA tasks. One exception is the recent innovation by Digital Linguistics, whose Review Sentinel product ranks translations by a TA assessment for QA purposes, but this is innovative and not current practice, so it's probably not yet a concrete use case.

cheers,
Dave

Received on Wednesday, 21 November 2012 07:21:52 UTC