Re: [EMOXG] XML suggestion for Meta 1: Confidence from Bill Jarrold on 2008-08-06 (public-xg-emotion@w3.org from August 2008)

From: Bill Jarrold <jarrold@AI.SRI.COM>
Date: Wed, 6 Aug 2008 08:23:24 -0700
To: "Breen, Andrew" <Andrew.Breen@nuance.com>
Cc: "Marc Schroeder" <schroed@dfki.de>, "EMOXG-public" <public-xg-emotion@w3.org>
Message-Id: <B8396E73-6B22-448F-962F-B769A3A518D8@ai.sri.com>
On Aug 6, 2008, at 1:58 AM, Breen, Andrew wrote:

>
> Hi,
> I think it is asking a lot of a developer to have to define so many
> confidence levels; I think we need to make these optional.

Yes, I strongly agree that assertions of confidence should be optional.

> With the assumption that if no confidence level is indicated it is  
> assumed to be fully confident.

I think that no assertion of confidence should mean that the system
is agnostic about confidence (i.e. has no knowledge of it).  We are
putting words in the rater's mouth if we interpret their non-assertion
of confidence as implying "fully confident."
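
As a rough sketch of that reading (Python, standard library only): a
missing confidence attribute is returned as None, i.e. "unknown", and is
never silently replaced by 1.0.  The helper name read_confidence is
hypothetical, and the sketch assumes numeric confidences only; verbal
values such as "very much" would need separate handling.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_confidence(element: ET.Element) -> Optional[float]:
    """Return the asserted numeric confidence, or None if none was asserted."""
    raw = element.get("confidence")
    if raw is None:
        return None  # agnostic: do NOT substitute 1.0 here
    return float(raw)


with_conf = ET.fromstring(
    '<category set="everyday" name="boredom" confidence="0.1"/>'
)
without_conf = ET.fromstring('<category set="everyday" name="boredom"/>')

print(read_confidence(with_conf))     # 0.1
print(read_confidence(without_conf))  # None
```

Downstream code can then decide for itself how to treat None (skip the
item, use a prior, etc.) instead of inheriting a default it never chose.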

>
> I'm also unsure about the practical use of applying a confidence to  
> a dimension if the emotion is marked as low confidence.  Is it  
> likely that if an emotion is given low confidence that a high  
> confidence on a dimension will have any meaning?

I think a confidence measure can be very useful.  Suppose, for
example, you are trying to train a machine learning algorithm to
recognize a given emotion state.  You would want to give greater
weight to training examples with higher confidence than to those with
lower confidence.
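
A minimal sketch of what such weighting could look like, assuming each
training example carries one numeric feature, a label, and a confidence
in [0, 1].  The data and the function name weighted_centroids are made
up for illustration; a real learner would more likely pass the
confidences as per-sample weights to its own training routine.

```python
from collections import defaultdict

# (feature value, emotion label, rater confidence): illustration data only
examples = [
    (0.9, "surprise", 0.9),
    (0.1, "surprise", 0.1),
    (0.2, "boredom", 0.8),
]


def weighted_centroids(rows):
    """Per-label mean of the feature, each example weighted by its confidence."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for feature, label, confidence in rows:
        totals[label] += confidence * feature
        weights[label] += confidence
    return {
        label: totals[label] / weights[label]
        for label in totals
        if weights[label] > 0
    }


centroids = weighted_centroids(examples)
print(centroids)
```

Here the "surprise" centroid lands near 0.82 rather than the unweighted
0.5, because the low-confidence example contributes very little.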

>
> I'm against using terms such as "very much", "very negative" in  
> confidence measures, as this leaves far too much room for personal  
> interpretation. Using numbers provides some security against this  
> e.g. there can be much less personal interpretation of the meaning  
> of a value 0.5 in a defined range of 0 to 1.

I'm not sure why you think that a number like 0.5 has less room for
personal interpretation than, e.g., "moderately confident."  But I do
think that numeric ratings have the added plus that they allow for
arbitrarily fine gradations in confidence measures.  Plus,
intuitively, numbers do seem more scientific than words, which are
fuzzy and ambiguous.  However, there are problems with numeric
confidence measures.

In the early days of Cyc (see www.cyc.com) they were thinking they  
wanted to associate a numeric confidence measure with each assertion  
in the knowledge base.  The number would be in the range of 0 to 1.   
They found that this had the undesirable side effect of introducing  
unwanted orderings in assertion certainty.  By way of illustration:

Suppose on Day 1 you make assertion 323 and give it a confidence of  
0.75.
Many days pass and you make a whole bunch more assertions to your  
knowledge base.
On Day 300 your colleague makes assertion 15678 and gives it a  
confidence of 0.74.
Then 5 minutes later your colleague makes another assertion, 15679,
and thinks it is somewhat greater in certainty than the previous
assertion.  So your colleague decides to give it a confidence of 0.77.

The problem is that your colleague has inadvertently asserted that
assertion 15679 has a greater certainty than assertion 323, even
though nobody ever compared those two!
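
The scenario above can be replayed in a few lines of Python.  This is
only an illustration of the side effect, using the three assertion
numbers and confidences from the example; the variable names are
hypothetical.

```python
# Numeric confidences exactly as assigned in the scenario.
confidence = {323: 0.75, 15678: 0.74, 15679: 0.77}

# The only comparison anyone actually intended:
# assertion 15679 is more certain than assertion 15678.
intended = {(15679, 15678)}

# Every "a is more certain than b" pair that the numbers imply.
implied = {
    (a, b)
    for a in confidence
    for b in confidence
    if confidence[a] > confidence[b]
}

unintended = implied - intended
print(sorted(unintended))  # [(323, 15678), (15679, 323)]
```

Two orderings fall out of the numbers that nobody meant to assert,
including the one between 15679 and 323.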

(Lenat and Guha, "Building Large Knowledge-Based Systems," probably
does a better job of explaining this.)

A full solution to this problem is to not assert confidence values
but rather to assert relative orderings of certainty between ratings.
The result is that you'd have a large partially ordered graph of
ratings, ordered according to how certain they were.
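
A minimal sketch of such a partially ordered graph, assuming orderings
are stored as directed edges and queried by reachability.  The class and
method names are hypothetical, and this is not how Cyc itself implements
it; the sketch also does not check for contradictory (cyclic) orderings.

```python
from collections import defaultdict


class CertaintyOrder:
    """Stores 'a is more certain than b' assertions as a directed graph."""

    def __init__(self):
        # a -> set of items directly asserted to be less certain than a
        self._less_certain = defaultdict(set)

    def assert_more_certain(self, a, b):
        self._less_certain[a].add(b)

    def is_more_certain(self, a, b):
        """True if 'a > b' follows (transitively) from the asserted orderings."""
        stack, seen = [a], set()
        while stack:
            node = stack.pop()
            for nxt in self._less_certain.get(node, ()):
                if nxt == b:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    def comparable(self, a, b):
        return self.is_more_certain(a, b) or self.is_more_certain(b, a)


order = CertaintyOrder()
order.assert_more_certain(15679, 15678)  # the only comparison actually made

print(order.is_more_certain(15679, 15678))  # True
print(order.comparable(15679, 323))         # False
```

Assertions 15679 and 323 simply stay incomparable until someone
explicitly relates them, which is the point of using a partial order.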

Syntactically asserting this would require making assertions about  
assertions.  Are we allowed to make assertions about assertions in  
XML?  Even if not, there may be ways around this problem which I can  
explain later.

Another (partial) solution is to have coarse buckets of certainty
factors -- e.g. very little, moderately little, moderate, much, very
much, as in a 5-item Likert rating.  I am sure that lots of work has
been done on dealing with these kinds of scales statistically, as
well as on dealing with their unwanted side effects (e.g. one side
effect is that there is still the potential for the unwanted ordering
problem, but it seems less severe).
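
To make the bucket idea concrete, here is a sketch that maps a numeric
confidence in [0, 1] onto the five labels above.  The equal-width bins
and the function name to_bucket are assumptions made purely for
illustration; nothing in this thread fixes where the boundaries lie.

```python
BUCKETS = ["very little", "moderately little", "moderate", "much", "very much"]


def to_bucket(confidence: float) -> str:
    """Map a confidence in [0, 1] to one of five equal-width buckets."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    # 1.0 would index past the end, so clamp it into the last bucket.
    index = min(int(confidence * len(BUCKETS)), len(BUCKETS) - 1)
    return BUCKETS[index]


for value in (0.74, 0.75, 0.77):
    print(value, to_bucket(value))  # all three print "much"

print(to_bucket(0.79), "/", to_bucket(0.81))  # much / very much
```

The three confidences from the Cyc example collapse into the same
bucket, so no accidental ordering arises among them, while values on
either side of a boundary (0.79 and 0.81) still get ordered.  That is
the residual, less severe form of the problem.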

One more point: human raters, being human, have different
characterological levels of confidence (as well as differing levels of
expertise, which can also affect confidence).  Thus, I might suggest
that we allow for the option of associating the rater with each
confidence assertion they give.
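
One thing that knowing the rater makes possible is a per-rater
correction.  The sketch below standardises each rater's confidences
against that rater's own mean and spread (a z-score).  The rater
identifiers, item names, and numbers are made up, and z-scoring is just
one assumed choice among several possible normalisations.

```python
from collections import defaultdict
from statistics import mean, pstdev

# (rater id, annotated item, confidence): illustration data only
ratings = [
    ("rater_a", "clip1", 0.9),
    ("rater_a", "clip2", 0.7),
    ("rater_b", "clip1", 0.4),
    ("rater_b", "clip2", 0.2),
]


def standardise_per_rater(rows):
    """Express each confidence relative to its rater's own mean and spread."""
    by_rater = defaultdict(list)
    for rater, _item, conf in rows:
        by_rater[rater].append(conf)
    stats = {r: (mean(v), pstdev(v)) for r, v in by_rater.items()}

    result = []
    for rater, item, conf in rows:
        mu, sigma = stats[rater]
        z = (conf - mu) / sigma if sigma > 0 else 0.0
        result.append((rater, item, z))
    return result


for row in standardise_per_rater(ratings):
    print(row)
```

Although rater_a is habitually much more confident than rater_b, after
standardising both come out as relatively surer about clip1 than about
clip2.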

Thanks,

Bill

>
> Cheers,
> Andy B.
>
> -----Original Message-----
> From: public-xg-emotion-request@w3.org
> [mailto:public-xg-emotion-request@w3.org] On Behalf Of Marc Schroeder
> Sent: 05 August 2008 16:14
> To: EMOXG-public
> Subject: [EMOXG] XML suggestion for Meta 1: Confidence
>
>
> Hi,
>
> in fulfillment of ACTION-28
> (http://www.w3.org/2005/Incubator/emotion/group/tracker/actions/28), I
> make a suggestion for
>
> Meta 1: Confidence / probability
> ================================
>
> http://www.w3.org/2005/Incubator/emotion/XGR-requirements/#Confidence
> states:
> "The emotion markup must provide a representation of the degree of
> confidence or probability that a certain element of the representation
> is correct. It must be possible to indicate the confidence for each
> element of the representation separately: e.g., the confidence that the
> category is indeed X is independent from the confidence that its
> intensity or its timing is correctly indicated."
>
> I am going to use "confidence" throughout; somehow I like it better than
> "probability", but I have no strong feelings about it either.
>
> I see confidence as another example of a (unipolar) scale value. It can
> be realised as a simple attribute that can be applied to most elements
> in the annotation, at least to:
>
>      * Core 2. Emotion categories
> 	-- confidence that the given category is correct
>      * Core 3. Emotion dimensions
> 	-- confidence that an individual dimension is correctly set, and/or
> that all dimensions together are correctly stated
>      * Core 4. Appraisals related to the emotion
> 	-- same as for Core 3.
>      * Core 5. Action tendencies
> 	-- same as for Core 3.
>      * Core 7. Emotion intensity
> 	-- confidence that the emotion has the intensity as stated.
>
>
> I am not sure about the following:
>      * Core 1. Type of emotion-related phenomenon
> 	-- confidence that the phenomenon has the type indicated?
>      * Core 6. Multiple and/or complex emotions
> 	-- confidence that there are multiple emotions?
>      * Core 8. Emotion timing
> 	-- confidence that the timing is as stated?
>      * Meta 2. Modality
> 	-- confidence that the emotion is expressed through the given  
> modality?
>
> I believe the following information should not have a confidence
> associated:
>      * Links 1. Links to media
>      * Links 2. Position on a time line in externally linked objects
>      * Links 3. The semantics of links to the "rest of the world"
>      * Global 0. A generic mechanism to represent global metadata
>
> It doesn't seem to make sense to say this is "probably" the media file
> that I am annotating, etc.
>
>
> Examples
> ========
>
>      * Core 2. Emotion categories
>
> <category set="myset" name="surprise" confidence="very much"/>
>
> A simple example indicating, using a verbal scale value, that the
> confidence is very high that surprise is the emotion to annotate.
>
>      * Core 3. Emotion dimensions
>
>       <dimensions set="Arousal-and-Valence">
>           <arousal value="very much" confidence="0.9"/>
>           <valence value="slightly positive" confidence="0.3"/>
>       </dimensions>
>
> An example using continuous scale values for confidence to indicate that
> the annotation of high arousal is probably correct, but the annotation
> of slightly positive valence may or may not be correct. Note that the
> choice of verbal vs. numeric scales between the emotion dimension and
> the confidence is totally independent, i.e. it is fully possible to use
> verbally specified emotion dimensions with numerically specified
> confidences (as in this example) or any other combination of verbal and
> numeric scales.
>
>      * Core 4. Appraisals related to the emotion
>
>       <appraisals set="Scherer">
>           <novelty value="as much as possible" confidence="medium"/>
>           <intrinsic-pleasantness value="very negative"
>               confidence="as much as possible"/>
>           <goal-conduciveness value="not at all" confidence="much"/>
>       </appraisals>
>
> An example of appraisals using verbal scales for both the appraisal
> dimensions themselves and for the confidence. Note that the confidence
> is always unipolar, but that some of the appraisal dimensions are
> bipolar.
>
>
>      * Core 5. Action tendencies
>
>       <action-tendencies set="Frijda" confidence="0.8">
>           <approach activation="0.9"/>
>           <avoidance activation="0.0"/>
>           <being-with activation="0.9"/>
>       </action-tendencies>
>
> The example shows confidence as an attribute of the entire group of
> action tendencies; the confidence indicated (rather high) therefore
> applies to all action tendencies contained.
>
>      * Core 7. Emotion intensity
>
>       <intensity value="0.1" confidence="0.8"/>
>
> A high confidence that the emotion has a low intensity.
>
>
> Combinations of the above
> -------------------------
>
> Obviously an emotional annotation can be a combination of some or all of
> the above, as in the following examples.
>
> <emotion>
>      <intensity value="0.1" confidence="0.8"/>
>      <category set="everyday" name="boredom" confidence="0.1"/>
> </emotion>
>
> The intensity of the emotion is quite probably low, but if we have to
> guess, we would say it's boredom.
>
> <emotion>
>      <intensity value="0.1" confidence="0.8"/>
>      <category set="everyday" name="boredom" confidence="0.1"/>
>      <dimensions set="Arousal-and-Valence">
>          <arousal value="0.4" confidence="0.8"/>
>          <valence value="-0.1" confidence="0.5"/>
>      </dimensions>
> </emotion>
>
> In addition, we state that the arousal is slightly below medium (which
> would be 0.5), and we are half-way confident that valence is slightly
> negative.
>
> <emotion>
>      <intensity value="0.1" confidence="0.8"/>
>      <category set="everyday" name="boredom" confidence="0.1"/>
>      <dimensions set="Arousal-and-Valence">
>          <arousal value="0.4" confidence="0.8"/>
>          <valence value="-0.1" confidence="0.5"/>
>      </dimensions>
>      <appraisals set="Scherer">
>          <novelty value="0.0" confidence="1.0"/>
>      </appraisals>
> </emotion>
>
> In addition, we are absolutely certain that no new thing has currently
> happened.
>
> <emotion>
>      <intensity value="0.1" confidence="0.8"/>
>      <category set="everyday" name="boredom" confidence="0.1"/>
>      <dimensions set="Arousal-and-Valence">
>          <arousal value="0.4" confidence="0.8"/>
>          <valence value="-0.1" confidence="0.5"/>
>      </dimensions>
>      <appraisals set="Scherer">
>          <novelty value="0.0" confidence="1.0"/>
>      </appraisals>
>      <action-tendencies set="Frijda">
>          <approach activation="0.1" confidence="0.6"/>
>          <avoidance activation="0.1" confidence="0.4"/>
>          <being-with activation="0.1" confidence="0.5"/>
>      </action-tendencies>
> </emotion>
>
> In addition, we think but are not sure that there is only a very low
> tendency to act in any of the available ways.
>
> Looking forward to comments, best regards,
> Marc
>
>
> Marc Schroeder schrieb:
>>
>> (I am sending this directly to the public list so that people have a
>> chance to see this; it has not yet been discussed in the small group.
>> The idea is to get initial ideas about all requirements up on the  
>> table
>> quickly, and then to go through them over the next weeks and  
>> months, by
>> emails in the small group and in phone meetings)
>>
>>
>> This is a discussion and suggestion for possible realisations of the
>> EmotionML requirements [1] Core 3, Core 4, Core 5, and Core 7, which
>> have in common that they rely on scale values.
>>
>> This is in response to the action item [2] agreed during the last phone
>> conference.
>>
>> As agreed, the syntax is inspired by the provisional consensus example
>> for Core 2 (Emotion Category):
>>
>> <emotion>
>>     <category set="everyday" name="pleasure" confidence="0.9"/>
>> </emotion>
>>
>>
>> Generic proposal regarding scale values
>> ---------------------------------------
>>
>> The issue of how to describe scale values was already discussed to some
>> extent in an email thread initiated by Bill [3]. Attempting a summary of
>> the discussion, it would appear that:
>>
>> * scales are either unipolar (from "not" to "a lot") or bipolar (from
>> "very negative" via "neutral" to "very positive");
>> * some use cases (reasoning, generation) usually describe the position
>> on a scale using continuous values;
>> * other use cases (manual labelling) usually use discrete, ordinal
>> values to describe the position on a scale;
>> * there are psychological reasons why it is not valid to map ordinal
>> values onto a numerical scale;
>> * however, interoperability considerations will sometimes *require* a
>> mapping between ordinal and numerical scales;
>> * for numerical scales, interoperability considerations push towards a
>> pre-defined range such as [0,1] or [-1,1];
>> * exaggerations (e.g., cartoon-like expressions in generation) may push
>> towards values beyond the limits of that range.
>>
>> The following issues were also introduced in the discussion but seem not
>> to find consensus support:
>> - qualifications of scale values relative to a person ("a low amount of
>> anger for a New Yorker")
>> - allowing for units ("3 felicitons") that may possibly be defined in
>> the future;
>> - flexibility of numerical ranges in view of user-specific needs (was
>> contradicted on the basis of interoperability).
>>
>>
>> Based on these constraints it seems reasonable to propose:
>>
>> a) numerical scales with a pre-defined range ([0,1] for unipolar, [-1,1]
>> for bipolar scales) which, however, should sometimes not be strictly
>> enforced;
>>
>> b) a pre-defined set of discrete values with ordinal ordering, e.g. as
>> seven points:
>>
>>   i) for unipolar scales:
>>           not at all
>>           very little
>>           little
>>           medium
>>           much
>>           very much
>>           as much as possible
>>
>>   ii) for bipolar scales:
>>           very negative
>>           negative
>>           slightly negative
>>           neutral
>>           slightly positive
>>           positive
>>           very positive
>>
>> Note that I am not attached to the number nor the names of values; I
>> have chosen them ad hoc -- if someone has a well-founded alternative,
>> please bring it forward.
>>
>> Users would be free to use only some of these values if they need fewer
>> than seven ordinal points. A mapping may be introduced in the future
>> with the currently optional requirement Onto 1 (Mapping...). For the
>> moment, users who need a mapping would have to map from ordinal to
>> numerical values using the method of their choice.
>>
>>
>> Concretely, I suggest realising scales as attribute-value pairs. An
>> attribute should be specific about being either a unipolar or a bipolar
>> scale. Unipolar scales can hold values that are either a floating point
>> number from 0 to 1, or one of the "unipolar" strings listed above, e.g.:
>>
>> <myElement myUnipolarScale="0.234"/>
>> <myElement myUnipolarScale="very little"/>
>>
>> Similarly, a bipolar scale could hold values that are either a floating
>> point number from -1 to 1, or one of the "bipolar" strings listed above,
>> e.g.:
>> <myElement myBipolarScale="-0.1"/>
>> <myElement myBipolarScale="slightly negative"/>
>>
>>
>> Working on this basis, the following proposals for Core 3, 4, 5, and 7
>> become rather simple.
>>
>>
>>
>> Core 3: Emotion dimensions
>> --------------------------
>>
>> citing [1]: "... In emotion psychology, a small number of 2-4 emotion
>> dimensions is considered to cover the most essential aspects of people's
>> emotion concepts and subjective experience. A dimension is a unipolar or
>> bipolar continuous scale.
>> As for emotion categories, it is not possible to predefine a normative
>> set of dimensions. Instead, the language should provide a "default" set
>> of dimensions, that can be used if there are no specific application
>> constraints, but allow the user to "plug in" a custom set of dimensions
>> if needed."
>>
>>
>> A possible syntax similar to the category example could look as follows:
>>
>> <emotion>
>>     <dimensions set="FontaineSchererRoeschEllsworth"
>>                 valence="(bipolar-scale)"
>>                 potency="(unipolar-scale)"
>>                 arousal="(unipolar-scale)"
>>                 unpredictability="(unipolar-scale)" />
>> </emotion>
>>
>> Here, the value of the "set" attribute would determine the names of the
>> attributes that can occur.
>>
>> Examples:
>>
>> <emotion>
>>     <category set="everyday" name="excited"/>
>>     <dimensions set="Arousal-and-Valence"
>>                 arousal="0.9"
>>                 valence="0.2"/>
>> </emotion>
>>
>>
>> Or using verbal scale values:
>>
>> <emotion>
>>     <category set="everyday" name="excited"/>
>>     <dimensions set="Arousal-and-Valence"
>>                 arousal="very much"
>>                 valence="slightly positive"/>
>> </emotion>
>>
>>
>> This approach groups all dimensions into a single element, which means
>> that meta-annotation such as confidence (Meta 1) can only be applied to
>> all dimensions at once, as in:
>>
>> <emotion>
>>     <dimensions set="Arousal-and-Valence"
>>                 arousal="very much"
>>                 valence="slightly positive"
>>                 confidence="0.5"/>
>> </emotion>
>>
>> In other words, with this method we cannot express that we are sure the
>> guy is very aroused but we are unsure about his valence. If
>> meta-information should be annotated on each dimension separately, the
>> following more explicit structure would be more appropriate:
>>
>> <emotion>
>>     <dimensions set="Arousal-and-Valence">
>>         <arousal value="very much" confidence="0.9"/>
>>         <valence value="slightly positive" confidence="0.3"/>
>>     </dimensions>
>> </emotion>
>>
>>
>> Core 4: Appraisals
>> ------------------
>>
>> citing [1]: "... . Appraisal is a core concept in cognitive emotion
>> psychology; cognitive emotion theories describe in detail which
>> appraisals of "things in the world" lead to which emotions.
>> Syntactically, appraisals may be represented as unipolar or bipolar
>> scales."
>>
>>
>> The proposed solution is exactly the same as for Core 3, i.e.:
>>
>> <emotion>
>>     <appraisals set="Scherer"
>>                 novelty="(unipolar-scale)"
>>                 intrinsic-pleasantness="(bipolar-scale)"
>>                 ...
>>                 goal-conduciveness="(unipolar-scale)"/>
>> </emotion>
>>
>> Or else, to allow for individual meta-annotation:
>>
>> <emotion>
>>     <appraisals set="Scherer">
>>         <novelty value="(unipolar-scale)"/>
>>         <intrinsic-pleasantness value="(bipolar-scale)"/>
>>                 ...
>>         <goal-conduciveness value="(unipolar-scale)"/>
>>      </appraisals>
>> </emotion>
>>
>>
>> Core 5: Action tendencies
>> -------------------------
>>
>> citing [1]: "The emotion markup must provide a possibility to
>> characterise emotions in terms of the action tendencies linked to them.
>> For example (Frijda, 1986, p. 88, Table 2.1), desire is linked to a
>> tendency to approach, fear is linked to a tendency to avoid, etc.
>> Activation, as defined by Frijda (1986, pp. 90-94), is the readiness to
>> act according to a specific action tendency. It is a degree, and should
>> be represented by a scale value."
>>
>> Again, the same approach can be proposed:
>>
>> <emotion>
>>     <action-tendencies set="Frijda"
>>         approach="(unipolar scale)"
>>         avoidance="(unipolar scale)"
>>         being-with="(unipolar scale)"
>>         ...
>>         />
>> </emotion>
>>
>> Or with more explicit structure, e.g.:
>>
>> <emotion>
>>     <action-tendencies set="Frijda">
>>         <approach activation="(unipolar scale)"/>
>>         <avoidance activation="(unipolar scale)"/>
>>         <being-with activation="(unipolar scale)"/>
>>         ...
>>     </action-tendencies>
>> </emotion>
>>
>>
>> Core 7: Emotion intensity
>> -------------------------
>>
>> citing [1]: "The emotion markup must provide an emotion attribute to
>> represent the intensity of an emotion. The intensity is a unipolar scale."
>>
>> A typical use of intensity is in combination with a category. However,
>> in some emotion models, the emotion's intensity can also be used in
>> combination with a position in emotion dimension space. Therefore,
>> intensity must be specified independently of category. One possible
>> solution is this:
>>
>> <emotion>
>>     <intensity value="(unipolar scale)"/>
>> </emotion>
>>
>> Making intensity an explicit element makes it possible to add
>> meta-information, which would not be possible if intensity was an
>> attribute, e.g. of the <emotion> tag itself.
>>
>> For example, expressing a high confidence that the intensity is low, but
>> only a vague idea what kind of emotion it may be:
>>
>> <emotion>
>>     <intensity value="0.1" confidence="0.8"/>
>>     <category set="everyday" name="boredom" confidence="0.1"/>
>> </emotion>
>>
>>
>> [1] http://www.w3.org/2005/Incubator/emotion/XGR-requirements/
>> [2] http://www.w3.org/2008/07/03-emotion-minutes.html#action06
>> [3] http://lists.w3.org/Archives/Public/public-xg-emotion/2008May/0005.html
>
> -- 
> Dr. Marc Schröder, Senior Researcher at DFKI GmbH
> Coordinator EU FP7 Project SEMAINE http://www.semaine-project.eu
> Chair W3C Emotion ML Incubator http://www.w3.org/2005/Incubator/emotion
> Portal Editor http://emotion-research.net
> Team Leader DFKI Speech Group http://mary.dfki.de
> Project Leader DFG project PAVOQUE http://mary.dfki.de/pavoque
>
> Homepage: http://www.dfki.de/~schroed
> Email: schroed@dfki.de
> Phone: +49-681-302-5303
> Postal address: DFKI GmbH, Campus D3_2, Stuhlsatzenhausweg 3, D-66123
> Saarbrücken, Germany
> --
> Official DFKI coordinates:
> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
> Geschaeftsfuehrung:
> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
> Dr. Walter Olthoff
> Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
> Amtsgericht Kaiserslautern, HRB 2313
>
Received on Wednesday, 6 August 2008 17:00:06 UTC