[EmotionML] implementation release and feedback

Hello all,
I'm happy to announce that we released the very first version of our
EmotionML Java implementation. It is hosted on google code and released
under the MIT license: https://code.google.com/p/loria-synalp-emotionml/
It is still considered an alpha version, and we need some users to
validate its use. There is still some work to do on the documentation, but the
core of the code is there.

If we could be listed as an implementation in the next round of the
implementation report it would be nice. Here is the description:

Alexandre Denis, LORIA laboratory, SYNALP team, France
The LORIA/SYNALP implementation of EmotionML is a Java standalone library
developed in the context of the ITEA Empathic Products project by the
LORIA/SYNALP team. It imports Java objects from EmotionML XML
files and exports them back to EmotionML. It guarantees standard
compliance by performing a two-step validation after every export operation
and before every import operation: first the document is validated against
the EmotionML schema, then all EmotionML assertions are tested. If either
step fails, an error message is produced and the document cannot be imported
or exported. The library contains a corpus of badly formatted EmotionML files
that makes it possible to double-check that the schema and the assertions
correctly reject them. The API is hosted on Google Code (
https://code.google.com/p/loria-synalp-emotionml/) and is released under
the MIT License.
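
A minimal sketch of what such a two-step check can look like, using only the
standard JAXP classes. This is not the API of the LORIA/SYNALP library: the
class name, the toy schema and the toy vocabulary below are all assumptions
made for illustration, and the real EmotionML schema and assertion list are
much larger.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch, not the LORIA/SYNALP library API.
final class TwoStepValidation {

    // Toy stand-in for the EmotionML schema (assumption: no namespace,
    // an <emotion> holding one or more <category name="..."/> children).
    static final String TOY_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='emotion'><xs:complexType><xs:sequence>"
        + "<xs:element name='category' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:attribute name='name' type='xs:string' use='required'/>"
        + "</xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Toy vocabulary (assumption); membership cannot be expressed in the toy schema.
    static final Set<String> TOY_VOCABULARY =
        new HashSet<>(Arrays.asList("happy", "sad"));

    /** Returns null if both steps pass, otherwise a message naming the failed step. */
    static String validate(String xml) {
        // Step 1: schema validation.
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(TOY_XSD)))
                .newValidator()
                .validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            return "schema: " + e.getMessage();
        }
        // Step 2: an assertion the schema does not cover (vocabulary membership).
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList categories = doc.getElementsByTagName("category");
            for (int i = 0; i < categories.getLength(); i++) {
                String name = ((Element) categories.item(i)).getAttribute("name");
                if (!TOY_VOCABULARY.contains(name)) {
                    return "assertion: unknown category name '" + name + "'";
                }
            }
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return "parse: " + e.getMessage();
        }
        return null;
    }
}
```

With this sketch, a document missing the "name" attribute is stopped at the
schema step, while a document that is schema-valid but uses a name outside the
toy vocabulary is stopped at the assertion step.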

Moreover, I don't come empty-handed: I have a number of
remarks about the EmotionML specification. Sorry to give you more work!
Best regards,
Alexandre Denis

*** Comments about EmotionML specification

In what follows:
- "specification" refers to the document at
http://www.w3.org/TR/2013/PR-emotionml-20130416/ (version of 16 April 2013)
- "assertions" refers to the list of assertions at
- "schema" refers to the schemas
http://www.w3.org/TR/emotionml/emotionml.xsd and

** Specification clarification questions
- About relative and absolute timing?
- Is it possible to mix relative and absolute timing? Intuitively this
would seem odd, but nothing in the
specification prevents it.

- About consistency of start/end/duration?
- I think the specification does not enforce the consistency of start, end
and duration, which may
all be present together. Hence it is possible to have inconsistent triplets
(start=0, end=5, duration=10).
- About text nodes?
- The emotion element can have text node children, but it is not specified how
many. Is it possible to intersperse text nodes all over
an emotion element? The fact that an emotion element can have text
children is not mentioned in its list of children.
- About emotion children combinations?
- The specification states "There are no constraints on the combinations of
children that are allowed." This may be confusing, since
an emotion cannot contain two categories that belong to different
category-sets or two categories with the same name.
- About default values?
- Some attributes have default values (reference role, time ref anchor
point, duration, etc.). Is it desirable to have a default
value for other attributes as well, especially for the "value" attribute? For
instance, how would you compare <category name="surprise"/>
and <category name="surprise" value="1.0"/>? Are they semantically
equivalent? A similar question could be asked about the "confidence"
attribute: how would you compare <category name="surprise"/> and <category
name="surprise" confidence="1.0"/>?
- About the number of <trace>?
- The specification does not state clearly whether it is possible to have
several <trace> elements inside a descriptor; it only says
"a <trace> element". Maybe it should say "If present, the following
child element can occur one or more times: <trace>".
The schema allows that. If this comment is accepted, assertions 215,
224, 235, 245 should also be clarified.
- About conformance?
- In section 4.3, it is stated "It is the responsibility of an EmotionML
processor to verify that the use of descriptor names and values
is consistent with the vocabulary definition", which is true but incomplete
with regard to the assertions.
Maybe it would be beneficial to list all the assertions that are not
the schema's responsibility but rather the EmotionML processor's
(see below), or at least to warn that there are many assertions not checked by
the schema.

** Discrepancies between schema/assertions/specification
- Assertions not tested by the schema
- I found that the following assertions are not tested by the schema: 114,
117, 120, 123, 161, 164, 167, 170, 172, 210, 212,
216, 220, 222, 224, 230, 232, 236, 240, 242, 246, 410, 417.
 There are assertions that I think are impossible to test with an XSD schema:
114, 117, 120, 123, 161, 164, 167, 170 : vocabulary set id and type checking
212, 222, 232, 242 : vocabulary name membership
417 : media type (unless enumerating them)
 Some may be possible with some tweaking:
210, 220, 230, 240 : vocabulary set presence
216, 224, 236, 246 : <trace> and "value"
 There are two "true" errors I think:
172 : The "version" attribute of <emotion>, if present, MUST have the
 value "1.0"
I think it should not be "optional with default value 1.0" but rather
"optional with fixed value 1.0"
410 : The <reference> element MUST contain a "uri" attribute
the "uri" attribute is optional by default in the schema
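
To illustrate the difference between "default" and "fixed" raised for
assertion 172, here is a small sketch using the standard JAXP validation API.
The one-element schema built in the code is a toy written for this example,
not the actual emotionml.xsd, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class; the schema is a toy, not emotionml.xsd.
final class VersionConstraintDemo {

    /**
     * Validates xml against a toy schema in which the optional "version"
     * attribute of <emotion> is declared with constraint="1.0", where
     * constraint is either "default" or "fixed".
     * Returns false if validation (or schema compilation) fails.
     */
    static boolean accepts(String constraint, String xml) {
        String xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='emotion'><xs:complexType>"
            + "<xs:attribute name='version' type='xs:string' "
            + constraint + "='1.0'/>"
            + "</xs:complexType></xs:element></xs:schema>";
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document (or the schema itself) was rejected
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for in-memory input
        }
    }
}
```

Under these assumptions, accepts("default", "<emotion version='2.0'/>") lets
the document through, whereas accepts("fixed", "<emotion version='2.0'/>")
rejects it; an absent attribute or version='1.0' passes in both cases. For
assertion 410 the analogous schema-level fix would be use="required" on the
"uri" attribute.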

- 2.4.1, "The end value MUST be greater than or equal to the start value",
- the schema does not check it and there is no assertion enforcing it
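
Since XSD 1.0 cannot compare two attribute values, a check like this has to
live in the processor. A minimal sketch follows; the class and method names
are hypothetical, and the rule "duration must equal end - start when all three
are given" is my own assumption about what consistency would mean, since the
specification does not state it.

```java
// Hypothetical helper, not part of any existing EmotionML library.
final class TimingCheck {

    /**
     * Checks start/end/duration values given in milliseconds.
     * A null argument means the attribute is absent.
     * Returns null when no problem is found, otherwise a message.
     */
    static String check(Long start, Long end, Long duration) {
        // Rule from section 2.4.1: end MUST be >= start.
        if (start != null && end != null && end < start) {
            return "end must be greater than or equal to start";
        }
        // Assumed rule (not in the specification): the triplet must agree.
        if (start != null && end != null && duration != null
                && end - start != duration) {
            return "duration does not match end - start";
        }
        return null;
    }
}
```

For example, check(0L, 5L, 10L) reports a mismatch, while check(0L, 5L, 5L)
and check(null, 5L, 10L) return null.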

- 2.1.2, "a typical use case is expected to be embedding an <emotion> into
some other markup",
- there is no assertion that describes that <emotion> may be embedded in
another markup; does it imply we could embed other elements?
- is a document containing a sole <emotion> a valid document (not in the
sense of an <emotionml> document)? If yes, maybe an assertion clarifying the
use of <emotion> would be useful.

- assertions 105, 155, 601, 606, status "Req=N"
- the assertions mix the presence of <info> and the number of <info>
elements. While the presence is not restricted, the number
MUST be 0 or 1; hence the required status for this part of the assertions
should be "Req=Y"

- 2.1.2, "There are no constraints on the order in which children occur"
- the schema does actually restrict the order of elements, <info> needs to
be first, then the descriptors, then the references

** Invalid documents
(I have not systematically tested examples with non-valid vocabulary URIs
such as http://www.example....)

- http://www.w3.org/TR/emotion-voc/xml does not comply with assertion 110
(hence all examples that refer to vocabularies there also fail)

- 2.3.3 The <info> element
- The last example of this section does not comply with assertion 212 since
the name "neutral" does not belong to every-day categories

- 5.1.1 Annotation of Text, "Annotation of text" Lewis Carroll example:
- In the <meta:doc> element, the character & is found, which is not
well-formed XML; it should be &amp; (the same applies to the example below)
- It also does not comply with assertion 212 since Disgust and Anger are
not part of every-day categories
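
The ampersand problem can be reproduced with any XML parser. A minimal sketch
with the standard JAXP DOM parser follows; the class name is hypothetical, and
the sample strings mentioned afterwards are made up for illustration and are
not the text of the specification example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration only.
final class WellFormedness {

    /** Returns true if the string parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // e.g. a bare '&' that is not part of an entity reference
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e); // not expected for in-memory input
        }
    }
}
```

Here isWellFormed("<doc>Alice & Bob</doc>") returns false, and
isWellFormed("<doc>Alice &amp; Bob</doc>") returns true. The default parser
may also print a fatal-error line to standard error for the first case.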

Received on Thursday, 2 May 2013 09:43:51 UTC