Re: [MMSEM] intersting project automatic multimodal annotation

Hi Giovanni,

> Greetings all, I have just been pointed at
> which I didn't know and has an interesting online demo

Thanks for the pointer.
It is true that IBM alphaWorks made nice prototypes. A few years ago I played with their
At the time, it was the first one to generate MPEG-7 descriptions ...

>  From there one finds LSCOM, an "expanded multimedia concept lexicon on
> the order of 1000. Concepts related to events, objects, locations,
> people, and programs have been selected following a multi-step process
> involving input solicitation, expert critiquing, comparison with related
> ontologies, and performance evaluation."
> Peeking inside the "ontology" one finds approximately 850 concepts that
> have been extrapolated and a list of annotations with such terms for
> specific video segments (provided as a training set for classifiers, I
> think)

Well, LSCOM is definitely not an ontology, and I would not use the word "concepts" either.
LSCOM is a set of 900 terms that were mainly produced for the TRECVid challenges. It contains
more or less what the TRECVid participants should automatically recognize in their videos each year. There
is no real structure to these terms, and you will find very different things ...
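To make the flat-list vs. ontology distinction concrete, here is a minimal sketch (the terms and relations below are purely illustrative, not taken from the actual LSCOM list):

```python
# A flat lexicon, LSCOM-style: just a set of term labels, no relations.
# (Terms are illustrative examples, not the real LSCOM entries.)
lscom_terms = {"Male", "Statue", "Restaurant", "Tennis", "Steel_Mill_Worker"}

# An ontology would additionally assert typed relations between the
# concepts, e.g. a subclass hierarchy (again, purely hypothetical):
ontology = {
    "Steel_Mill_Worker": {"subClassOf": "Person"},
    "Male": {"subClassOf": "Person"},
    "Restaurant": {"subClassOf": "Building"},
}

# With a flat list you can only test membership:
has_tennis = "Tennis" in lscom_terms

# With the hierarchy you can ask structural questions,
# e.g. which terms are subclasses of Person:
persons = sorted(t for t, rel in ontology.items()
                 if rel.get("subClassOf") == "Person")

print(has_tennis)
print(persons)
```

The point being that a bag of 900 labels supports only lookup, while an ontology lets you reason over the relations between terms.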

> Terms might be as generic as "male" "statue" "restaurant" but they get
> suspiciously specific at times, with terms such as "Saddam Hussein"
> "Steel Mill worker" "Tennis" "Abused Woman" "Abused Child" (but no
> "Abused_man" for example)

Exactly. These terms match what you have to find in some CNN news during TRECVid :-)


Raphaël Troncy
CWI (Centre for Mathematics and Computer Science),
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
e-mail:
Tel: +31 (0)20 - 592 4093
Fax: +31 (0)20 - 592 4312

Received on Thursday, 14 September 2006 12:37:31 UTC