Spoken language, multimedia and metadata

Dear colleagues,

First of all, please excuse my naive request to a forum that I am not too familiar with.

Please also include me directly in any reply so that I am sure to receive it.

I am working on (automatic) indexing of spoken language for information access. One of the observations we made is that a lot of metadata already exists for spoken data and multimedia content. In the thesis I am writing right now, I also make a strong case that multimedia content can replace many written resources as long as indexing is available.

Of course, the same indices as in written language still apply; those are mostly semantic information and information that relates documents to each other. On top of that, the following metadata might be available:

* speaker, location, time

* recording equipment, audio/video/slides/monitor capture available

* privacy/access constraints

* emotional status

* segmentation information

* speaking style

* register (meeting, lecture, speech)

On the utterance level we might also be able to attach information such as stress, certainty, attention status, posture, and so on.

It'd be great to hear from some of the RDF wizards here whether it would be interesting to integrate this kind of information. Having worked enough with object-oriented logics in the past, I realize that this is a different type of information if we restrict ourselves to an ontology; a more general logic, however, would be able to handle it.
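To make the question concrete, here is a minimal sketch of how metadata like the items listed above might be expressed as RDF-style (subject, predicate, object) triples. The namespace, property names, and resource identifiers are all hypothetical, chosen purely for illustration, not drawn from any existing vocabulary:

```python
# Sketch: spoken-language metadata as RDF-style triples.
# The namespace and property names below are hypothetical.

NS = "http://example.org/speech-metadata#"  # hypothetical namespace

def triple(subject, prop, obj):
    """Build one (subject, predicate, object) triple in the sketch namespace."""
    return (subject, NS + prop, obj)

# Describe a hypothetical recorded lecture.
doc = "recording-042"
triples = [
    triple(doc, "speaker", "J. Doe"),
    triple(doc, "location", "Newell Simon Hall"),
    triple(doc, "register", "lecture"),
    triple(doc, "captureAvailable", "audio+slides"),
    triple(doc, "accessConstraint", "internal-only"),
    # Utterance-level annotations attach to a segment, not the whole document.
    triple(doc + "#utt-7", "certainty", "low"),
    triple(doc + "#utt-7", "stress", "emphatic"),
]

# Simple query: collect all properties recorded for the document itself.
doc_props = {p.rsplit("#", 1)[1]: o for (s, p, o) in triples if s == doc}
print(doc_props["register"])  # lecture
```

The point of the sketch is that both document-level metadata (speaker, register) and utterance-level annotations (certainty, stress) fit the same triple model, with the utterance identified as a fragment of the recording.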

Thanks a lot,


Klaus Ries

4629 Newell Simon Hall
Carnegie Mellon University
5000 Forbes Ave
Pittsburgh, PA 15213-3890

phone (412) 268-6594
fax   (412) 268-6298


15 Forbes Terrace
Pittsburgh, PA, 15217-1413

phone: (412) 422-2218

Received on Monday, 18 September 2000 16:38:33 UTC