- From: <tobias.buerger@sti2.at>
- Date: Mon, 15 Sep 2008 16:18:17 +0200 (CEST)
- To: public-media-annotation@w3.org
Dear all,

As some of the participants in the working group did not participate in the Multimedia Semantics XG [1] and may not be aware of the reports produced there, I proposed to summarize the use cases which are relevant for the work of the Media Annotation Working Group and to extract some relevant requirements.

First of all, the MMSEM multimedia interoperability report [2] summarizes the work of the XG in its use cases as detailed in [3], spots interoperability issues among these use cases, and demonstrates how semantic technologies can help to overcome some of these issues. The report covers the following use cases:

(1) The photo use case, which covers the extraction of semantics from photos, their annotation, and cross-application compatibility of annotation and organization tools and systems.

(2) The music use case, which deals with the annotation of different aspects of music on the Web, the interoperability of different vocabularies and standards, and the aggregation of related information.

(3) The news use case, which concerns the annotation of news items that are mostly available on the Web as textual information illustrated by images, videos or audio files. The use case contains an explanation of different standards and vocabularies for describing news content.

(4) The tagging use case, which tackles the problem of interoperability and portability of tagging systems and personal tags. The use case sketches a solution based on SKOS Core [4].

(5) The semantic media analysis use case, which highlights challenges in media analysis and shows how to exploit different modalities of multimedia content for analysis.

(6) The algorithm representation use case, which is about the interoperability of existing multimedia analysis systems in terms of descriptions of their input and output.

Of particular relevance to the work of the Media Annotation Working Group are, I think, the photo, music, news and semantic media analysis use cases.
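To give a rough idea of the SKOS-based approach sketched in the tagging use case: a personal tag can be modelled as a skos:Concept with preferred and alternative labels, which makes it portable between tagging systems. The following is a minimal illustration, not taken from the XG report; the tag, labels and base URI are invented for the example.

```python
# Minimal sketch: model a personal tag as a skos:Concept and emit
# N-Triples, using only the standard library. The example tag "sf",
# its labels and the example.org base URI are invented for illustration.

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def tag_as_skos(base_uri, tag, pref_label, alt_labels=()):
    """Return N-Triples lines describing a tag as a skos:Concept."""
    concept = f"<{base_uri}{tag}>"
    triples = [
        f"{concept} <{RDF}type> <{SKOS}Concept> .",
        f'{concept} <{SKOS}prefLabel> "{pref_label}"@en .',
    ]
    # altLabel carries the variant spellings users actually typed,
    # so two systems can recognise the same underlying concept.
    for alt in alt_labels:
        triples.append(f'{concept} <{SKOS}altLabel> "{alt}"@en .')
    return triples

for line in tag_as_skos("http://example.org/tags/", "sf",
                        "San Francisco", ["SF", "Frisco"]):
    print(line)
```

Because the output is plain RDF, tags exported this way can be merged or mapped with vocabularies from other sites, which is the portability the use case is after.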
These use cases discuss the variety of content and description schemes available on the Web. For example, the photo use case is motivated by the need for a common exchange format for photo annotations, to enable finding, sharing and reusing photos across the borders of individual sites and tools. The use case discusses the pros and cons of EXIF, XMP, PhotoRDF, DIG35 and MPEG-7 for use as a lingua franca among tools and sites. The use case authors conclude that none of these standards is perfectly suited and that a "limited and simple but at the same time comprehensive vocabulary in a machine-readable, exchangeable, but not over complicated representation is needed" [2].

The music use case discusses the integration of different description schemes for audio files and the integration of further information. Discussed formats include Ogg Vorbis, ID3 and the Music Ontology. The authors of the use case also present typical metadata fields used to describe music data.

The semantic media analysis use case most notably highlights the integration of information across modalities, e.g. relating persons mentioned in an audio track to their depiction in a video, relating image captions to objects in the image, or relating text fragments to the objects in an image which illustrates the text. Making such cross-modality links possible demands basic interoperability between audio-, video- and/or image-related description schemes. Furthermore, the use case highlights the necessity of linking low-level features to high-level semantics, which is important for some retrieval scenarios. Relating semantics across modalities may also support reasoning mechanisms that infer further high-level concepts.

The report contains many more details, so please have a look at [1] and [3] for a more detailed introduction to the use cases.
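The common-exchange-format idea behind the photo and music use cases can be sketched as a simple field mapping: scheme-specific metadata fields (EXIF tags, ID3 frames) are translated onto one shared vocabulary so that different media types and tools become comparable. The mapping below is a small invented example, not the vocabulary the working group will define; only a few real EXIF tag and ID3v2 frame names are used for illustration.

```python
# Sketch: map metadata fields from different schemes onto a common
# vocabulary. FIELD_MAP is an invented, incomplete mapping; the EXIF
# tag names and ID3v2 frame identifiers shown are real, but the
# common keys ("creator", "created", ...) are assumptions.

FIELD_MAP = {
    "exif": {"DateTimeOriginal": "created",
             "Artist": "creator",
             "ImageDescription": "description"},
    "id3":  {"TIT2": "title",      # ID3v2 title frame
             "TPE1": "creator",    # ID3v2 lead performer frame
             "TDRC": "created"},   # ID3v2 recording time frame
}

def to_common(scheme, record):
    """Translate a scheme-specific record into common-vocabulary keys,
    silently dropping fields the mapping does not know about."""
    mapping = FIELD_MAP[scheme]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

photo = to_common("exif", {"Artist": "A. Example",
                           "DateTimeOriginal": "2008:09:15 16:18"})
song = to_common("id3", {"TPE1": "A. Example", "TIT2": "Demo"})
assert photo["creator"] == song["creator"]  # same key across media types
```

The point of the sketch is only that once both records use the same keys, a search for a creator works across photo and music collections alike; real interoperability additionally needs agreed value formats (dates, names), which is exactly where the use cases found the existing standards lacking.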
Some basic requirements which I can extract from these use cases, and from more general observations, for our common media ontology are:

(1) The predominant media types (besides text) on the Web are images, video and audio files.

(2) Semantics and annotations coming from authoring, organisation, and sharing tools should be preserved as much as possible.

(3) Descriptions should be exchangeable between sites and tools.

(4) Linking between different media types, or fragments thereof, should be possible. This connects the work of the Media Annotation Working Group to the work of the Media Fragments Working Group.

(5) Despite the fact that MPEG-7 is excluded from the focus of the group, it makes sense for some use cases to keep descriptions of low-level semantics. This can, however, be accomplished with mechanisms like GRDDL [5], as proposed for example by the ramm.x model [6].

Perhaps we can build upon these five points.

Best,
Tobias

[1] http://www.w3.org/2005/Incubator/mmsem/
[2] http://www.w3.org/2005/Incubator/mmsem/XGR-interoperability/
[3] http://www.w3.org/2005/Incubator/mmsem/wiki/
[4] http://www.w3.org/TR/swbp-skos-core-guide
[5] http://www.w3.org/2004/01/rdxh/spec
[6] http://sw.joanneum.at/rammx/

--
Dipl.-Inf. Univ. Tobias Bürger
STI Innsbruck
University of Innsbruck, Austria
http://www.sti-innsbruck.at/
tobias.buerger@sti2.at
Received on Monday, 15 September 2008 14:31:31 UTC