- From: Véronique Malaisé <vmalaise@few.vu.nl>
- Date: Tue, 16 Sep 2008 12:10:26 +0100
- To: public-media-annotation@w3.org
Sorry about the late fulfillment of my task, I had a writer's block...
and I am not really sure yet whether this document is what was expected,
but I gave it a try and I am waiting for your comments! I have had a look
at the IPTC Standard - Photo Metadata 2008 [1] to draw a set of
requirements for a multimedia description ontology. I also have a question
for the list: do we consider cataloging information too, or only "pure
content" description?
The document [1] is issued by the International Press Telecommunications
Council and is the result of a larger collaboration; it "specifies
metadata properties intended to be used primarily but not exclusively
with photos". More specifically, "IPTC Photo Metadata provides data
about photographs and the values can be processed by software. Each
individual metadata entity is called a property and they are grouped
into Administrative, Descriptive and Rights Related properties." These
metadata could be applied to the description of multimedia documents too,
and some links to other vocabularies are already made: the metadata are
described in natural language and point to possible mappings to the
"G2-Standard" (see [2] for example) and to the XMP [3] representation
format. As for the links to other vocabularies: for instance, the Title
property aligns with the Dublin Core "Title" element, and the properties
marked as (legacy) should be filled with keywords taken from different
controlled vocabularies.
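To make this XMP link a bit more concrete, here is a minimal sketch of how the Title property could be encoded in an XMP packet via its Dublin Core alignment (the example value is invented, and the exact packet layout should of course be checked against [1] and [3]):
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <!-- IPTC "Title" property, aligned with Dublin Core dc:title -->
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Summit handshake</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>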
This set of metadata is aimed primarily at journalists, which explains
some of the modeling choices. It is, in my opinion, a good starting
point (amongst others) for listing mandatory description/metadata items;
nevertheless, it has a number of drawbacks as a generic image/multimedia
description scheme:
- Ambiguous modeling decisions: the "Keywords" property is supposed to
take free-text values rather than values from a keyword list, as one
would expect ("Keywords to express the subject of the content. Keywords
may be free text and don't have to be taken from a controlled
vocabulary."), whereas the "Subject Code" field has to be filled with
terms from a controlled vocabulary, the IPTC Subject NewsCodes [4]
(see the sketch after this list).
- Redundant (and thus ambiguous) modeling decisions: the metadata set
contains Title, Headline and Caption fields that all describe the content
of the image but can/should all be different: it is hard to distinguish
between them if you are not one of the expert users the specification is
aimed at. In a generic multimedia annotation ontology, we could either
align with all of these fields (and find a way to define their semantics
precisely) or, more likely, with only a subset of them.
- Lack of relationships between the fields: there are content description
fields like Event, Location, Person, Object or Artwork Shown on the
image, but one image, and even more so one multimedia document, often
contains more than one event, person or location. Multiple events etc.
can be specified with this description model, but to get satisfactory
answers to precise queries, or to be able to disambiguate between
different documents (particularly relevant in large homogeneous document
collections), a formal relationship between the event, the persons and
the location has to be made. For example, if a picture shows two Heads
of State shaking hands at a Summit attended by other Heads of State, an
explicit relationship has to be made between the ones who are shaking
hands and the event “shaking hands”.
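To illustrate the first point, here is a small sketch, in the XMP encoding as far as I understand it from [1] (the values are invented), of the contrast between the free-text Keywords property and the controlled Subject Code property:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/">
    <!-- "Keywords": free text, no controlled vocabulary required -->
    <dc:subject>
      <rdf:Bag>
        <rdf:li>handshake</rdf:li>
        <rdf:li>summit</rdf:li>
      </rdf:Bag>
    </dc:subject>
    <!-- "Subject Code": values taken from the IPTC Subject NewsCodes [4] -->
    <Iptc4xmpCore:SubjectCode>
      <rdf:Bag>
        <rdf:li>11000000</rdf:li> <!-- "politics" in the Subject NewsCodes -->
      </rdf:Bag>
    </Iptc4xmpCore:SubjectCode>
  </rdf:Description>
</rdf:RDF>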
The StructuredAnnotation of MPEG-7 (see [5] and the example below) makes
it possible to state such a relationship explicitly; more generally, I
think that an annotation system based on graphs that make the
relationships between the Who/What/When/Where/Why/How explicit would
improve browsing and searching in multimedia document collections.
Example of a StructuredAnnotation, taken from [5]:
<StructuredAnnotation>
  <Who>
    <Name xml:lang="en">Zinedine Zidane</Name>
  </Who>
  <WhatAction>
    <Name xml:lang="en">Zinedine Zidane scoring against England.</Name>
  </WhatAction>
</StructuredAnnotation>
The NewsML ontology [6], combined with Named Graphs, could also express
this type of link. I think that the possibility of such graphs should be
present in a multimedia annotation schema, to enable annotations that
are as precise as possible; the relationships between the different
metadata elements (person/event/location) could in some cases be derived
automatically (from text, or from the context in the video flow or still
image), so having the possibility to integrate this context in an
annotation would bring added value, in my opinion. And I would be very
interested to know what you think about this point!
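Just to sketch what I have in mind (the vocabulary below is purely hypothetical, only the idea of an explicit graph matters), the handshake example could be annotated with something like the following RDF/XML, which could in turn live in a Named Graph attached to the photo:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/annotation#">
  <!-- ex: is an invented annotation vocabulary, for illustration only -->
  <rdf:Description rdf:about="http://example.org/events/handshake-1">
    <rdf:type rdf:resource="http://example.org/annotation#Event"/>
    <ex:action>shaking hands</ex:action>
    <!-- only the two Heads of State who shake hands are agents of this event -->
    <ex:agent rdf:resource="http://example.org/people/head-of-state-1"/>
    <ex:agent rdf:resource="http://example.org/people/head-of-state-2"/>
    <ex:location rdf:resource="http://example.org/places/summit-venue"/>
    <ex:depictedIn rdf:resource="http://example.org/photos/summit-photo"/>
  </rdf:Description>
</rdf:RDF>
The other Heads of State attending the Summit would be related to the Summit event, but not to the handshake itself, which is exactly the kind of distinction the flat property list cannot make.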
[1] http://www.iptc.org/std/photometadata/2008/specification/IPTC-PhotoMetadata-2008_2.pdf
[2] http://www.newsml.org/pages/
[3] http://www.adobe.com/products/xmp/
[4] http://www.iptc.org/NewsCodes/
[5] http://www.w3.org/2005/Incubator/mmsem/XGR-mpeg7/
[6] http://homepages.cwi.nl/~troncy/research.html
Received on Tuesday, 16 September 2008 10:09:18 UTC