- From: Véronique Malaisé <vmalaise@few.vu.nl>
- Date: Tue, 16 Sep 2008 12:10:26 +0100
- To: public-media-annotation@w3.org
Sorry about the late fulfillment of my task, I had a writer's block... and I am not really sure yet whether this document is what was expected, but I gave it a try and I am waiting for your comments!

I have had a look at the IPTC Standard - Photo Metadata 2008 [1] to draw a set of requirements for a MM description ontology. I also have a question for the list: do we consider cataloging information too, or only "pure content" description?

The document [1] is issued by the International Press Telecommunications Council and is the result of a larger collaboration; it "specifies metadata properties intended to be used primarily but not exclusively with photos". More specifically, "IPTC Photo Metadata provides data about photographs and the values can be processed by software. Each individual metadata entity is called a property and they are grouped into Administrative, Descriptive and Rights Related properties."

These metadata could be used to describe multimedia documents too, and some links to other vocabularies and formats are already made: the properties are described in natural language, with possible links to the "G2-Standard" (see [2] for example) and to the XMP representation format [3]. For instance, the Title property aligns with the Dublin Core "Title" element, and the properties marked as (legacy) should be filled with keywords from various controlled vocabularies. A rough sketch of what such an XMP representation could look like follows the list below.

This set of metadata is aimed primarily at journalists, which explains some of the modeling choices. It is, in my opinion, a good starting point (amongst others) for listing mandatory description/metadata items; nevertheless, it has a number of drawbacks for a generic image/multimedia description scheme:

- Ambiguous modeling decisions: the "Keywords" property is supposed to take a free-text value, not a keyword value as its name suggests ("Keywords to express the subject of the content. Keywords may be free text and don't have to be taken from a controlled vocabulary."), whereas the "Subject Code" field has to be filled with terms from a controlled vocabulary, the IPTC Subject NewsCodes [4].

- Redundant (and thus ambiguous) modeling decisions: the metadata set contains Title, Headline and Caption fields that all describe the content of the image but can/should all be different; it is hard to distinguish between them if you are not one of the expert users the Specification is aimed at. In a generic multimedia annotation ontology, we could either align with all of these fields (and find a way to define their semantics precisely) or, more likely, with only a subset of them.

- Lack of relationships between the fields: there are content description fields such as Event, Location, Person, and Object or Artwork Shown on the image, but one image, and even more so one multimedia document, often contains more than one event, person or location. Multiple Events etc. can be specified with this description model, but to get satisfactory answers to precise queries, or to disambiguate between different documents (particularly relevant in large homogeneous document collections), a formal relationship between the event, the person and the location has to be stated. For example, if a picture shows two Heads of State shaking hands at a Summit attended by other Heads of State, an explicit relationship has to be made between the ones who are shaking hands and the event "shaking hands".
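To make the redundancy point more concrete, here is a rough, hand-written sketch of how the Title, Headline and Description (caption) properties, together with the Keywords and Subject Code, could look in an XMP serialization [3]. The property names and namespaces follow my reading of the IPTC Core XMP mapping and should be checked against [1]; the Subject NewsCode value is only illustrative.

<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
    xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/">

   <!-- Title: aligned with the Dublin Core "title" element -->
   <dc:title>
    <rdf:Alt><rdf:li xml:lang="x-default">Summit handshake</rdf:li></rdf:Alt>
   </dc:title>

   <!-- Headline and Description (caption) may carry almost the same text:
        the distinction between the three fields is left to the annotator -->
   <photoshop:Headline>Two Heads of State shake hands at the Summit</photoshop:Headline>
   <dc:description>
    <rdf:Alt><rdf:li xml:lang="x-default">Two Heads of State shaking hands at a
      Summit attended by other Heads of State.</rdf:li></rdf:Alt>
   </dc:description>

   <!-- Keywords: free text, no controlled vocabulary required -->
   <dc:subject>
    <rdf:Bag>
     <rdf:li>summit</rdf:li>
     <rdf:li>handshake</rdf:li>
    </rdf:Bag>
   </dc:subject>

   <!-- Subject Code: taken from the IPTC Subject NewsCodes [4];
        the value below is only an illustration -->
   <Iptc4xmpCore:SubjectCode>
    <rdf:Bag><rdf:li>11000000</rdf:li></rdf:Bag>
   </Iptc4xmpCore:SubjectCode>

  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>

Even in this small example, the Title, Headline and Description overlap almost completely; this is the kind of ambiguity I have in mind.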
Coming back to the lack of relationships: the StructuredAnnotation construct of MPEG-7 (see [5] and the example below) makes it possible to state such a relationship explicitly. More generally, I think that an annotation system based on graphs that make the relationships between the Who/What/When/Where/Why/How explicit would improve browsing and searching in multimedia document collections.

Example of StructuredAnnotation, taken from [5]:

<StructuredAnnotation>
 <Who>
  <Name xml:lang="en">Zinedine Zidane</Name>
 </Who>
 <WhatAction>
  <Name xml:lang="en">Zinedine Zidane scoring against England.</Name>
 </WhatAction>
</StructuredAnnotation>

The NewsML ontology [6], combined with Named Graphs, could also enable this type of link. I think that the possibility of expressing such graphs should be present in a multimedia annotation schema, to allow annotations that are as precise as possible. The relationships between the different metadata elements (person/event/location) could in some cases be derived automatically, from text or from the context in the video stream or still image, so being able to integrate this context in an annotation would bring added value, in my opinion. And I would be very interested to know what you think about this point!
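To sketch the kind of graph I have in mind for the handshake example, with a completely made-up vocabulary (the ex: terms below are not a proposal for actual property names), an RDF serialization could look like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/mm-annotation#">

 <!-- the depicted event, explicitly linked to its participants,
      its enclosing event and its location -->
 <ex:Event rdf:about="http://example.org/events/handshake">
  <ex:actionLabel xml:lang="en">shaking hands</ex:actionLabel>
  <ex:agent rdf:resource="http://example.org/persons/headOfStateA"/>
  <ex:agent rdf:resource="http://example.org/persons/headOfStateB"/>
  <ex:partOf rdf:resource="http://example.org/events/summit"/>
  <ex:location rdf:resource="http://example.org/places/summitVenue"/>
 </ex:Event>

 <!-- the other Heads of State are attendees of the Summit,
      not agents of the handshake -->
 <ex:Event rdf:about="http://example.org/events/summit">
  <ex:attendee rdf:resource="http://example.org/persons/headOfStateC"/>
 </ex:Event>

 <!-- the picture is linked to the event it depicts -->
 <rdf:Description rdf:about="http://example.org/photos/1234">
  <ex:depicts rdf:resource="http://example.org/events/handshake"/>
 </rdf:Description>

</rdf:RDF>

With such a graph, a query for "Heads of State shaking hands" only retrieves the persons attached to the handshake event, not every attendee of the Summit; and a Named Graphs serialization (TriG, for instance) could additionally record where each statement comes from (manual annotation, text analysis, etc.).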
[1] http://www.iptc.org/std/photometadata/2008/specification/IPTC-PhotoMetadata-2008_2.pdf
[2] http://www.newsml.org/pages/
[3] http://www.adobe.com/products/xmp/
[4] http://www.iptc.org/NewsCodes/
[5] http://www.w3.org/2005/Incubator/mmsem/XGR-mpeg7/
[6] http://homepages.cwi.nl/~troncy/research.html