Intro and 'use cases'

Hi

I'm David Singer, multimedia standards guy at Apple;  I attend 
various standards bodies and track a few more (Blu-ray, MPEG, IETF 
multimedia, 3GPP multimedia, and so on, as well as the W3C).  I'm the 
editor and chair of the MP4 file format spec/group.

At MPEG I socialize(d) with the MPEG-7 and MPEG-21 people, but I 
wasn't involved in those standards directly (apart from file format 
issues).

Considerations for media annotation:

I'd like to draw a distinction between material that is needed for 
the web to function properly -- attributes, and so on -- and 
annotation which is there to help explain or connect at a semantic 
level.  I know that this distinction is not hard and fast, but 
nonetheless I think it's useful.  For example, 'what codec(s) are 
used in the content' is really a key characteristic of the content 
and directly affects interop.  'What is the title of the media file?' 
is really an annotative question.  I am not convinced that expressing 
accessibility aspects is an annotation question, as it directly 
affects the choice and configuration of the file presented to a user. 
I posted a suggestion recently to public-html 
<http://lists.w3.org/Archives/Public/public-html/2008Sep/0118.html>.

There has recently been work in the image space on settling on a 
select few tags with well-defined meanings that images are 
recommended to support -- even though the group does *not* mandate 
the way they are stored in any given format.  This 
means that any image can be queried for (for example) its copyright 
string and this has a well-defined common meaning.

Common meaning is a useful term here.  Annotations are usually not 
easily convertible between annotation systems or vocabularies.  
"Is what this system calls the 'name' of the work the same as what 
that system calls the 'title'?"  The answer is often something soft 
like "well, usually, except...".  This causes problems in two areas:
a) converting from one format to another;
b) making uniform queries of a disparate set of resources ('please 
catalog by title all the works in this collection').

The latter is very much a W3C problem, as an embedded media element 
may be in a variety of formats, or even (in HTML5) have different 
alternative forms.  We nonetheless want to be able to do uniform 
queries.

There are two solutions, perhaps, to this problem:  (a) relate all 
media annotation systems by means of a firm semantic background, so 
that a machine translator can do the best it can ('the tag called 
title is the formal_name of the work', 'the tag called author is the 
formal_name of the person who created the words of the work');  (b) 
define a small set of tags that we encourage every standard to 
implement.

We prefer (b) now;  (a) is a research project, not a standards 
activity.  As a basis here, we'd like to consider the 
very-commonly-used ID3 tags (to the extent that they are defined).
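
To make option (b) concrete, here is a minimal Python sketch of a 
uniform query over a small common tag set.  The common names (title, 
artist, album, copyright), the COMMON_TAGS table and both helper 
functions are hypothetical, chosen only for illustration;  the native 
names are the usual ID3v2 frame IDs, iTunes-style MP4 atom names and 
Vorbis comment field names.  It assumes each file's tags have already 
been parsed into a plain dictionary.

```python
# Hypothetical common vocabulary mapped to each format's native tag name.
# "\u00a9" is the copyright sign that prefixes iTunes-style MP4 atoms.
COMMON_TAGS = {
    "title":     {"id3": "TIT2", "mp4": "\u00a9nam", "vorbis": "TITLE"},
    "artist":    {"id3": "TPE1", "mp4": "\u00a9ART", "vorbis": "ARTIST"},
    "album":     {"id3": "TALB", "mp4": "\u00a9alb", "vorbis": "ALBUM"},
    "copyright": {"id3": "TCOP", "mp4": "cprt",      "vorbis": "COPYRIGHT"},
}


def query(common_name, fmt, native_tags):
    """Return the value of a common tag from a native tag dict, or None."""
    native_name = COMMON_TAGS[common_name][fmt]
    return native_tags.get(native_name)


def catalog_by_title(resources):
    """Sort (fmt, native_tags) pairs by common title; untitled ones go last."""
    titled = [(query("title", fmt, tags), fmt) for fmt, tags in resources]
    return sorted(titled, key=lambda pair: (pair[0] is None, pair[0] or ""))
```

The caller never needs to know which container a resource uses:  
query("copyright", "mp4", tags) and query("copyright", "id3", tags) 
ask the same question of two different formats.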

There are cross-site security concerns that we should consider here; 
IMG has limited media annotation because of this.  [The issue comes 
up when you construct a public web page that I load that also loads a 
multimedia resource from within my security envelope -- e.g. internal 
to Apple -- and then use scripts to interrogate that resource and 
send the results outside the envelope.]

Internationalization needs to be considered;  we may want to be able 
to tag the name of the movie as presented in Spanish, or as presented 
in Spain, as well as its normal Mongolian name.  But care needs to be 
taken;  'what is the Mongolian copyright' should not get the answer 
'none' if there is an established copyright in a jurisdiction that 
Mongolian copyright law recognizes.
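
One way to picture this kind of lookup is the sketch below, in 
Python.  It is an assumption on my part, not a proposal:  it keys each 
value by a language tag such as "es" or "es-ES" and falls back by 
dropping trailing subtags, then to a default tag.  The function name 
lookup and the default_tag parameter are hypothetical.

```python
def lookup(values, requested, default_tag=None):
    """Pick the value for a language tag, falling back by truncation.

    values maps language tags (e.g. "es", "es-ES", "mn") to strings.
    """
    by_tag = {tag.lower(): value for tag, value in values.items()}
    tag = requested.lower()
    while tag:
        if tag in by_tag:
            return by_tag[tag]
        # Drop the last subtag: "es-mx" becomes "es", then "".
        tag = tag.rpartition("-")[0]
    if default_tag is not None:
        return by_tag.get(default_tag.lower())
    return None


titles = {
    "mn": "(Mongolian name)",
    "es": "(name as presented in Spanish)",
    "es-ES": "(name as presented in Spain)",
}
spain_title = lookup(titles, "es-ES")                    # exact match
mexico_title = lookup(titles, "es-MX")                   # falls back to "es"
french_title = lookup(titles, "fr", default_tag="mn")    # falls back to default
```

The default fallback matters for the copyright case:  a query scoped 
to one language or region should be able to return the established 
value rather than 'none' just because no entry carries that exact tag.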

We may need intrinsic annotation ('within the media file') and also 
extrinsic ('associated with the media file').  Again considering that 
HTML5 allows for codec-variants of the same resource, it may be 
easier to say
<video...>
   <source src=".../x.ogg" annotations=".../x.w3c_annot"/>
   <source src=".../x.mp4" annotations=".../x.w3c_annot"/>
</video>
i.e. associate the same set of annotations with multiple disparate media files.
-- 
David Singer
Apple/QuickTime

Received on Tuesday, 23 September 2008 00:45:18 UTC