RE: Elaborating on the Video use case

I think we need to think a little about what is actually interacting 
with the media resource.

Perhaps the easiest use-case is a web site that allows for upload, 
and wants to display selected information about the media clips it 
gets.  That use-case asks not only for metadata but also 
web/DOM-level APIs that give uniform access to selected metadata 
across a variety of file formats.

Search engines are, in some senses, out of scope (but read on).  They 
crawl the web and find media files, but how they extract the 
metadata from those media files is a private arrangement for them 
('off the web'); they don't use web APIs (I think) and they can 
handle whatever formats they like in whatever way they like.  The 
sense in which this is in scope is that we'd like search/index 
engines, too, to be able to index selected metadata uniformly 
across a variety of formats, so again we need some level of semantic 
match for those metadata elements across formats.

The semantic (mis)match problem is easily illustrated.  Consider two 
metadata systems:

A has tags for Title, Artist
B has tags for Title, Sub-Title, Artist, Composer

We find the same work in these two formats:
A   Title="Dvorak Symphony 6, II Adagio", Artist="BBC Symphony Orchestra"
B   Title="Symphony 6", Sub-Title="II Adagio", Artist="BBC Symphony 
Orchestra", Composer="Dvorak, Antonin"

What does the DOM API return when the script asks for "Artist" -- 
does the composer get included from file B, even though in A he's 
been put in the title (faute de mieux)?  Indeed, does the first file 
ever get indexed under the name of Dvorak?  And so on.
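
To make the mismatch concrete, here is a rough Python sketch of the two 
records above and a naive "uniform" lookup that just reads the tag of 
the same name.  None of this is a real DOM or metadata API; the names 
RECORD_A, RECORD_B, lookup and indexed_under are made up for 
illustration.

```python
# Hypothetical illustration only: the same work tagged in formats A and B.
RECORD_A = {
    "Title": "Dvorak Symphony 6, II Adagio",
    "Artist": "BBC Symphony Orchestra",
}
RECORD_B = {
    "Title": "Symphony 6",
    "Sub-Title": "II Adagio",
    "Artist": "BBC Symphony Orchestra",
    "Composer": "Dvorak, Antonin",
}


def lookup(record, tag):
    """Return the value of `tag`, or None if the record has no such tag."""
    return record.get(tag)


def indexed_under(record, name, tags):
    """True if `name` appears in any of the listed tags of `record`."""
    return any(name in (record.get(t) or "") for t in tags)


# Same question, different answers depending on the source format.
print(lookup(RECORD_A, "Title"))
print(lookup(RECORD_B, "Title"))
print(lookup(RECORD_A, "Composer"))

# An index built only from the "people" tags never sees Dvorak in file A.
print(indexed_under(RECORD_A, "Dvorak", ["Artist", "Composer"]))
print(indexed_under(RECORD_B, "Dvorak", ["Artist", "Composer"]))
```

A name-for-name mapping like this gives a script two different titles 
for the same work and loses the composer entirely for file A, which is 
the kind of thing some agreed semantic mapping would have to sort out.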



One simple case is that people with ownership rights in media will be 
very unhappy if a web page *cannot* access basic information about 
ownership (the copyright notice, for example).  It's not that it must 
be present in every file, or accessed by every page, but that every 
file should be capable of carrying the notice, and any page should be 
capable of getting it if it's there.


Other things to think about:
* is the annotation structured or simple?  So, for example, is a 
person a structured element with family name, given name, birthdate, 
and so on, or is it a string "Dvorak, Antonin"?

* are annotations temporal (possibly varying in time) or atemporal? 
Most metadata systems today treat them as atemporal ('what is the 
copyright?') but this runs into problems when e.g. media is pasted 
together, or for TV-like stations.  I am tempted to say that all 
queries should be relative to a time-point and all answers return the 
bracketing time-range over which the answer is valid (which might be 
large or even of indefinite extent):
what is the copyright at time 10?  from 5 thru 2005 it is "(C) Acme 
digital 1665".

* what about the data-type of annotations?  Most annotation systems 
today use strings, but this makes life interesting when a metadata 
item is the cover art of an album.  There really aren't great ways to 
handle this kind of typed binary data in typical DOM/scripting 
environments, as I understand it.
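
The time-relative query suggested in the second bullet above could be 
sketched roughly as follows.  This is only an illustration under my own 
assumptions: the Span class, the query function, the half-open time 
ranges and the second copyright entry are all hypothetical, and not part 
of any existing API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:
    start: float          # start of validity (inclusive)
    end: Optional[float]  # end of validity (exclusive); None = indefinite
    value: str


def query(spans: List[Span], t: float) -> Optional[Tuple[str, float, Optional[float]]]:
    """Return (value, start, end) for the span covering time t, else None."""
    for s in spans:
        if s.start <= t and (s.end is None or t < s.end):
            return s.value, s.start, s.end
    return None


copyright_track = [
    Span(5, 2005, "(C) Acme digital 1665"),
    Span(2005, None, "(C) Example Corp"),  # hypothetical, indefinite extent
]

# "What is the copyright at time 10?" The answer carries its valid range.
print(query(copyright_track, 10))
# No annotation covers time 2, so there is nothing to return.
print(query(copyright_track, 2))
```

Returning the bracketing range along with the value means a caller only 
has to ask again when playback leaves that range.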


Well, there are many more...
-- 
David Singer
Apple/QuickTime

Received on Friday, 3 October 2008 18:09:11 UTC