RE: detailed comments on the use case and reqs 1.0 draft


Of course not.

Thank you so much.




From: [] On Behalf Of Felix Sasaki
Sent: Friday, April 17, 2009 10:34 PM
To: David Singer
Subject: Re: detailed comments on the use case and reqs 1.0 draft


Hello Dave, all,

Would anybody object if I try to integrate the (non-structural) comments? That seems to be pretty straightforward.


2009/4/16 David Singer <>

Hi, after a read-through of <> I have some detailed comments...

1:  change 'needs' to 'need'..."In addition, video services on the web...need..."

3:  below the diagram, change 'both' to 'either' (unless somehow the media resource is annotated in both formats at once):  "that will return values from either the XMP or IPTC metadata"

3:  The last paragraph, the words "applications, like:" and what follows would be better phrased as "applications.  For example:" and then follow with an HTML list (bulleted, probably).

Requirement r4, I would change "custom" to "user-defined" or "non-standard", I think.

5.2: towards the end, "etc." is missing its "."

5.3:  this has the feel of having been edited a few times and the language is now a bit odd.  Can I suggest:

People nowadays are able to enjoy a large number of programs from different content providers (broadcasting companies, Internet video websites, etc.). To achieve a better user experience, reduce the user's sense of being overloaded, and hence retain users, some systems provide recommendations based on the user's history, ratings, or stated preferences. However, different content providers usually have their own specific or proprietary metadata models, which is one of the key problems faced by recommendation service providers. A common ontology spanning different metadata sets can allow recommendation systems to return a better, larger, and more relevant selection than when the metadata systems are unrelated.

Company A is an IPTV value-added service provider. One of their services is to recommend programs that users might like, based on their watching history or explicit rating of programs. In this system, users are able to watch regular TV programs with electronic program guide (EPG) format metadata, videos from sites such as YouTube with website-specific metadata, etc. In order to perform uniform and effective recommendation in the absence of a common set of vocabularies, they would need to design their own integrated media annotation model.

5.5:  the set over which "Find all" operates is not well identified; I assume it's "within a database, such as that of a search engine indexing the internet or other web-accessible content (e.g. a corporate repository, library, etc.)".

5.7:  I think there is a major use case that needs mentioning: accessibility.  There are requirements that, for users who are unable to consume time-based media in general, or some formats in particular, the media data carry annotations and links that express a summary, transcript, and so on.

6:  Requirements.  Very little is said here about the format of the returned result.  Most metadata systems are appallingly vague about the format of the stored data (often merely saying it's a string), even when the key suggests a restricted value set (e.g. "creation date").  This should probably be reflected in 6.13 requirement r13, "allow for undefined/unformatted return values for the same property" as well as the current text.
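
To make the point about vague return formats concrete, here is a minimal sketch in Python. It assumes a caller that gets back a plain string for a "creation date" property and tries to interpret it, keeping the raw string when it cannot. The names PropertyValue and read_creation_date are hypothetical and are not part of the draft or of any metadata library.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PropertyValue:
    """Hypothetical return type: the stored string plus an optional typed value."""
    raw: str                       # the value exactly as stored in the metadata
    parsed: Optional[date] = None  # set only when the string is an ISO 8601 date


def read_creation_date(raw: str) -> PropertyValue:
    """Try to read a stored 'creation date' string as an ISO 8601 calendar date.

    If the string is not in that format, return it unformatted instead of failing.
    """
    try:
        return PropertyValue(raw=raw, parsed=date.fromisoformat(raw.strip()))
    except ValueError:
        # Undefined/unformatted value: hand the string back as-is.
        return PropertyValue(raw=raw)


well_formed = read_creation_date("2009-04-17")
free_text = read_creation_date("spring 2009")
```

Returning both the raw and the parsed value is only one possible design; the draft could equally require a single normalized format.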

More structural comments:

security: we need to say we understand that user-agent access to metadata might give rise to cross-site scripting and other security issues, but that we expect those issues to be handled in the same way as the same issue for images and other embedded data.

media ID:  we probably want to say that a major thing both users and search engines might like to know is whether two pieces of media, two URLs etc. essentially are referring to the same content.  This can really only be done with a uniform media identifier system, and unfortunately there is none (despite ISRC etc.).

time varying:  some metadata is naturally time-varying (e.g. "what chapter or scene of this movie am I in?") and the cue-ranges design of HTML5 is designed to support this (e.g. flipping slides to go with a video of someone speaking).  While I realize that many major metadata systems express time-invariant metadata ('for the whole file'), it might be prudent to document how we intend to handle this. It could be by using URLs that include fragment identifiers in the query (i.e. a script could find a URL on the page and explicitly add "#t=30s" to the end), or it could be by separate arguments to the API.  Some discussion of this captured in the document would be prudent, I think.
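
As a rough illustration of the first option, the sketch below shows how a script might add a time offset to a URL as a fragment identifier before passing it to a metadata query. It is written in Python for brevity. The function name with_time_fragment and the example URL are hypothetical, and the "t=30s" form simply follows the example given above; the fragment syntax actually adopted could differ.

```python
from urllib.parse import urlsplit, urlunsplit


def with_time_fragment(url: str, start_seconds: float) -> str:
    """Return `url` with its fragment replaced by a time offset such as 't=30s'.

    Assumption: the 't=<seconds>s' form mirrors the example in the text and is
    not taken from any specification.
    """
    parts = urlsplit(url)
    fragment = f"t={start_seconds:g}s"
    # Any existing fragment is replaced; the query string is left untouched.
    return urlunsplit(parts._replace(fragment=fragment))


media_url = with_time_fragment("http://example.org/talk.ogv", 30)
```

The alternative mentioned above, passing the time as a separate argument to the API, would avoid rewriting URLs at all.
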
David Singer
Multimedia Standards, Apple Inc.


Received on Friday, 17 April 2009 14:35:26 UTC