Review of use cases and requirements from Guillaume Olivrin on 2008-11-18 (public-media-fragment@w3.org from November 2008)

From: Guillaume Olivrin <golivrin@meraka.org.za>
Date: Tue, 18 Nov 2008 09:58:24 +0200
To: Media Fragment <public-media-fragment@w3.org>
Message-Id: <1226995104.5690.45.camel@video-lab>

Here's my review of
http://www.w3.org/2008/WebVideo/Fragments/wiki/Use_Cases_%26_Requirements_Draft
(dated November 13).

Overall, the Use Cases and Requirements that have been captured are
very satisfactory and cover the range of applications well.
We may still need to vote on whether some of these use cases (such as
the Adaptation use case and its scenarios) are out of scope, or whether
we just want to reserve them for later work.

Here are some additional comments and suggestions:


Use Cases

Linking & Display of Media Fragments 

-> Linking & Positioning of Media Fragments

This use case is concerned with linkage and positioning within a media
resource.
I would say Positioning rather than Display: displaying is a User Agent
concern, whereas Positioning is more of a media state.

Display did, however, provide a notion of "context" that has not been
explored in the scenarios. What if I want a viewing window around my
Media Fragment URI's positioning point, one which loads 2 minutes
before the point I request to be positioned at and 8 minutes after?
Could there be a scenario that both refers to a point in the video
and also specifies the viewing range (context)?
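
To make the "context" idea concrete, here is a minimal Python sketch of
how a viewing window could be derived from a requested position. The
function names and the "position=...;context=..." notation are
hypothetical and only for illustration; no fragment syntax has been
agreed on. It assumes times are expressed in seconds.

```python
from typing import Optional, Tuple


def context_window(position_s: float, before_s: float, after_s: float,
                   duration_s: Optional[float] = None) -> Tuple[float, float]:
    """Return the (start, end) range, in seconds, around a requested position."""
    if min(position_s, before_s, after_s) < 0:
        raise ValueError("times must be non-negative")
    # The window cannot start before the beginning of the media.
    start = max(0.0, position_s - before_s)
    end = position_s + after_s
    # If the media duration is known, the window cannot run past its end.
    if duration_s is not None:
        end = min(end, duration_s)
    return (start, end)


def describe(position_s: float, before_s: float, after_s: float,
             duration_s: Optional[float] = None) -> str:
    """Render the point and its context in a made-up, illustrative notation."""
    start, end = context_window(position_s, before_s, after_s, duration_s)
    return f"position={position_s:g};context={start:g},{end:g}"


# Positioned at 10 minutes, with 2 minutes before and 8 minutes after.
print(describe(600, 120, 480))
```

The point and the range stay separate values here, which matches the
question above: the position is where playback starts, the context is
what gets loaded around it.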

Browsing Media Fragments

Media Pagination -> good temporal scenario; it may be hard to understand
for those not initiated to assistive technologies, but it is very
appropriate.
What about spatial pagination?
    Can Media Fragment URIs cut a big image into areas and create a
navigation scenario where each part of the image is presented, one at
a time (a new browsing functionality that's different from zooming or
scrolling)? This is very much a basic tiling functionality, such as the
one that online map services use, but without the need for a dedicated
server.
    A scenario could also involve the spatial pagination of a video to
linearise a video mosaic (when many channels are displayed at the same
time in one image, which is difficult to follow because it demands
focused attention).
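
As an illustration of spatial pagination, the following Python sketch
enumerates the rectangular "pages" of an image (or of a video mosaic
frame) in reading order. The function name and the (x, y, width, height)
pixel convention are assumptions made for this example, not part of any
agreed syntax.

```python
from typing import Iterator, Tuple

# A region is (x, y, width, height) in pixels; an assumed convention.
Region = Tuple[int, int, int, int]


def spatial_pages(image_w: int, image_h: int,
                  tile_w: int, tile_h: int) -> Iterator[Region]:
    """Yield tile regions left to right, top to bottom, clipping edge tiles."""
    if min(image_w, image_h, tile_w, tile_h) <= 0:
        raise ValueError("all dimensions must be positive")
    for y in range(0, image_h, tile_h):
        for x in range(0, image_w, tile_w):
            # Tiles on the right and bottom edges may be smaller than the rest.
            yield (x, y, min(tile_w, image_w - x), min(tile_h, image_h - y))


# A 2x2 video mosaic of 1280x720 pixels, one channel per page.
for region in spatial_pages(1280, 720, 640, 360):
    print(region)
```

Each yielded region would correspond to one spatial fragment, so a
client could step through them one at a time without a tiling server.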



Adapting Media Resources 

I think it's OK that scenarios 3 and 4 fall under Adapting Media
Resources. Although the adaptation is not a full processing operation
(i.e. a transcoding or resampling operation), it caters for the cases
where the nature of the media might be changed, e.g. from video to
audio, or from video to image. These are additional scenarios (that may
be declared out of scope) whereby, by fragmenting the media according to
tracks, we may obtain a different media type than the original one.
E.g.
    Tim selects only the audio from a video file. Would the media
fragment served be considered a video without a visual track, or could
it be adapted and considered an audio media file?
    Tim retrieves an image snapshot out of a video. This is now an image,
which can be handled very differently than a video file might be.
    Tim extracts the textual transcripts out of the subtitle tracks of a
video file. Would these remain video subtitles, or can they become TXT
MIME type?

Additional scenarios:
Scenario 5: (different from scenario 2, the image preview already exists
in the file)
Tim found a GIS image of his region online, but it is a big file. He
puts a link on his webpage but he also wants a thumbnail, so he accesses
the thumbnail that, luckily, is stored within the image format. He can
now use the fragment to show a preview of the overall file on his
webpage without any processing required.
NB: this also applies to selecting a 2D projection out of 3D image
data-cubes (FITS, DICOM), or a specific resolution out of
multi-resolution image files (JPEG2000).

Scenario 6: (reverse of scenario 3)
Sebo is Deaf and enjoys watching videos on the Web. Her friend sent her
a link to a new music video URI but she doesn't want to waste time and
bandwidth receiving the sound. So when she enters the URI in her
browser's address bar, she also adds an extra parameter that specifies
that only the visual fragment of the video should be retrieved for her.

Requirements

Media Annotations

Maybe we should add a scenario that makes it explicit that multiple
Media Fragment URIs can be used to describe moving regions (the hardest
case), for example using external schemes such as RDF or MPEG-7. (At the
moment Scenario 1 only associates 1 RDF description with 1 URI, whereas
one RDF description might describe many relations between a whole set of
Media Fragment URIs, thus describing evolving ranges.)



Media Accessibility


The introduction is in fact an argument for accessing the logical
structure of media, i.e. the media 'segments' Jack and Silvia mentioned.
Segmented media is linked to navigating such media hierarchically, by
Chapters, by Sections, etc.
This might create a different requirement altogether, one where we need
to be able to access a hierarchy of named anchors, in just the same way
we access a hierarchy of time (we can ask for HH01 or HH01MM11 and even
HH01MM11SS34). Hierarchy might not be the right name for it; rather,
these are levels of granularity of fragment anchors. I am not sure this
is desirable yet, but the requirement is there.
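
To illustrate the levels of granularity, here is a small Python sketch
that parses the HH01 / HH01MM11 / HH01MM11SS34 notation used above. It
assumes one possible reading, namely that a coarser anchor stands for
the whole interval it names (so HH01 covers the full second hour); the
function name and that interpretation are assumptions for this example.

```python
import re
from typing import Tuple

# Hours are mandatory; minutes and seconds are optional, in that order.
_ANCHOR = re.compile(r"^HH(\d{2})(?:MM(\d{2})(?:SS(\d{2}))?)?$")


def anchor_range(anchor: str) -> Tuple[int, int]:
    """Return (start, end) in seconds for a time anchor; end is exclusive."""
    match = _ANCHOR.match(anchor)
    if match is None:
        raise ValueError(f"not a recognised time anchor: {anchor!r}")
    hh, mm, ss = match.groups()
    start, span = int(hh) * 3600, 3600  # hour-level granularity
    if mm is not None:
        if int(mm) > 59:
            raise ValueError("minutes must be in 00..59")
        start, span = start + int(mm) * 60, 60  # minute-level granularity
    if ss is not None:
        if int(ss) > 59:
            raise ValueError("seconds must be in 00..59")
        start, span = start + int(ss), 1  # second-level granularity
    return (start, start + span)


for anchor in ("HH01", "HH01MM11", "HH01MM11SS34"):
    print(anchor, anchor_range(anchor))
```

A hierarchy of named anchors (Chapter, Section, ...) could be resolved
the same way, with each finer level narrowing the interval selected by
the coarser one.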

Scenario 1 is very much like Scenario 2 of Browsing Media Fragments.
They are two sides of the same coin; maybe we should add a
cross-reference.

We could create a third scenario that shows that the structure of the
media need not be external (CMML) but can also be in the media
(Matroska, VOB - Chapters, Titles, Angles).

We could also add a fourth scenario where accessibility takes the form
of choosing between various alternative tracks in a media resource:
specifying in the URI that FR subtitles are requested (if they exist),
specifying that whatever audio tracks there are should be off (this
might be different from saying I just want video, in case there are
subtitle tracks), or specifying a particular audience (when many
audiences are available).
This requirement may be linked to the Adapting Media Resources use case.
It could also be left entirely to SMIL, for example. But then, what if a
video is muxed and a SMIL presentation wants to access audio and video
separately to provide a different view of the media? Shouldn't media
fragments be the recourse?


Side Conditions



Single Fragment

        A media fragments URI should create only a single "mask" onto a
        media resource and not a collection of potentially overlapping
        fragments.

Should it also say something about not being able to select
non-continuous temporal or spatial regions? (Or does "single mask" make
it clear enough?)
Also, this supposes that Media Fragment URIs will be used to select
ranges, but we have never quite declared this to be the case. For
example the first use case, linking to a position in the media, is not
per se a fragment selection unless we want to create a context. I would
distinguish between:

Single Fragment is the notion that we will be able to do simple select
(time) and crop (space) operations on primary media.
Fragment Singularity, on the other hand, specifies that Media Fragment
URIs define positions as singular points in the media (not ranges).
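
The two notions can be told apart mechanically. The Python sketch below
is only one possible reading: it treats a temporal selection as a list
of (start, end) intervals in seconds, checks whether they amount to one
continuous span, and treats a zero-length interval as a singular point.
The function names and the interval representation are assumptions.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds, start <= end


def is_continuous(intervals: List[Interval]) -> bool:
    """True if the intervals cover one continuous span, with no gaps."""
    if not intervals:
        return False
    ordered = sorted(intervals)
    reach = ordered[0][1]  # furthest end point covered so far
    for start, end in ordered[1:]:
        if start > reach:  # a gap means a non-continuous selection
            return False
        reach = max(reach, end)
    return True


def is_point(interval: Interval) -> bool:
    """True if the interval is a singular position rather than a range."""
    return interval[0] == interval[1]


print(is_continuous([(0, 10), (5, 20)]))   # overlapping, still one span
print(is_continuous([(0, 10), (15, 20)]))  # gap between 10 and 15
print(is_point((30, 30)))                  # a position, not a range
```

Whether overlapping intervals should be accepted and merged, or rejected
outright, is exactly the kind of question the side condition would need
to settle; the sketch only checks continuity.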


End of Review.

Guillaume.
Received on Tuesday, 18 November 2008 08:17:00 UTC