Re: Review of use cases and requirements from Silvia Pfeiffer on 2008-11-18 (public-media-fragment@w3.org from November 2008)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Tue, 18 Nov 2008 22:21:01 +1100
To: "Guillaume Olivrin" <golivrin@meraka.org.za>
Cc: "Media Fragment" <public-media-fragment@w3.org>
Message-ID: <2c0e02830811180321p6e5b59bfmbe36a41841cca717@mail.gmail.com>

Hi Guillaume, all,

On Tue, Nov 18, 2008 at 6:58 PM, Guillaume Olivrin
<golivrin@meraka.org.za> wrote:
> Globally, the Use Cases and Requirements that have been captured are very
> satisfactory and give a good coverage of the application ranges.
> We may still need to vote whether some of these use cases (such as the
> Adaptation use case and its scenarios) are out of scope, or if we just want
> to reserve them for later work.

I agree. We should leave all the use cases there, but specify for each
one whether it is in or out of scope.


> Here some additional comments and suggestions:
>
> Use Cases
>
> Linking & Display of Media Fragments
> -> Linking & Positioning of Media Fragments
>
> This use case is concerned with linkage and positioning within a media
> resource.
> I would say Positioning rather than Display. Displaying is a User Agent
> concern however Positioning is more of a media state.

I think I understand where you're coming from: it's about moving the
decoding position/region in the media resource to a particular
fragment. "Display" is just one thing you can do once you have moved
the decoding position/region.

However, I really don't like the word "positioning". It has a
political connotation for me - a semantic one. And even if I think
about it in technical terms, a media fragment position doesn't really
sound right.

OTOH, I actually really like "display". It actually tells me what a
user *does* with the video. It's about *use* cases, right? So, we
should actually mention what the users *do* with the media fragments.
If possible, I'd like to keep this title.


> Display did however provide a notion of "context" that has not been explored
> in the scenarios. What if I want a viewing window around my current Media
> Fragment URIs positioning point which loads 2 minutes before the point I
> request to be positioned at, and 8 minutes after? Would there be a scenario
> that can both refer to a point in the video and also precise the viewing
> range (context)?

Already in Scenario 1 I say it relates to "clips". A clip is indeed
not a time point, but a time range. Is that what you were worried
about? Start- and end-time?

If however you are worried about being able to "rewind" and "fast
forward" beyond a selected media fragment range, that is another
matter. We could add another use case for that if people are worried.
To me it's just another URI that is being sent to the server (linking)
and that specifies a new retrieval range for display.
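A rough sketch of the "viewing window" idea as just another range
request: given a requested point plus some context before and after it,
compute the range a client could ask for. The function name and the
use of seconds are assumptions for illustration only; nothing in this
thread fixes a syntax or an API.

```python
from typing import Tuple


def context_window(point: float, before: float, after: float,
                   duration: float) -> Tuple[float, float]:
    """Return a (start, end) range in seconds around a requested point.

    Hypothetical helper: widens a single position into a retrieval range
    and clamps it to the bounds of the resource.
    """
    if not 0 <= point <= duration:
        raise ValueError("point lies outside the resource")
    if before < 0 or after < 0:
        raise ValueError("context lengths must be non-negative")
    start = max(0.0, point - before)
    end = min(duration, point + after)
    return (start, end)
```

For the example in the quoted question (2 minutes before, 8 minutes
after), a point at 10:00 in a one-hour video would give the range
08:00 to 18:00, which can then be requested like any other clip.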


> Browsing Media Fragments
>
> Media Pagination -> good temporal scenario, maybe be hard to understand for
> non initiative to assistive technologies but very appropriate.
> What about spatial pagination?
>     Can Media Fragment URIs cut a big image into areas and create a
> navigation scenario where each parts of the image are presented, one at a
> time (a new browsing functionality that's different from zooming or
> scrolling). This is very much a basic tiling functionality such as the ones
> that online map services use but without the need for a dedicated server.
>     A scenario could also involve the spatial pagination of a video to
> linearise a video mozaic ( when many channels are displayed at the same time
> in one image but which is difficult to follow, focalized attention)

I like this idea. Feel free to add a scenario to the wiki for it!


> Adapting Media Resources
>
> I think it's ok that scenarios 3 and 4 fall under Adapting Media Resources.
> Although the adaptation is not a full processing operation ( i.e.
> transcoding or resampling operations ) it caters for the cases where the
> nature of the media might be changed, e.g. from video to audio, from video
> to image. These are additional scenarios (that may be declared out of scope)
> where by, by fragmenting the media according to tracks, we may obtain a
> different media type than the original one.
> E.g.
>     Tim only select audio from a video file. Would the media fragment served
> be considered a video without visual track, or could it become adapted and
> considered and audio media file?

Very interesting question. I guess it would turn into an audio
resource, but would need to know that it stemmed from a video
resource, and maybe even where the video resource is located.

>     Tim retrieves an image snapshot out of video. This is now an image which
> can handled very differently than a video file might be.

It's indeed a thumbnail.

>     Tim extract the textual transcripts out of the subtitle tracks of a
> video file, would these remain video subtitles or can they become TXT mime
> type?

Depends if they are written to a text file or continue to be
encapsulated inside a media resource container. I would hope that if
one asked only for a text codec, one would receive a text file.

> Additional scenarios:
> Scenario 5: (different from scenario 2, the image preview already exists in
> the file)
> Tim found a GIS image of his region online, but it is a big file. He puts a
> link on his webpage but he also wants a thumbnail, so he accesses the
> thumbnail that, luckily, is stored within the image format.

I think this should read: can be retrieved from the video format

> He can now use
> the fragment to show a preview of the overall file on his webpage without
> any processing required.

I thought scenario 1 covered this use case, since I mentioned it
would be possible to just extract thumbnails. I'm happy to add it as
an explicit extra use case though, if you prefer.

> NB: also for select 2D projection out of 3D image data-cubes (FITS, Dicom),
> or a specific resolution out of multi-resolution image files (JPEG2000).


> Scenario 6: (reverse of scenario 3)
> Sebo is Deaf and enjoys watching videos on the Web. Her friend sent her a
> link to a new music video URI but she doesn't want to waste time and
> bandwidth receiving the sound. So when she enters the URI in her browser's
> address bar, she also adds an extra parameter that precise to only retrieve
> the visual fragment of the video for her.

Cool! Feel free to add it! :-)


> Requirements
>
> Media Annotations
>
> Maybe we should add a scenario that makes it explicit that multiple Media
> Fragment URIs can be used to described moving regions (the hardest case) for
> example using external schemes such as RDF or MPEG-7. (At the moment
> Scenario 1 only associates 1 RDF description to 1 URI, whereas on RDF might
> describes many relations between a whole set of Media Fragment URIs thus
> describing evolving ranges).

OK, fair enough. Feel free to add it.


> Media Accessibility
>
> The introduction is in fact an argument for accessing the logical structure
> of media, i.e. the media 'segments' Jack and Silvia mentioned.
> Segmented media is linked to navigating such media hierarchically by
> Chapters, by Sections (etc..).
> This might create a different requirement altogether, one where we need to
> be able to access a hierarchy of named anchors, just the same way we access
> a hierarchy of time (we can ask for HH01 or HH01MM11 and even HH01MM11SS34).
> Hierarchy might not be the right way to call it, rather levels of
> granularity of fragment anchors. I am not sure this is desirable yet, but
> the requirement is there.
>
> Scenario 1 is very much like Scenario 2 of Browsing Media Fragments. There
> are two faces of the same coin, maybe we should make a cross-reference.

Yes, they are. I could have added Scenario 1 there, but I preferred to
use it here to make a point that if we want browsing, we need media
that has a logical structure.


> We could create a third scenario that shows that the structure of the media
> need not be external (CMML) but can also be in the media (Matroska, VOB -
> Chapters, Titles, Angles).

CMML is used both inside and outside of media, which is why I used it
as an example. :-)


> We could also add a fourth scenario where accessibility takes the form of
> choosing between various alternatives tracks in a media : precising in the
> URI that FR subtitles are requested (if exists), precising that whatever
> audio tracks there are should be off (this might be different than saying I
> just want video, in case there are subtitle tracks), precising that a
> specific audience  (when many audiences are available). This requirement may
> be linked to the Adapting Media Resources use case. It could also be left
> entirely to SMIL for example. But then, what if a video is muxed and a SMIL
> presentation want to access audio and video separately to provide a
> different view of the media, shouldn't media fragment be the recourse?

I thought it was covered by the Adapting Media Resources use case. If
you feel something is missing, please add another scenario there.


> Side Conditions
>
> Single Fragment
>
> A media fragments URI should create only a single "mask" onto a media
> resource and not a collection of potentially overlapping fragments.
>
> Should it also say something about not being able to select non-continuous
> temporal or spatial regions? (or does single mask makes it clear enough?)

Good question. Did we decide at the F2F to allow requesting only a
single logical region? I cannot remember. Either may end up in
multiple byte ranges, so that should not be a problem. What I tried to
avoid is "byte overlap".
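To illustrate what avoiding "byte overlap" could look like, here is a
minimal sketch that merges overlapping or adjacent byte ranges before
they are requested. It assumes ranges are inclusive (first, last) byte
pairs, as in HTTP Range requests; the function name is hypothetical and
this is not something the group has agreed on.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]  # inclusive (first_byte, last_byte)


def merge_byte_ranges(ranges: List[ByteRange]) -> List[ByteRange]:
    """Merge overlapping or adjacent byte ranges so no byte is sent twice."""
    merged: List[ByteRange] = []
    for first, last in sorted(ranges):
        if first > last:
            raise ValueError("range start must not exceed range end")
        if merged and first <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            prev_first, prev_last = merged[-1]
            merged[-1] = (prev_first, max(prev_last, last))
        else:
            merged.append((first, last))
    return merged
```

A single logical region may still come out as several disjoint byte
ranges; the merge only guarantees that those ranges do not overlap.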


> Also this supposes that Media Fragment URIs will be used to select ranges,
> but we have never quite declared this to be the case. For example the first
> use case, linking to a position in the media, is not per se a fragment
> selection unless we want to create a context.

For any logical use of the media resource after linking to a position
in the media, it makes sense to specify it as a region that starts at
the linked point and ends when the resource ends. After all, you want
some data and not just a link into the data, right?
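A small sketch of that reading, under the assumption that a fragment is
described by a start time and an optional end time in seconds (the
function and parameter names are made up for illustration): a bare
position is simply treated as the range from that point to the end of
the resource.

```python
from typing import Optional, Tuple


def position_to_range(start: float, end: Optional[float],
                      duration: float) -> Tuple[float, float]:
    """Turn a linked position into a (start, end) range in seconds.

    If no end is given, the range runs to the end of the resource.
    """
    if not 0 <= start <= duration:
        raise ValueError("start lies outside the resource")
    if end is None:
        end = duration  # position only: region ends when the resource ends
    if not start <= end <= duration:
        raise ValueError("end must lie between start and the duration")
    return (start, end)
```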

> I would distinguish between :
>
> Single fragment is the notion that we will be able to do simple select
> (time) and crop (space) operations on primary media.
> Fragment Singularity on the other hand precises that Media Fragment URIs
> define positions as singular points in the media (not ranges).

Is fragment singularity of any practical use?


Great feedback, thanks!

Cheers,
Silvia.
Received on Tuesday, 18 November 2008 11:21:36 UTC