Re: Review: Use Case & Requirements Draft

Hi all,

I should probably give some comments on Davy's review. Sorry for the delay.

On Thu, Dec 4, 2008 at 7:12 PM, Davy Van Deursen
<davy.vandeursen@ugent.be> wrote:
>
> Hi all,
>
> please find below our review for the Use cases and requirements document
> (ACTION-14). It is based on the discussion regarding Raphaël's review of
> this document.
>
>>-----Original Message-----
>>From: public-media-fragment-request@w3.org [mailto:public-media-
>>fragment-request@w3.org] On Behalf Of Silvia Pfeiffer
>>Sent: Saturday, November 08, 2008 4:35 AM
>>To: Raphaël Troncy
>>Cc: Media Fragment
>>Subject: Re: Review: Use Case & Requirements Draft
>>
> [SNIP]
>>> * Section 1.1:
>
>>>  - (scenario 2): I'm not sure we want to say that only the region of
>>> an image should be displayed. What about saying: "Tim wants the
>>> region of the photo highlighted and the rest grey scaled"?
>>
>>If we get the rest of the image in grey scale, then we have to receive
>>the rest of the data. This is not a fragment then. In this case, you
>>would receive all the image and do some transformation on some parts.
>>That is not what media fragments would be for IMO.
>
> IMO, the UC 'Display of media fragments' is not equal to
> 'Delivery/Adaptation of media fragments'. In other words, media fragment
> URIs could be used by the user agent to highlight (i.e., point out, display)
> the media fragment within the parent media resource. Note that no
> transformation is needed on the server side. The server sends the full
> media resource, the user agent displays the full media resource,
> interprets fragment URIs, and 'highlights' them, for example with a
> rectangular outline. Further,
> annotations could be displayed next to the highlighted fragments.

I agree with the notion of having server-side fragments and
client-side "overlay" fragments as a use case. However, I don't think
such client-side image filters should be represented through a URI. I
think they fall in the same class as image maps.
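
To make the client-side "overlay" interpretation a bit more concrete,
here is a rough Python sketch. The "#xywh=x,y,w,h" syntax and the
function name are placeholders I made up for illustration - we have not
agreed on any fragment syntax yet:

  # Parse a hypothetical spatial fragment and return the rectangle the
  # user agent would highlight; the server still delivers the full image.
  def highlight_rectangle(uri):
      if "#" not in uri:
          return None
      fragment = uri.split("#", 1)[1]      # e.g. "xywh=160,120,320,240"
      if not fragment.startswith("xywh="):
          return None
      x, y, w, h = (int(v) for v in fragment[len("xywh="):].split(","))
      return (x, y, w, h)

  print(highlight_rectangle("http://example.com/photo.jpg#xywh=160,120,320,240"))
  # -> (160, 120, 320, 240)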


>>> * Section 1.4:
>>>  - This use case is again for me an application of 1.1, that is this
>>> time linking for recomposing (making a playlist)
>>
>>Recompositing poses very different challenges to an application than
>>mere playback does. I can see where you are coming from - for you it is
>>just a piece of content delivered to a user agent and you don't really
>>care what the user agent does with it. However, I am looking at use
>>cases from the user's POV. A user would not regard watching a video as
>>the same use case as editing clips together. I am just wary that we
>>might overlook some things if we throw these use cases together too much.
>
> Note that there is another issue with the recomposition of media fragments.
> For instance, consider scenario 3 where video fragments (possibly all having
> different parent media resources) are put together. Since different parent
> resources imply potentially different underlying media formats, things
> become complicated if we expect a smooth playback of a mashup of media
> fragments. This is because different underlying media formats require the
> (re-)initialization of the decoder(s) during playback of the mashup.
> Therefore, I think we should make clear that in this scenario, we do not
> expect smooth transitions between the different media fragments in the
> playlist, because otherwise, this use case is far from trivial to implement.

I agree. I've added a sentence to this effect to the wiki page.
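
To make that concrete, here is a rough Python sketch of a naive player
for such a playlist. The "#t=start,end" notation and the codec names
are made up for illustration only, not an agreed syntax:

  # Fragments from different parent resources may use different codecs,
  # so the decoder has to be (re-)initialised between them - hence no
  # smooth transition can be guaranteed.
  playlist = [
      ("http://example.org/a.ogv#t=10,20", "theora"),
      ("http://example.org/b.mp4#t=0,5",   "h264"),
      ("http://example.org/a.ogv#t=40,50", "theora"),
  ]

  current_codec = None
  for uri, codec in playlist:
      if codec != current_codec:
          print("re-initialising decoder for " + codec)  # visible gap here
          current_codec = codec
      print("playing " + uri)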


>>> * Section 1.6:
>>>  - (scenario 1): I find it out-of-scope. I think it is worth keeping
>>> it in the document, but saying that this is where we think it is out
>>> of scope ... if everybody agrees :-)
>>
>>I tend to agree. However, I would like to qualify why we think it's
>>out of scope. I think what should be in scope for fragments is where
>>we create what we called at the F2F "croppings" of the original file,
>>i.e. where we take the original file, make a selection of bytes, and
>>return these to the user agent. No transformations or recoding is done
>>on the original data (apart from potentially changing a file header).
>>This however means that as soon as we have a codec that is adaptable,
>>i.e. where a lower framerate version or a smaller image resolution can
>>be created without having to re-encode the content, we may have to
>>consider these use cases as being part of media fragmentation.
>>
>>Maybe what we can agree on is that this is potential for future work
>>as such codecs evolve and become more common. It is not our general
>>use case right now though.
>>
>>What do ppl think?
>
> We think that this is indeed out-of-scope. I always call these two kinds of
> adaptations 'structural' and 'semantic' adaptations. Structural adaptations
> (such as frame rate scaling, resolution scaling, bit rate reduction) do not
> change the semantics of the media resource. They only lower the quality of
> the media resource along a particular axis (i.e., temporal, spatial, SNR,
> colour, ...). Semantic adaptations do change the semantics of the media
> resource by cropping along a particular axis (i.e., temporal, spatial,
> track, ...), but they do not influence the quality of the resulting content.
> IMO, only semantic adaptations result in 'fragments' of the original media
> resource. Structural adaptations result in (lower quality) 'versions' of the
> original media resource. Furthermore, I think structural adaptation
> parameters do not belong in a URI scheme. These kinds of adaptations are
> typically performed on the server side or in network adaptation nodes
> based on the
> usage environment of the user agent (e.g., network conditions, end-user
> device characteristics, ...).

I agree. Do people like the semantic/structural nomenclature? If so,
we should probably include it in our terminology (glossary) and use it
here to explain that this use case is out of scope and why. For
now, I have added a single sentence simply stating that this use case
is out of scope.
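
Just to pin the distinction down with a toy example (the operation names
below are mine, purely for illustration):

  # Semantic adaptations crop along an axis and yield 'fragments';
  # structural adaptations lower quality and yield 'versions'.
  SEMANTIC   = {"temporal crop", "spatial crop", "track selection"}
  STRUCTURAL = {"frame rate scaling", "resolution scaling", "bit rate reduction"}

  def is_fragment(operation):
      return operation in SEMANTIC

  print(is_fragment("temporal crop"))       # True  -> in scope for fragment URIs
  print(is_fragment("bit rate reduction"))  # False -> a lower-quality version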


>>>  - (scenario 2): I find this scenario also really borderline /
>>> out-of-scope. As was pointed out during the face-to-face meeting in
>>> Cannes, interactivity seems to be the most important aspect of the
>>> map use cases (reflected by zooming in/out, panning over, etc.) and
>>> I guess we don't want that in our URI scheme. Do we?
>>
>>I included this because we had a map use case. I'd be quite happy to
>>decide this to be out of scope, but wanted to give the proponents of
>>that use case a chance to speak up.
>
> Only the panning is in-scope I think. We should keep a use case such that,
> similar to the temporal (scen. 1) and track axis (scen. 3 & 4), we have a
> use case regarding adapting a media resource to obtain a media fragment
> along the spatial axis.

Panning is a temporal-spatial rectangular fragment? Zooming can also be
done through a rectangular spatial fragment, I would have thought...
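
To spell out what I mean (all values invented, just for illustration):

  # Zooming is just a smaller rectangle of the frame; panning is the
  # same-sized rectangle moving across the frame over time.
  full_frame = (0, 0, 640, 480)            # x, y, width, height
  zoomed_in  = (160, 120, 320, 240)
  panned     = [(0,   120, 320, 240),
                (80,  120, 320, 240),
                (160, 120, 320, 240)]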


>>> * Section 2:
>>>
>>> I have a hard time understanding what you mean by these technology
>>> requirements. I understand the need for enabling other Web
>>> technologies to satisfy their use cases, but I'm not sure this is
>>> strong enough to warrant a separate headline. Actually, I can easily
>>> see all the subsections merged with the existing use cases, see below.
>>> Therefore, I would suggest removing section 2.
>>
>>So, section 2 looks at media fragments from a technology POV, not from
>>a user POV. Yes, most of these technology requirements are expressed in
>>the use cases above. However, they are not expressed as such. It is a
>>different dimension of describing the needs that we have.
>>
>>I have tried to write an introduction to this section. I believe it is
>>important to explicitly spell out these dimensions so that people from
>>these areas, who have a large interest in media fragment URIs,
>>understand that they are being catered for.
>
> I tend to agree with Raphaël. I still cannot see why for example sect. 2.1
> (which introduces named fragments) is not a part of sect. 1.5 (i.e.,
> Annotating media fragments). What do other people think regarding Sect. 2?

OK - I'm finding it hard to defend the separation of section 2,
because the two sections really came from different target groups -
one being users and the other technologists. But since that doesn't
really matter, I am happy to merge them together.

As for 2.1 specifically: the difference between 2.1 (named fragments)
and 1.5 (media annotations) is that the first is about giving the
video a structure (a segmentation) and each of the segments a name to
address them with - while the second is about providing time-aligned
annotations for a media file. The focus in the former is structure,
the focus in the latter is metadata.
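
A quick sketch of that difference with made-up data, just to illustrate:

  # 2.1: named fragments give the video an addressable structure ...
  structure = {
      "opening-titles": (0, 30),       # segment name -> (start, end) in seconds
      "interview":      (30, 540),
      "credits":        (540, 600),
  }

  # ... while 1.5 attaches time-aligned annotations (metadata) to the timeline.
  annotations = [
      (45, 120, "speaker introduces the topic"),
      (130, 200, "discussion of media fragment URIs"),
  ]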


> Further, the term 'side-conditions' does not reflect the covered
> requirements because they are more than 'side'-requirements IMO. Therefore,
> we propose to change the titles of sect. 1 and 3 to 'Functional requirements
> (Application Use Cases)' and 'Non-functional requirements'.

I guess section 3 contains agreed design principles for the solution.
Does anyone have a better name to head that section with?
Also, I don't think we need to change the "use cases" title - that's
pretty accurate.


>>> * Section 3.8:
>>>  - I find this requirement very strong and I feel we are still
>>> discussing the issue. Perhaps we can phrase that as: "we should avoid
>>> decoding and recompressing media resources"?
>>
>>I'd like to have this discussion and come to a clear conclusion,
>>because it will make things a lot more complicated if we allow
>>recompression. Davy and I have discussed this thoroughly. Can ppl
>>express their opinions on this? In fact, is anyone for allowing
>>recompression (i.e. transcoding) in the media fragment URI addressing
>>process?
>
> I think we cannot make the statement that transcoding is not allowed to
> obtain media fragments. Of course, it is preferable to avoid transcoding.
> Therefore, I propose to remove sect. 3.8 and add to sect. 3.9 that one
> aspect of minimizing the impact on existing infrastructure is to avoid
> transcoding of media resources to obtain media fragments. Further, on sect.
> 3.9, I agree that we should minimize the impact on existing infrastructure,
> but this may not be an ultimate goal if this results in too much loss of
> functionality (e.g., consider specialized video caches).

It seems our discussion yesterday concluded with the statement that we
allow transcoding only where it's lossless. While I may personally
disagree, there is agreement in the group and we should probably
change this statement. Could somebody who feels passionate about it
take a shot at reformulating it?


Cheers,
Silvia.
