Re: updates to requirements document

On 7/11/2012 10:03 AM, Young, Milan wrote:
>
> As a reminder, I’m proposing that we add the following requirement:
> “The UA must allow the Application to access both an encoded
> representation of the media and associated control information needed
> for decoding while capture is in progress.”
>
> Any objections?
>


This raises a host of questions: what encoding? How is that specified? Where is
it encoded? How is the encoder controlled? Bit rate? Congestion control
if it goes over a wire? Typically this goes into PeerConnection and is
encoded in some manner, but the app doesn't have access to the
bytestreams. This proposal begs for the decomposition of codecs and
encoding from PeerConnection, which would be a significant architectural
change. See also Harald's 'rant' about bytestreams sometime last year
on the webrtc w3 list (I think).

The "translation" case could be handled by either asking PeerConnection
for a high-reliability connection (TCP, or FEC at the cost of
bandwidth), or long re-transmit buffers and have the translation
receiver use NACKs to repair errors. This (if there was some way to get
it encoded) would allow other methods of shipping the audio for
translation (WebSockets for example).
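
As a strawman, here is roughly what that WebSockets path might look
like from the app side, assuming a hypothetical 'onencodeddata' hook
that surfaces encoded chunks from an in-progress capture. No such hook
exists today; it is precisely the access Milan's proposed requirement
would oblige the UA to provide:

  // Strawman only: 'onencodeddata' is invented for illustration.
  var ws = new WebSocket('wss://translator.example.com/audio');
  ws.binaryType = 'arraybuffer';

  navigator.getUserMedia({audio: true}, function (stream) {
    stream.onencodeddata = function (event) {
      // event.data: ArrayBuffer of encoded media plus whatever
      // control information is needed to decode it on the far side.
      if (ws.readyState === WebSocket.OPEN)
        ws.send(event.data);
    };
  }, function (error) {
    console.error('getUserMedia failed:', error);
  });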

And access to the encoded or decoded media itself is a potential
security issue. See the "MediaStream Security" presentation from the
W3C interim meeting in Mountain View this past February. That may or may
not be relevant here; I haven't thought it through. What's the trust
model? We have one for WebRTC.

And... Defining the associated control information needed for decoding
is a significant task, especially as it would need to be codec-agnostic.
(Which, from the conversation, I think you realize.) This is also an API
that I believe we at Mozilla (or some of us) disagree with (though I'm
not the person primarily following this; I think Robert O'Callahan and
Tim Terriberry are).

> *From:* Young, Milan [mailto:Milan.Young@nuance.com]
> *Sent:* Friday, July 06, 2012 2:02 PM
> *To:* Travis Leithead; Jim Barnett; Sunyang (Eric);
> public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> The Media Source spec uses the term “Byte Stream” [1] to denote
> the sequence of Initialization and Media Segments that you mention
> below. (Essentially a container format around the raw media.) But yes,
> we are thinking in the same direction, and I agree that the exact
> content of that stream should remain implementation- and task-dependent.
>
> Returning to the topic at hand, we need to define requirements so that
> this group can address the documented use cases. At present, “capture
> audio for a translation site” is dangling.
>
> To this end, we still need to add a new requirement that the capture
> process exposes the bit stream. I suggest the following (adjustments
> since the last iteration in bold): “The UA must allow the Application
> to access *both* an encoded representation of the media *and
> associated control information needed for decoding* while capture is
> in progress.”
>
> Thanks
>
> [1]
> http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html#byte-stream-formats
>
> *From:* Travis Leithead [mailto:travis.leithead@microsoft.com]
> *Sent:* Friday, July 06, 2012 10:16 AM
> *To:* Young, Milan; Jim Barnett; Sunyang (Eric);
> public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> Sounds right to me too.
>
> Off topic:
>
> Based on my reading of MediaSource, in order to interop nicely with
> that API, direct access to the capture stream (while capture is
> ongoing) basically involves making two types of byte sequences
> available (as Uint8Array): /initialization segments/ and /media segments/
> (http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html#init-segment).
> I don’t even think the capture spec itself would need to define the
> details of those, possibly just identifiers that would allow JS to
> interpret what the underlying format is.
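>
> To make that concrete, here is a rough sketch of the consuming side,
> where 'capture' stands in for whatever hypothetical object hands the
> app those two segment types (all such names are invented for
> illustration):
>
>   var mediaSource = new MediaSource();
>   var video = document.querySelector('video');
>   video.src = URL.createObjectURL(mediaSource);
>
>   mediaSource.addEventListener('sourceopen', function () {
>     // The MIME type is a guess; per the above, the capture API would
>     // presumably expose an identifier naming the underlying format.
>     var buf = mediaSource.addSourceBuffer('video/webm; codecs="vp8, vorbis"');
>     buf.append(capture.initializationSegment);    // Uint8Array
>     capture.onmediasegment = function (segment) {
>       buf.append(segment);                        // Uint8Array
>     };
>     // append() as in the draft cited above; later drafts rename it
>     // appendBuffer().
>   });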
>
> *From:* Young, Milan [mailto:Milan.Young@nuance.com]
> *Sent:* Friday, July 6, 2012 8:03 AM
> *To:* Jim Barnett; Sunyang (Eric); public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> Thanks Jim. That sounds right to me.
>
> *From:* Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
> *Sent:* Friday, July 06, 2012 6:02 AM
> *To:* Sunyang (Eric); Young, Milan; public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> To summarize the discussion so far: it sounds like we agree that the
> App will sometimes need direct access to the capture stream, and at
> other times will want the capture streamed directly to a file or some
> other sink. The App may also want to combine the two (stream to a file
> while also directly accessing the capture.) The main question is how
> much we need to define in our spec as opposed to pointing to other
> pre-existing specs. Does that sound right to you, Eric and Milan?
>
> -Jim
>
> *From:* Sunyang (Eric) [mailto:eric.sun@huawei.com]
> *Sent:* Friday, July 06, 2012 2:18 AM
> *To:* Young, Milan; Jim Barnett; public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> For “the ability to view the media stream in its encoded form”, I
> think we’d better reference the Media Source API:
>
> http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html
>
> The second paragraph you mentioned is more in the style of Media
> Source; you’re welcome to bring it to the html-media task force for discussion.
>
> Yang
>
> Huawei
>
> *From:* Young, Milan [mailto:Milan.Young@nuance.com]
> *Sent:* Friday, July 06, 2012 2:02 PM
> *To:* Sunyang (Eric); Jim Barnett; public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> I believe that there are several use cases that do not require
> exposing the bits of the media stream to the Application layer. So I
> think it’s going too far to say that transport is always the burden of
> the Application. I think a better way to phrase that is to say that
> the Application should always have the ability to view the media
> stream in its encoded form.
>
> The tricky part is defining a canonical media form. Will it be based
> on sample intervals, fixed block sizes, logical compression
> boundaries, …? I don’t have a strong opinion, but I suspect fixed-size
> blocks of data (i.e., N bytes at a time, regardless of what those
> bytes represent) will be easiest to spec and most useful to the
> largest range of use cases.
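>
> A minimal sketch of that fixed-block framing (all names invented;
> push() would be fed encoded bytes by whatever access mechanism this
> group ends up defining):
>
>   // Accumulate encoded bytes and emit exactly blockSize bytes at a
>   // time, regardless of what those bytes represent.
>   function makeBlocker(blockSize, onBlock) {
>     var pending = new Uint8Array(0);
>     return function push(chunk) {  // chunk: Uint8Array
>       var merged = new Uint8Array(pending.length + chunk.length);
>       merged.set(pending, 0);
>       merged.set(chunk, pending.length);
>       var offset = 0;
>       while (merged.length - offset >= blockSize) {
>         onBlock(merged.subarray(offset, offset + blockSize));
>         offset += blockSize;
>       }
>       pending = merged.subarray(offset);
>     };
>   }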
>
> Thanks
>
> *From:* Sunyang (Eric) [mailto:eric.sun@huawei.com]
> *Sent:* Thursday, July 05, 2012 7:50 PM
> *To:* Jim Barnett; Young, Milan; public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> I wonder how clear the division should be.
>
> I suggest we not touch the transport/upload part of the use cases,
> but that we remove from the requirements any Application
> responsibilities that are not related to capture/permission. I think
> this is feasible and easy to improve on later.
>
> Yang
>
> Huawei
>
> *From:* Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
> *Sent:* Friday, July 06, 2012 8:41 AM
> *To:* Young, Milan; public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> That sounds reasonable to me. I take you to be saying that
> transport/uploading are the Application’s responsibility, and that the
> only requirement on the UA is that it make the encoded representation
> available. That gives a clear division of responsibilities. Are there
> other opinions? (My guess is that many of the requirements are worded
> incorrectly.)
>
> -Jim
>
> *From:* Young, Milan [mailto:Milan.Young@nuance.com]
> *Sent:* Thursday, July 05, 2012 7:37 PM
> *To:* Jim Barnett; public-media-capture@w3.org
> *Subject:* RE: updates to requirements document
>
> Hello Jim, thanks for putting this together.
>
> The 1st requirement under REMOTE MEDIA currently states: “The UA must
> be able to transmit media to one or more remote sites and to receive
> media from them.” My concern is that the language is insufficient to
> handle all of the scenarios put forward in the section titled
> “Capturing a media stream” under “Design Considerations and Remarks”.
> These are:
>
> 1) capture a video and upload it to a video sharing site
>
> 2) capture a picture for my user profile in a given web app
>
> 3) capture audio for a translation site
>
> 4) capture a video chat/conference
>
> The first two transfer types would typically be handled as a bulk
> transfer after capture completes, which is a good fit for conventional
> transports like HTTP. The fourth type is an obvious match to WebRTC.
> The third type is a mix of the two. The application prefers real time
> transmission, but is probably willing to sacrifice a few seconds of
> latency in the interest of reliable transport. Something like an
> application-specific streaming protocol over WebSockets seems appropriate.
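>
> For the first two types, the bulk path could be as simple as the
> following sketch (chunks is assumed to hold the encoded data
> collected during capture via whatever access mechanism this
> requirement produces, and the container type is a guess):
>
>   function uploadRecording(chunks, url) {
>     var blob = new Blob(chunks, { type: 'video/webm' });
>     var xhr = new XMLHttpRequest();
>     xhr.open('POST', url);
>     xhr.send(blob);  // XHR2 accepts a Blob body
>   }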
>
> My request could be satisfied with the following new requirement: “The
> UA must allow the Application to access an encoded representation of
> the media while capture is in progress.” Implicit in this request is
> that the UA will not always explicitly handle media transfer, but I
> think that could be inferred from the other requirements.
>
> Does this sound reasonable?
>
> Thanks
>
> *From:* Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
> *Sent:* Tuesday, July 03, 2012 6:36 AM
> *To:* public-media-capture@w3.org
> *Subject:* updates to requirements document
>
> I have filled out the requirements section in the use case document
> (http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html)
> and added links from the scenarios to the requirements. I have not
> modified any existing content or taken anything out of the document.
>
> There’s still more work to do:
>
> 1) There are some free-floating requirements that were suggested on
> the list but not incorporated into any of the scenarios. Do we want to
> incorporate them into the scenarios or leave them as is?
>
> 2) The scenarios contain lists of items that are similar to the
> requirements. Do we want to remove them, or leave them in and modify
> them to match the requirements more closely?
>
> 3) I have organized the requirements into four classes: permissions,
> local media, remote media, and media capture. Maybe it would be better
> to have a different classification or a single list.
>
> Let me know what you think.
>
> -Jim
>



-- 
Randell Jesup
randell-ietf@jesup.org
