- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Wed, 14 Nov 2012 15:31:28 +0100
- To: public-media-capture@w3.org
On 11/14/2012 01:01 PM, Adam Bergkvist wrote:
> Thanks for taking the time to think about this, Martin.
>
> On 2012-11-13 23:33, Martin Thomson wrote:
>> I've been doing some thinking about this problem and I think that I
>> agree with Harald in many respects. The interaction between the
>> different instances becomes unclear. At least with a composition style
>> API, this would be clearer.
>
> I'm not married to the inheritance model at all. My intention was to
> align with the proposal for controlling local devices.
> OutboundVideoTrack exposes API surface related to sending video over
> the network similar to how VideoDeviceTrack (from the settings
> proposal) exposes API surface to control a local video device. I'm
> open to other approaches.
>
> (Thinking out loud here) I recall a previous version (v3) of the
> settings proposal [set-v3] that used an approach where a VideoDevice
> (which exposed the settings API surface) had a reference to the track
> it controlled instead of being a new derived track type. Aligning
> inbound/outbound streams to that would look something like:
>
> * : composition
> +- : inheritance
hm. I think I'm not parsing.
I use the term "composition" in terms of "bringing multiple dissimilar
things together inside an object".
Since you're using * below between track and stream, it looks as if
you're also using it for containment - bringing multiple things of the
same type together inside an object.
>
> AbstractMediaStream
> |
> +- LocalMediaStream
> |    * AudioDevice
> |        * Track
> |    * VideoDevice
> |        * Track
This doesn't compute, since devices can be on multiple tracks.
> |
> ...
> |
> +- PeerConnectionMediaStream *new*
> |    * MediaStreamTransportList (audioTransports)
> |        * OutboundAudioTransport...
> |            * Track
> |    * MediaStreamTransportList (videoTransports)
> |        * OutboundVideoTransport...
> |            * Track
This doesn't compute, since transports are not hierarchical with
mediastreams.
Alternate, to take the other extreme:
MediaStream contains (composition)
    TrackContainer (containment) (one or two)
        MediaStreamTrack contains (composition)
            DeviceReference? (present when linked to a device source)
            OutgoingTransportReference? (present when linked to a PC's
                outgoing stream list)
            IncomingTransportReference? (present when linked to a PC's
                incoming stream list)
Inheritance hierarchy:
    Device
    + AudioDevice
    + VideoDevice
    + PictureDevice
Perhaps we don't need so much inheritance....
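
A rough sketch of that composition-style layout, in plain JavaScript. All class and property names here (SketchDevice, SketchTrack, SketchStream, device, outgoingTransport, incomingTransport) are hypothetical and not part of any proposed or shipping API; the only point is that a track holds optional references, so one device can sit behind several tracks.

```javascript
// Hypothetical model only; none of these classes are browser APIs.
class SketchDevice {
  constructor(kind, label) {
    this.kind = kind;   // "audio", "video" or "picture"
    this.label = label;
  }
}

class SketchTrack {
  constructor(id) {
    this.id = id;
    this.device = null;            // set when linked to a device source
    this.outgoingTransport = null; // set when on a PC's outgoing stream list
    this.incomingTransport = null; // set when on a PC's incoming stream list
  }
}

class SketchStream {
  constructor() {
    // Containment: lists of same-typed things inside the stream.
    this.audioTracks = [];
    this.videoTracks = [];
  }
}

const frontCamera = new SketchDevice("video", "front camera");
const trackA = new SketchTrack("a");
const trackB = new SketchTrack("b");
trackA.device = frontCamera;
trackB.device = frontCamera; // one device may feed several tracks

const sketchStream = new SketchStream();
sketchStream.videoTracks.push(trackA, trackB);
```

Because the references are optional fields rather than subclasses, a track that is both device-backed and sent over a PeerConnection needs no new type.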
> |
> ...
>
> Then we could have single track instances for each media source and
> how a track is consumed depends on the Device or Transport that holds
> it and imposes settings on it.
>
> pc.addStream(localStream);
> var outboundStream = pc.localStreams.getStreamById(localStream.id);
>
> localStream.audioDevice.track === outboundStream.audioTransports[0].track
>
> would be true.
>
> [set-v3]
> http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0143.html
>
>> I have an alternative solution to Harald's underlying problems. The
>> inheritance thing turned out to be superficial only. I have come to
>> believe that the source of confusion is the lack of a distinction
>> between the source of a stream and the stream itself.
>>
>> I think that a clear elucidation of the model could be helpful, to
>> start.
>>
>> --
>>
>> Cameras and microphones are instances of media sources.
>> RTCPeerConnection is a different type of media source.
>>
>> Streams (used here as a synonym for MediaStreamTrack) represent a
>> reduction of the current operating mode of the source. For example, a
>> camera might produce a 1080p capture natively that is down-sampled to
>> produce a 720p stream. Constraints select an operating mode, settings
>> filter the resulting output to match the requested form.
>>
>> Streams are inactive unless attached to a sink. Sinks include <video>
>> and <audio> tags; RTCPeerConnection; or recording and sampling.
>>
>> Sources can produce multiple streams simultaneously. Simultaneous
>> streams require compatible camera modes. A camera that is capable of
>> operating in 16:9 or 4:3 modes might be incapable of producing streams
>> in both those aspect ratios simultaneously.
>>
>> The first stream created for a given source sets the operating mode of
>> the source. Subsequent streams can only be added if the operating mode
>> is compatible with the current mode.
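
The "first stream sets the mode" rule can be sketched as follows. This is a minimal model under stated assumptions: CameraSource and createTrack are hypothetical names, and "compatible" is simplified to "same mode string", which is stricter than what a real source might allow.

```javascript
// Hypothetical model: the first track fixes the source's operating mode.
class CameraSource {
  constructor(supportedModes) {
    this.supportedModes = supportedModes; // e.g. ["16:9", "4:3"]
    this.mode = null;                     // unset until the first track exists
    this.tracks = [];
  }

  createTrack(requestedMode) {
    if (!this.supportedModes.includes(requestedMode)) {
      throw new Error("unsupported mode: " + requestedMode);
    }
    if (this.mode === null) {
      this.mode = requestedMode; // first track selects the operating mode
    } else if (this.mode !== requestedMode) {
      // Assumption: compatibility is modelled as strict equality.
      throw new Error("incompatible with current mode: " + this.mode);
    }
    const track = { source: this, mode: requestedMode };
    this.tracks.push(track);
    return track;
  }
}

const cam = new CameraSource(["16:9", "4:3"]);
cam.createTrack("16:9");
cam.createTrack("16:9");   // compatible, accepted
let rejected = false;
try {
  cam.createTrack("4:3");  // supported, but not alongside 16:9
} catch (e) {
  rejected = true;
}
```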
>>
>> The same stream/track can be added to multiple MediaStream instances.
>> The conclusion thus far is that a stream is implicitly cloned by doing
>> so. Because the stream has the same configuration
>> (constraints/settings), this is trivially possible. This allows streams
>> to be independently ended or configured (with constraints/settings).
>
> I guess it's an open question if you should be able to apply
> constraints to a track in a cloned MediaStream or if that should be
> exclusive to the "local" track you got from gUM().
>
>> The first problem is that identification of streams is troublesome. The
>> assumption thus far is that the cloned stream shares the same identity
>> as its prototype. This is because the identifier in question is an
>> identifier for the *source* and not the stream. We should fix that.
>>
>> This implies that MediaStreamTrack::id should actually be
>> MediaStreamTrack::sourceId, as it is currently used, though the
>> interaction with constraints is unclear.
>
> I've also thought about this in a similar way. It's really a source id.
>
>> A better solution would be to have MediaStreamTrack::source and
>> MediaStreamTrack::id, where MediaStreamTrack::source is clearly
>> identified and can be used to correlate different streams from the same
>> basic source. MediaStreamTrack::id allows two streams from the same
>> source to be distinguished when they have different constraints, which
>> might be useful for cases like simulcast.
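
A small sketch of the source/id split, assuming a hypothetical SourcedTrack class with a clone() that takes constraint overrides. Neither the class nor the override parameter comes from the proposals in this thread; they only illustrate two tracks that share a source but carry distinct ids.

```javascript
// Hypothetical model: `source` correlates tracks, `id` tells them apart.
let nextTrackId = 0;

class SourcedTrack {
  constructor(source, constraints = {}) {
    this.id = `track-${nextTrackId++}`;    // unique per track
    this.source = source;                  // shared by all tracks from one source
    this.constraints = { ...constraints }; // copied, so clones can diverge
  }

  clone(overrides = {}) {
    return new SourcedTrack(this.source, { ...this.constraints, ...overrides });
  }
}

// Simulcast-like case: one camera, two differently constrained tracks.
const cameraSource = { label: "camera" };
const high = new SourcedTrack(cameraSource, { height: 720 });
const low = high.clone({ height: 180 });
```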
>>
>>
>> Source -- (Mode) -- (Settings) ------------- (Sink Limits) -- Sink
>>
>> A stream is able to communicate information about its consumption by
>> sink(s) back to the originating source. (Real-time streams provide this
>> capability using RTCP; in-browser streams can use internal feedback
>> channels.) This allows sources to make choices about operating mode
>> that are optimized for actual use. If your 1080p camera is only being
>> displayed or transmitted at 480p, it might choose to switch to a more
>> power-efficient mode as long as this remains true.
>>
>> Information about how a stream is used can traverse the entire media
>> path. For instance, resizing a video sink down might propagate back so
>> that the source is only required to produce the lower resolution.
>> Re-constraining the stream might result in a change to the operating
>> mode of the source. Some sinks require unconstrained access to the
>> source: sampling or recording a stream would negate any optimizations
>> that might otherwise be possible.
>>
>>
>> Source (1) -- (0..*) MediaStreamTrack (..) -- (0..*) Sink
>>
>> This arrangement is less than optimal when it comes to attachment of a
>> single stream to multiple sinks. If the same stream can be attached to
>> multiple sinks, the implicit constraints applied by those sinks are not
>> made visible in quite the same way. Any limits applied by a sink must
>> first be merged with those from other sinks on the same stream. More
>> importantly, it means that sinks cannot end their attached stream
>> without also affecting other users of the same stream.
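
The merging of sink limits on a shared stream might look like the sketch below. The function name requiredHeight and the convention that null means "unconstrained sink" (sampling or recording) are assumptions made for illustration only.

```javascript
// Hypothetical helper: merge per-sink limits into what the source must produce.
// Each entry is the largest frame height a sink needs, or null when the sink
// needs unconstrained access (e.g. recording).
function requiredHeight(sinkLimits, nativeHeight) {
  if (sinkLimits.length === 0) {
    return 0; // no sinks attached: the stream is inactive
  }
  let needed = 0;
  for (const limit of sinkLimits) {
    if (limit === null) {
      return nativeHeight; // an unconstrained sink negates any optimization
    }
    needed = Math.max(needed, limit); // the most demanding sink wins
  }
  return Math.min(needed, nativeHeight); // cannot exceed what the source offers
}

const displayOnly = requiredHeight([480, 360], 1080);
const withRecorder = requiredHeight([480, null], 1080);
const noSinks = requiredHeight([], 1080);
```

With one sink per stream, the list has at most one entry and the merge step disappears, which is the simplification the 0..1 arrangement is after.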
>
> I don't think that sinks should be allowed to end the streams they're
> consuming. It should be up to the source and API.
>
>> Adam's proposal effectively creates this clone for RTCPeerConnection.
>
> It's not merely a clone. It's an object that describes the association
> between the stream and the PeerConnection it was added to. It uses the
> same media sources though so in that sense it's a clone. But anything
> you do on the "outbound" stream only affects what's sent on the single
> related PeerConnection instance.
>
>> The stream used by RTCPeerConnection is a clone of the stream that it
>> is given. This addresses the concern for output to RTCPeerConnection,
>> but it does not address other uses (<audio> and <video> particularly).
>
> A media element has its own track lists that describe the media it's
> playing. It offers some control where you, e.g., can set which tracks
> should be enabled.
>
> http://dev.w3.org/html5/spec/media-elements.html#audiotracklist-and-videotracklist-objects
>
>
>> URL.createObjectURL() seems like a candidate for this. I am coming to
>> the conclusion that createObjectURL() is no longer an entirely
>> appropriate style of API for this use case; direct assignment is better.
>>
>>
>> Source (1) -- (0..*) MediaStreamTrack (..) -- (0..1) Sink
>>
>> Now, after far too many words, on a largely tangential topic, back on
>> task...
>>
>> I believe that composition APIs for stats and DTMF are more likely to be
>> successful than inheritance APIs. As it stands, going to your
>> RTCPeerConnection instance to get stats is ugly, but it is superior to
>> what Adam proposes.
>
> I would say that the difference between (1) pc.sendDTMF(targetTrack,
> ...) and (2) outboundAudio.sendDTMF(...) is that the association
> between the track and the PeerConnection is kept internal in the
> PeerConnection in (1). The association is then looked up with the
> targetTrack argument when pc.sendDTMF() is called. In (2), the
> association is exposed in the form of the outbound track. Implementation-wise
> they wouldn't have to be that different.
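
To make the comparison concrete, here is a sketch where style (1) is a thin lookup on top of style (2). SketchPeerConnection, OutboundAudio, addTrack and the `sent` log are hypothetical stand-ins, not the RTCPeerConnection API; nothing is actually transmitted.

```javascript
// Hypothetical model of the two call styles; no real media or network involved.
class OutboundAudio {
  constructor(pc, track) {
    this.pc = pc;
    this.track = track;
    this.sent = []; // stand-in for tones handed to the transport
  }

  // Style (2): the track-connection association is itself the object.
  sendDTMF(tones) {
    this.sent.push(tones);
  }
}

class SketchPeerConnection {
  constructor() {
    this.associations = new Map(); // track -> OutboundAudio, kept internal
  }

  addTrack(track) {
    const outbound = new OutboundAudio(this, track);
    this.associations.set(track, outbound);
    return outbound;
  }

  // Style (1): the association is looked up from the targetTrack argument.
  sendDTMF(targetTrack, tones) {
    const outbound = this.associations.get(targetTrack);
    if (!outbound) {
      throw new Error("track was not added to this connection");
    }
    outbound.sendDTMF(tones);
  }
}

const pc = new SketchPeerConnection();
const micTrack = { id: "mic" };
const outboundAudio = pc.addTrack(micTrack);
pc.sendDTMF(micTrack, "123");  // style (1)
outboundAudio.sendDTMF("#");   // style (2)
```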
>
> The reason why I think it could be beneficial to expose the
> track-PeerConnection association is that we probably want to do
> more than DTMF and stats in the future (e.g., bandwidth and
> priority).
>
>> What this proposal has over existing APIs is a much-needed measure of
>> transparency. I think that we need to continue to explore options like
>> this. I find the accrual of methods on RTCPeerConnection to be
>> problematic, not just from an engineering perspective, but from a
>> usability perspective.
>>
>> For stats, a separate RTCStatisticsRecorder class would be much easier
>> to manage, even if it had to be created by RTCPeerConnection. That
>> would be consistent with the chosen direction on DTMF.
>
> As long as we're consistent I could pretty much live with any solution.
>
> /Adam
>
Received on Wednesday, 14 November 2012 14:32:03 UTC