- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Wed, 14 Nov 2012 15:31:28 +0100
- To: public-media-capture@w3.org
On 11/14/2012 01:01 PM, Adam Bergkvist wrote:
> Thanks for taking the time to think about this Martin.
>
> On 2012-11-13 23:33, Martin Thomson wrote:
>> I've been doing some thinking about this problem and I think that I
>> agree with Harald in many respects. The interaction between the
>> different instances becomes unclear. At least with a composition style
>> API, this would be clearer.
>
> I'm not married to the inheritance model at all. My intention was to
> align with the proposal for controlling local devices.
> OutboundVideoTrack exposes API surface related to sending video over
> the network similar to how VideoDeviceTrack (from the settings
> proposal) exposes API surface to control a local video device. I'm
> open to other approaches.
>
> (Thinking out loud here) I recall a previous version (v3) of the
> settings proposal [set-v3] that used an approach where a VideoDevice
> (which exposed the settings API surface) had a reference to the track
> it controlled instead of being a new derived track type. Aligning
> inbound/outbound streams to that would look something like:
>
> * : composition
> +- : inheritance

hm. I think I'm not parsing. I use the term "composition" in terms of
"bringing multiple dissimilar things together inside an object". Since
you're using * below between track and stream, it looks as if you're
also using it for containment - bringing multiple things of the same
type together inside an object.

>
> AbstractMediaStream
> |
> +- LocalMediaStream
> |   * AudioDevice
> |       * Track
> |   * VideoDevice
> |       * Track

This doesn't compute, since devices can be on multiple tracks.

> |
> ...
> |
> +- PeerConnectionMediaStream *new*
> |   * MediaStreamTransportList (audioTransports)
> |       * OutboundAudioTransport...
> |           * Track
> |   * MediaStreamTransportList (videoTransports)
> |       * OutboundVideoTransport...
> |           * Track

This doesn't compute, since transports are not hierarchical with
mediastreams.
Alternate, to take the other extreme:

MediaStream contains (composition)
  TrackContainer (containment) (one or two)
    MediaStreamTrack contains (composition)
      DeviceReference? (present when linked to a device source)
      OutgoingTransportReference? (present when linked to a PC's
        outgoing stream list)
      IncomingTransportReference? (present when linked to a PC's
        incoming stream list)

Inheritance hierarchy:

Device
  + AudioDevice
  + VideoDevice
  + PictureDevice

Perhaps we don't need so much inheritance....

> |
> ...
>
> Then we could have single track instances for each media source and
> how a track is consumed depends on the Device or Transport that holds
> it and imposes settings on it.
>
> pc.addStream(localStream);
> var outboundStream = pc.localStreams.getStreamById(localStream.id);
>
> localStream.audioDevice.track === outboundStream.audioTransports[0].track
>
> would be true.
>
> [set-v3]
> http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0143.html
>
>> I have an alternative solution to Harald's underlying problems. The
>> inheritance thing turned out to be superficial only. I have come to
>> believe that the source of confusion is the lack of a distinction
>> between the source of a stream and the stream itself.
>>
>> I think that a clear elucidation of the model could be helpful, to
>> start.
>>
>> --
>>
>> Cameras and microphones are instances of media sources.
>> RTCPeerConnection is a different type of media source.
>>
>> Streams (used here as a synonym for MediaStreamTrack) represent a
>> reduction of the current operating mode of the source. For example, a
>> camera might produce a 1080p capture natively that is down-sampled to
>> produce a 720p stream. Constraints select an operating mode, settings
>> filter the resulting output to match the requested form.
>>
>> Streams are inactive unless attached to a sink. Sinks include <video>
>> and <audio> tags; RTCPeerConnection; or recording and sampling.
>>
>> Sources can produce multiple streams simultaneously. Simultaneous
>> streams require compatible camera modes. A camera that is capable of
>> operating in 16:9 or 4:3 modes might be incapable of producing streams
>> in both those aspect ratios simultaneously.
>>
>> The first stream created for a given source sets the operating mode of
>> the source. Subsequent streams can only be added if the operating mode
>> is compatible with the current mode.
>>
>> The same stream/track can be added to multiple MediaStream instances.
>> The conclusion thus far is that a stream is implicitly cloned by doing
>> so. Because the stream has the same configuration
>> (constraints/settings), this is trivially possible. This allows streams
>> to be independently ended or configured (with constraints/settings).
>
> I guess it's an open question if you should be able to apply
> constraints to a track in a cloned MediaStream or if that should be
> exclusive to the "local" track you got from gUM().
>
>> The first problem is that identification of streams is troublesome. The
>> assumption thus far is that the cloned stream shares the same identity
>> as its prototype. This is because the identifier in question is an
>> identifier for the *source* and not the stream. We should fix that.
>>
>> This implies that MediaStreamTrack::id should actually be
>> MediaStreamTrack::sourceId, as it is currently used, though the
>> interaction with constraints is unclear.
>
> I've also thought about this in a similar way. It's really a source id.
>
>> A better solution would be to have MediaStreamTrack::source and
>> MediaStreamTrack::id. Where MediaStreamTrack::source is clearly
>> identified and can be used to correlate different streams from the same
>> basic source. MediaStreamTrack::id allows two streams from the same
>> source to be distinguished when they have different constraints, which
>> might be useful for cases like simulcast.
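
[Editorial sketch] The source/id split proposed above might look roughly like the following. This is only an illustration under stated assumptions: SketchSource, SketchTrack, sameSource and the constraint shape are hypothetical names invented here, not part of any Media Capture or WebRTC specification. The point is that a clone keeps the same source object (for correlation) while receiving its own id and its own constraints.

```typescript
// Hypothetical model of "MediaStreamTrack::source" plus "MediaStreamTrack::id".
// None of these names exist in a real browser API.
let nextTrackId = 0;

class SketchSource {
  readonly sourceId: string;
  constructor(sourceId: string) {
    this.sourceId = sourceId;
  }
}

class SketchTrack {
  readonly id: string;                           // identifies this track only
  readonly source: SketchSource;                 // shared by all tracks of one source
  readonly constraints: Record<string, number>;  // per-track configuration

  constructor(source: SketchSource, constraints: Record<string, number> = {}) {
    this.id = `track-${nextTrackId++}`;
    this.source = source;
    this.constraints = { ...constraints };
  }

  // A clone shares the source and copies the constraints, optionally
  // overriding some of them, but always gets a fresh id.
  clone(overrides: Record<string, number> = {}): SketchTrack {
    return new SketchTrack(this.source, { ...this.constraints, ...overrides });
  }
}

// Correlate two tracks by their originating source rather than by id.
function sameSource(a: SketchTrack, b: SketchTrack): boolean {
  return a.source === b.source;
}

const camera = new SketchSource("camera-0");
const full = new SketchTrack(camera, { height: 720 });
const thumb = full.clone({ height: 180 }); // e.g. a second simulcast layer
```

With this shape, sameSource(full, thumb) holds while full.id and thumb.id differ, which is the distinction the simulcast case needs.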
>>
>>
>> Source -- (Mode) -- (Settings) ------------- (Sink Limits) -- Sink
>>
>> A stream is able to communicate information about its consumption by
>> sink(s) back to the originating source. (Real-time streams provide this
>> capability using RTCP; in-browser streams can use internal feedback
>> channels.) This allows sources to make choices about operating mode
>> that is optimized for actual uses. If your 1080p camera is only being
>> displayed or transmitted at 480p, it might choose to switch to a more
>> power-efficient mode as long as this remains true.
>>
>> Information about how a stream is used can traverse the entire media
>> path. For instance, resizing a video sink down might propagate back so
>> that the source is only required to produce the lower resolution.
>> Re-constraining the stream might result in a change to the operating
>> mode of the source. Some sinks require unconstrained access to the
>> source: sampling or recording a stream would negate any optimizations
>> that might otherwise be possible.
>>
>>
>> Source (1) -- (0..*) MediaStreamTrack (..) -- (0..*) Sink
>>
>> This arrangement is less than optimal when it comes to attachment of a
>> single stream to multiple sinks. If the same stream can be attached to
>> multiple sinks, the implicit constraints applied by those sinks are not
>> made visible in quite the same way. Any limits applied by a sink must
>> first be merged with those from other sinks on the same stream. More
>> importantly, it means that sinks cannot end their attached stream
>> without also affecting other users of the same stream.
>
> I don't think that sinks should be allowed to end the streams they're
> consuming. It should be up to the source and API.
>
>> Adam's proposal effectively creates this clone for RTCPeerConnection.
>
> It's not merely a clone. It's an object that describes the association
> between the stream and the PeerConnection it was added to.
> It uses the same media sources though so in that sense it's a clone.
> But anything you do on the "outbound" stream only affects what's sent
> on the single related PeerConnection instance.
>
>> The stream used by RTCPeerConnection is a clone of the stream that it
>> is given. This addresses the concern for output to RTCPeerConnection,
>> but it does not address other uses (<audio> and <video> particularly).
>
> A media element has its own track lists that describe the media it's
> playing. It offers some control where you, e.g., can set which tracks
> should be enabled.
>
> http://dev.w3.org/html5/spec/media-elements.html#audiotracklist-and-videotracklist-objects
>
>
>> URL.createObjectURL() seems like a candidate for this. I am coming to
>> the conclusion that createObjectURL() is no longer an entirely
>> appropriate style of API for this use case; direct assignment is better.
>>
>>
>> Source (1) -- (0..*) MediaStreamTrack (..) -- (0..1) Sink
>>
>> [Inline images 2]
>>
>> Now, after far too many words, on a largely tangential topic, back on
>> task...
>>
>> I believe that composition APIs for stats and DTMF are more likely to be
>> successful than inheritance APIs. As it stands, going to your
>> RTCPeerConnection instance to get stats is ugly, but it is superior to
>> what Adam proposes.
>
> I would say that the difference between (1) pc.sendDTMF(targetTrack,
> ...) and (2) outboundAudio.sendDTMF(...) is that the association
> between the track and the PeerConnection is kept internal in the
> PeerConnection in (1). The association is then looked up with the
> targetTrack argument when pc.sendDTMF() is called. While in (2), the
> association is exposed in the form of the outbound track.
> Implementation-wise they wouldn't have to be that different.
>
> The reason why I think it could be beneficial to expose the
> track-PeerConnection association is that we probably want to do
> more than DTMF and Stats in the future (like e.g., bandwidth and
> priority).
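
[Editorial sketch] The two call styles contrasted above, (1) and (2), can be put side by side in a small model. Everything here is an assumption made for illustration: SketchAudioTrack, OutboundAudio, SketchPeerConnection and addTrack are hypothetical names and do not reflect the real RTCPeerConnection API. In style (1) the connection keeps the track association private and looks it up from the track argument; in style (2) the same association object is handed to the caller.

```typescript
// Hypothetical types only; not the actual WebRTC API.
class SketchAudioTrack {
  readonly id: string;
  constructor(id: string) {
    this.id = id;
  }
}

// The track-to-connection association. In style (2) callers hold this object.
class OutboundAudio {
  readonly track: SketchAudioTrack;
  readonly sentTones: string[] = [];
  constructor(track: SketchAudioTrack) {
    this.track = track;
  }
  sendDTMF(tones: string): void {
    this.sentTones.push(tones); // stand-in for actually sending the tones
  }
}

class SketchPeerConnection {
  // Association table, keyed by track id and kept internal to the connection.
  private readonly outbound = new Map<string, OutboundAudio>();

  addTrack(track: SketchAudioTrack): OutboundAudio {
    const association = new OutboundAudio(track);
    this.outbound.set(track.id, association);
    return association; // style (2): expose the association to the caller
  }

  // Style (1): look the association up from the target track argument.
  sendDTMF(target: SketchAudioTrack, tones: string): void {
    const association = this.outbound.get(target.id);
    if (!association) {
      throw new Error("track was not added to this connection");
    }
    association.sendDTMF(tones);
  }
}

const connection = new SketchPeerConnection();
const mic = new SketchAudioTrack("mic-0");
const outboundAudio = connection.addTrack(mic);

connection.sendDTMF(mic, "123"); // style (1)
outboundAudio.sendDTMF("#");     // style (2)
```

Both calls end up on the same association object, which matches the remark that the two styles need not differ much in implementation; the difference is only whether that object is visible to the application.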
>
>> What this proposal has over existing APIs is a much-needed measure of
>> transparency. I think that we need to continue to explore options like
>> this. I find the accrual of methods on RTCPeerConnection to be
>> problematic, not just from an engineering perspective, but from a
>> usability perspective.
>>
>> For stats, a separate RTCStatisticsRecorder class would be much easier
>> to manage, even if it had to be created by RTCPeerConnection. That
>> would be consistent with the chosen direction on DTMF.
>
> As long as we're consistent I could pretty much live with any solution.
>
> /Adam
>
Received on Wednesday, 14 November 2012 14:32:03 UTC