- From: Adam Bergkvist <adam.bergkvist@ericsson.com>
- Date: Wed, 14 Nov 2012 16:48:21 +0100
- To: Harald Alvestrand <harald@alvestrand.no>
- CC: public-media-capture@w3.org
On 2012-11-14 15:31, Harald Alvestrand wrote:
> On 11/14/2012 01:01 PM, Adam Bergkvist wrote:
>> Thanks for taking the time to think about this Martin.
>>
>> On 2012-11-13 23:33, Martin Thomson wrote:
>>> I've been doing some thinking about this problem and I think that I
>>> agree with Harald in many respects. The interaction between the
>>> different instances becomes unclear. At least with a composition style
>>> API, this would be clearer.
>>
>> I'm not married to the inheritance model at all. My intention was to
>> align with the proposal for controlling local devices.
>> OutboundVideoTrack exposes API surface related to sending video over
>> the network similar to how VideoDeviceTrack (from the settings
>> proposal) exposes API surface to control a local video device. I'm
>> open to other approaches.
>>
>> (Thinking out loud here) I recall a previous version (v3) of the
>> settings proposal [set-v3] that used an approach where a VideoDevice
>> (which exposed the settings API surface) had a reference to the track
>> it controlled instead of being a new derived track type. Aligning
>> inbound/outbound streams to that would look something like:
>>
>> * : composition
>> +- : inheritance
> hm. I think I'm not parsing.
> I use the term "composition" in terms of "bringing multiple dissimilar
> things together inside an object".
> Since you're using * below between track and stream, it looks as if
> you're also using it for containment - bringing multiple things of the
> same type together inside an object.

I'm using composition as a stronger "has a" relationship. For example,
(from below) a LocalMediaStream has an AudioDevice.

http://en.wikipedia.org/wiki/Class_diagram#Composition

>>
>> AbstractMediaStream
>> |
>> +- LocalMediaStream
>> |    * AudioDevice
>> |       * Track
>> |    * VideoDevice
>> |       * Track
> This doesn't compute, since devices can be on multiple tracks.

This comes unmodified from [set-v3].
I think the idea was to have a device tied to a track in the
LocalMediaStream. The track could then be used to create regular
MediaStreams and would then exist without the device (that accompanied
it in the LocalMediaStream).

>> |
>> ...
>> |
>> +- PeerConnectionMediaStream *new*
>> |    * MediaStreamTransportList (audioTransports)
>> |       * OutboundAudioTransport...
>> |          * Track
>> |    * MediaStreamTransportList (videoTransports)
>> |       * OutboundVideoTransport...
>> |          * Track
>
> This doesn't compute, since transports are not hierarchical with
> mediastreams.

That's not the intention either. The PeerConnectionMediaStream has a
list of audio and video transports. Each transport has a corresponding
track (similar to the track-device relationship above).

> Alternate, to take the other extreme:
>
> MediaStream contains (composition)
>    TrackContainer (containment) (one or two)
>       MediaStreamTrack contains (composition)
>          DeviceReference? (present when linked to a device source)
>          OutgoingTransportReference? (present when linked to a PC's
>             outgoing stream list)
>          IncomingTransportReference? (present when linked to a PC's
>             incoming stream list)
>

I interpret this as a regular MediaStream that's not created in
association with a PeerConnection. One issue is that it can only be
sent with one PeerConnection unless it has a list of
OutgoingTransportReferences. It also introduces the inbound/outbound
stuff to the local-only use case.

>
> Inheritance hierarchy:
>
> Device
>    + AudioDevice
>    + VideoDevice
>    + PictureDevice
>
> Perhaps we don't need so much inheritance....
>
>> |
>> ...
>>
>> Then we could have single track instances for each media source and
>> how a track is consumed depends on the Device or Transport that holds
>> it and imposes settings on it.
>>
>> pc.addStream(localStream);
>> var outboundStream = pc.localStreams.getStreamById(localStream.id);
>>
>> localStream.audioDevice.track === outboundStream.audioTransports[0].track
>>
>> would be true.
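To make the identity claim above concrete, here is a minimal sketch using plain objects. Nothing in it is a browser API: every class (Track, AudioDevice, OutboundAudioTransport, LocalMediaStream, PeerConnectionMediaStream, StreamList, FakePeerConnection) is a hypothetical stand-in for the shapes in the diagram, and only the last four statements mirror the snippet from the proposal.

```javascript
// Hypothetical stand-ins for the composition diagram; not real browser APIs.
class Track {
  constructor(id, kind) {
    this.id = id;
    this.kind = kind;
  }
}

// A device "has a" track (composition), instead of being a derived track type.
class AudioDevice {
  constructor(track) {
    this.track = track;
  }
}

// A transport holds the same track instance and would carry send-side settings.
class OutboundAudioTransport {
  constructor(track) {
    this.track = track;
  }
}

class LocalMediaStream {
  constructor(id, audioTrack) {
    this.id = id;
    this.audioDevice = new AudioDevice(audioTrack);
  }
}

// Created per PeerConnection; reuses the local stream's id and track instance.
class PeerConnectionMediaStream {
  constructor(localStream) {
    this.id = localStream.id;
    this.audioTransports = [
      new OutboundAudioTransport(localStream.audioDevice.track),
    ];
  }
}

class StreamList {
  constructor() {
    this.streams = [];
  }
  getStreamById(id) {
    return this.streams.find((s) => s.id === id) || null;
  }
}

class FakePeerConnection {
  constructor() {
    this.localStreams = new StreamList();
  }
  addStream(localStream) {
    this.localStreams.streams.push(new PeerConnectionMediaStream(localStream));
  }
}

const localStream = new LocalMediaStream("stream-1", new Track("mic-1", "audio"));
const pc = new FakePeerConnection();
pc.addStream(localStream);
const outboundStream = pc.localStreams.getStreamById(localStream.id);

// One track instance, two holders: the device and the transport.
const sameTrack =
  localStream.audioDevice.track === outboundStream.audioTransports[0].track;
console.log(sameTrack);
```

The point of the sketch is only that the outbound stream is a distinct object from the local stream while both reach the same track instance.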
>>
>> [set-v3]
>> http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0143.html
>>
>>
>>> I have an alternative solution to Harald's underlying problems. The
>>> inheritance thing turned out to be superficial only. I have come to
>>> believe that the source of confusion is the lack of a distinction
>>> between the source of a stream and the stream itself.
>>>
>>> I think that a clear elucidation of the model could be helpful, to
>>> start.
>>>
>>> --
>>>
>>> Cameras and microphones are instances of media sources.
>>> RTCPeerConnection is a different type of media source.
>>>
>>> Streams (used here as a synonym for MediaStreamTrack) represent a
>>> reduction of the current operating mode of the source. For example, a
>>> camera might produce a 1080p capture natively that is down-sampled to
>>> produce a 720p stream. Constraints select an operating mode, settings
>>> filter the resulting output to match the requested form.
>>>
>>> Streams are inactive unless attached to a sink. Sinks include <video>
>>> and <audio> tags; RTCPeerConnection; or recording and sampling.
>>>
>>> Sources can produce multiple streams simultaneously. Simultaneous
>>> streams require compatible camera modes. A camera that is capable of
>>> operating in 16:9 or 4:3 modes might be incapable of producing streams
>>> in both those aspect ratios simultaneously.
>>>
>>> The first stream created for a given source sets the operating mode of
>>> the source. Subsequent streams can only be added if the operating mode
>>> is compatible with the current mode.
>>>
>>> The same stream/track can be added to multiple MediaStream instances.
>>> The conclusion thus far is that a stream is implicitly cloned by doing
>>> so. Because the stream has the same configuration
>>> (constraints/settings), this is trivially possible. This allows streams
>>> to be independently ended or configured (with constraints/settings).
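The source/stream model described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed API: Source, Track and Stream are hypothetical classes, and "compatible operating mode" is reduced to "same aspect ratio" purely for the example.

```javascript
// Hypothetical model of the source/stream distinction; not a real API.
class Source {
  constructor(id) {
    this.id = id;
    this.mode = null; // operating mode, set by the first track created
    this.tracks = [];
  }

  createTrack(constraints) {
    if (this.mode === null) {
      // The first track selects the operating mode of the source.
      this.mode = { aspectRatio: constraints.aspectRatio };
    } else if (this.mode.aspectRatio !== constraints.aspectRatio) {
      // Assumption: compatibility means "same aspect ratio".
      throw new Error("incompatible with the current operating mode");
    }
    const track = new Track(this, constraints);
    this.tracks.push(track);
    return track;
  }
}

class Track {
  constructor(source, constraints) {
    this.source = source;
    this.constraints = Object.assign({}, constraints);
    this.ended = false;
  }

  // A clone has the same configuration, so it is always mode-compatible.
  clone() {
    return this.source.createTrack(this.constraints);
  }

  stop() {
    this.ended = true;
  }
}

class Stream {
  constructor() {
    this.tracks = [];
  }

  // Adding a track implicitly clones it, as described in the model.
  addTrack(track) {
    const copy = track.clone();
    this.tracks.push(copy);
    return copy;
  }
}

const camera = new Source("camera-1");
const hd = camera.createTrack({ aspectRatio: "16:9", height: 720 });

const stream = new Stream();
const copy = stream.addTrack(hd);
copy.stop(); // ends only the clone

let rejected = false;
try {
  camera.createTrack({ aspectRatio: "4:3", height: 480 });
} catch (e) {
  rejected = true; // the camera is already operating in 16:9
}

console.log(copy.source === hd.source, hd.ended, rejected);
```

Because each addTrack() call produces a clone, ending or re-constraining one copy leaves the others alone, which is the independence the model is after.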
>>
>> I guess it's an open question if you should be able to apply
>> constraints to a track in a cloned MediaStream or if that should be
>> exclusive to the "local" track you got from gUM().
>>
>>> The first problem is that identification of streams is troublesome. The
>>> assumption thus far is that the cloned stream shares the same identity
>>> as its prototype. This is because the identifier in question is an
>>> identifier for the *source* and not the stream. We should fix that.
>>>
>>> This implies that MediaStreamTrack::id should actually be
>>> MediaStreamTrack::sourceId, as it is currently used, though the
>>> interaction with constraints is unclear.
>>
>> I've also thought about this in a similar way. It's really a source id.
>>
>>> A better solution would be to have MediaStreamTrack::source and
>>> MediaStreamTrack::id, where MediaStreamTrack::source is clearly
>>> identified and can be used to correlate different streams from the same
>>> basic source. MediaStreamTrack::id allows two streams from the same
>>> source to be distinguished when they have different constraints, which
>>> might be useful for cases like simulcast.
>>>
>>>
>>> Source -- (Mode) -- (Settings) ------------- (Sink Limits) -- Sink
>>>
>>> A stream is able to communicate information about its consumption by
>>> sink(s) back to the originating source. (Real-time streams provide this
>>> capability using RTCP; in-browser streams can use internal feedback
>>> channels.) This allows sources to make choices about operating mode
>>> that are optimized for actual uses. If your 1080p camera is only being
>>> displayed or transmitted at 480p, it might choose to switch to a more
>>> power-efficient mode as long as this remains true.
>>>
>>> Information about how a stream is used can traverse the entire media
>>> path. For instance, resizing a video sink down might propagate back so
>>> that the source is only required to produce the lower resolution.
>>> Re-constraining the stream might result in a change to the operating
>>> mode of the source. Some sinks require unconstrained access to the
>>> source: sampling or recording a stream would negate any optimizations
>>> that might otherwise be possible.
>>>
>>>
>>> Source (1) -- (0..*) MediaStreamTrack (..) -- (0..*) Sink
>>>
>>> This arrangement is less than optimal when it comes to attachment of a
>>> single stream to multiple sinks. If the same stream can be attached to
>>> multiple sinks, the implicit constraints applied by those sinks are not
>>> made visible in quite the same way. Any limits applied by a sink must
>>> first be merged with those from other sinks on the same stream. More
>>> importantly, it means that sinks cannot end their attached stream
>>> without also affecting other users of the same stream.
>>
>> I don't think that sinks should be allowed to end the streams they're
>> consuming. It should be up to the source and API.
>>
>>> Adam's proposal effectively creates this clone for RTCPeerConnection.
>>
>> It's not merely a clone. It's an object that describes the association
>> between the stream and the PeerConnection it was added to. It uses the
>> same media sources though so in that sense it's a clone. But anything
>> you do on the "outbound" stream only affects what's sent on the single
>> related PeerConnection instance.
>>
>>> The stream used by RTCPeerConnection is a clone of the stream that it
>>> is given. This addresses the concern for output to RTCPeerConnection,
>>> but it does not address other uses (<audio> and <video> particularly).
>>
>> A media element has its own track lists that describe the media it's
>> playing. It offers some control where you, e.g., can set which tracks
>> should be enabled.
>>
>> http://dev.w3.org/html5/spec/media-elements.html#audiotracklist-and-videotracklist-objects
>>
>>
>>> URL.createObjectURL() seems like a candidate for this.
>>> I am coming to the conclusion that createObjectURL() is no longer an
>>> entirely appropriate style of API for this use case; direct assignment
>>> is better.
>>>
>>>
>>> Source (1) -- (0..*) MediaStreamTrack (..) -- (0..1) Sink
>>>
>>> Now, after far too many words, on a largely tangential topic, back on
>>> task...
>>>
>>> I believe that composition APIs for stats and DTMF are more likely to be
>>> successful than inheritance APIs. As it stands, going to your
>>> RTCPeerConnection instance to get stats is ugly, but it is superior to
>>> what Adam proposes.
>>
>> I would say that the difference between (1) pc.sendDTMF(targetTrack,
>> ...) and (2) outboundAudio.sendDTMF(...) is that the association
>> between the track and the PeerConnection is kept internal in the
>> PeerConnection in (1). The association is then looked up with the
>> targetTrack argument when pc.sendDTMF() is called. While in (2), the
>> association is exposed in the form of the outbound track.
>> Implementation-wise they wouldn't have to be that different.
>>
>> The reason why I think it could be beneficial to expose the
>> track-PeerConnection association is that we probably want to do
>> more than DTMF and Stats in the future (e.g., bandwidth and
>> priority).
>>
>>> What this proposal has over existing APIs is a much-needed measure of
>>> transparency. I think that we need to continue to explore options like
>>> this. I find the accrual of methods on RTCPeerConnection to be
>>> problematic, not just from an engineering perspective, but from a
>>> usability perspective.
>>>
>>> For stats, a separate RTCStatisticsRecorder class would be much easier
>>> to manage, even if it had to be created by RTCPeerConnection. That
>>> would be consistent with the chosen direction on DTMF.
>>
>> As long as we're consistent I could pretty much live with any solution.
>>
>> /Adam
>>
>
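As a rough illustration of the two DTMF call shapes compared above, the sketch below keeps one track-to-connection association object and reaches it two ways. FakePeerConnection, OutboundAudio, addTrack() and the `sent` log are hypothetical names invented for this example; only the two call forms, pc.sendDTMF(targetTrack, ...) and outboundAudio.sendDTMF(...), come from the discussion.

```javascript
// Hypothetical sketch: one association object, two ways to reach it.
class OutboundAudio {
  constructor(track) {
    this.track = track;
    this.sent = []; // illustration only: records the tones "sent"
  }

  // Shape (2): the association is exposed and carries the method.
  sendDTMF(tones) {
    this.sent.push(tones);
  }
}

class FakePeerConnection {
  constructor() {
    // Internal track -> association lookup used by shape (1).
    this.associations = new Map();
  }

  addTrack(track) {
    const outbound = new OutboundAudio(track);
    this.associations.set(track, outbound);
    return outbound;
  }

  // Shape (1): the association stays internal and is looked up by track.
  sendDTMF(targetTrack, tones) {
    const outbound = this.associations.get(targetTrack);
    if (!outbound) {
      throw new Error("track is not being sent on this connection");
    }
    outbound.sendDTMF(tones);
  }
}

const track = { id: "mic-1", kind: "audio" };
const dtmfPc = new FakePeerConnection();
const outboundAudio = dtmfPc.addTrack(track);

dtmfPc.sendDTMF(track, "123"); // shape (1)
outboundAudio.sendDTMF("#"); // shape (2)

console.log(outboundAudio.sent.join(","));
```

Both calls end up in the same place, which matches the remark that the two approaches need not differ much in implementation; what differs is whether the association is something the application can hold on to and extend later.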
Received on Wednesday, 14 November 2012 15:48:47 UTC