Re: MediaStreams and Media elements from Rob Manson on 2013-05-30 (public-media-capture@w3.org from May 2013)

From: Rob Manson <roBman@mob-labs.com>
Date: Thu, 30 May 2013 15:24:24 +1000
To: public-media-capture@w3.org
Message-ID: <51A6E288.4090007@mob-labs.com>
Hi Johannes,

Well, that's exactly what was being thrashed around in the thread I
linked to:
http://lists.w3.org/Archives/Public/public-media-capture/2013Feb/0099.html

But it didn't seem like that was really resolved.

Also, by "the ImageCapture API" do you mean work under the title:
- HTML Media Capture
- MediaStream Recording
- Media Capture and Streams
?

To be honest I'm finding it a little difficult to balance the threads 
across all the closely related activities 8)

From my naive perspective it just seems closely related to gUM() (i.e.
Media Capture and Streams) because that's what I'd use to initiate the
process. However, this type of stream could also easily come in via a
peerconnection now that we have WebRTC too.  Happy to be pointed in the
right direction if this is not right though.

NOTE: To provide context, this is based on the use-case needs of the
Augmented Web Community Group.  I'd be happy to write up more info if
it's needed, but I don't want to flood this thread unless asked to.

roBman

On 30/05/13 14:58, Johannes Odland wrote:
> Isn't this the exact use case for the ImageCapture API?
>
> Johannes Odland
>
> On 30 May 2013 at 06:52, Rob Manson <roBman@mob-labs.com> wrote:
>
>> Since this is being discussed I wondered if there had been any more
>> discussion on this frame capture thread.
>>
>> http://lists.w3.org/Archives/Public/public-media-capture/2013Feb/0099.html
>>
>> NOTE: I've searched all the archives but that seems to be the last point
>> in the discussion (please feel free to point me at a link if I'm wrong).
>>
>> I know it's not "exactly" related, but it would be very nice to be able
>> to directly assign a stream src to a typed array and just get a
>> notification every time it is updated.
>>
>> And it would be awesome if this worked for both video and audio streams
>> in the same way.
>>
>> Just to recap - the current pipeline for video frame processing is:
>>
>> step 1.
>> getUserMedia(options, success, error)
>>
>> step 2.
>> function success(stream) {
>>   videoElement.src = window.URL.createObjectURL(stream)
>>   //setupSomeEventModelOfYourChoiceToCallProcess()
>>     //a. videoElement.onloadedmetadata
>>     //b. requestAnimationFrame()
>>     //c. setTimeout()
>> }
>>
>> step 3.
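>> function process() {
>>   // drawImage and getImageData both take x (left) before y (top)
>>   canvasContext.drawImage(videoElement, left, top, width, height)
>>   // getImageData returns an ImageData; its .data is the typed array
>>   typedArray = canvasContext.getImageData(left, top, width, height).data
>>   //finallyDoStuffWithTheImageDataTypedArray
>> }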
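
As an illustration of what the last comment in step 3 might stand for, here
is a minimal sketch that computes the mean brightness of one frame.  It
assumes the typed array is the RGBA Uint8ClampedArray found in
ImageData.data.  The name averageLuma and the choice of Rec. 601 weights are
illustrative assumptions, not something taken from this thread.

```javascript
// Hypothetical helper (not from the thread): mean brightness of an RGBA frame.
// `pixels` is assumed to be laid out like ImageData.data: 4 bytes per pixel.
function averageLuma(pixels) {
  const pixelCount = pixels.length / 4;
  if (pixelCount === 0) return 0;
  let sum = 0;
  for (let i = 0; i < pixels.length; i += 4) {
    // Rec. 601 luma weights; the alpha byte (pixels[i + 3]) is ignored.
    sum += 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
  }
  return sum / pixelCount;
}
```

Inside process() this could be called as averageLuma(typedArray) once the
typed array has been read back from the canvas.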
>>
>>
>> Steps 1 and 2 are not so bad because they're only used once to set up the
>> pipeline.
>>
>> But step 3 is called 10s or even 100s of times per second...and this
>> seems to be copying data a lot more than it needs to be...at least:
>> - into the videoElement
>> - then into the canvas context
>> - then into a typed array
>> (probably more under the hood depending upon the implementation)
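
To put a rough number on the copying listed above, here is a
back-of-the-envelope sketch.  The frame size, frame rate and the assumption
that every copy is 4 bytes per pixel (RGBA) are illustrative guesses, not
measurements; a real implementation may hold video frames in another format
and may copy more or fewer times.

```javascript
// Rough estimate of the bytes moved per second by the copies in step 3.
// All figures passed in below are illustrative assumptions, not measurements.
function copyBytesPerSecond(width, height, framesPerSecond, copiesPerFrame) {
  const bytesPerPixel = 4; // RGBA, one byte per channel
  return width * height * bytesPerPixel * framesPerSecond * copiesPerFrame;
}

// 640x480 at 30 frames per second, with the 3 copies listed above
const estimate = copyBytesPerSecond(640, 480, 30, 3);
console.log(estimate); // prints 110592000, i.e. roughly 110 MB every second
```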
>>
>> Ideally we could just pour a stream directly into a typed array and have
>> a way to only process it when it is actually updated (if we choose).
>>
>> Even better would be something like if we were just handed a new typed
>> array for each frame as they were created so we could minimise the
>> copying of the data and do things like pass it to a worker as a
>> transferable object using postMessage so we can do image processing off
>> the main thread.  This would reduce data copying even further while
>> allowing us to keep the main browser UI more responsive.
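
The hand-off to a worker described above could be sketched roughly as
follows.  This only shows the transferable-object part and assumes a frame
is already available as an RGBA Uint8ClampedArray whose view covers its
whole buffer (as with ImageData.data).  The names handOffFrame and
viewFrame are hypothetical, and `post` stands in for
worker.postMessage.bind(worker).

```javascript
// Hypothetical helper: pass one frame to a worker without copying the pixels.
// `post` stands in for worker.postMessage.bind(worker).
function handOffFrame(pixels, width, height, post) {
  // For ImageData.data the view covers the whole underlying buffer.
  const buffer = pixels.buffer;
  // Naming the buffer in the transfer list moves ownership to the receiver,
  // so the sender must not touch `pixels` after this call.
  post({ width: width, height: height, buffer: buffer }, [buffer]);
}

// Receiving side, e.g. called from the worker's onmessage handler
// with event.data as the argument.
function viewFrame(message) {
  // Wrap the transferred buffer in a typed array view; no copy is made.
  return new Uint8ClampedArray(message.buffer);
}
```

In a page this might be called as handOffFrame(imageData.data, width,
height, worker.postMessage.bind(worker)), with the worker calling
viewFrame(event.data) before doing its image processing.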
>>
>> Thoughts?
>>
>> roBman
>>
>>
>> On 29/05/13 05:35, Jim Barnett wrote:
>>> It’s time to revise section 8 of the spec, which deals with how to pass
>>> a MediaStream to an HTML5 <audio> or <video> element (see
>>> http://dev.w3.org/2011/webrtc/editor/getusermedia.html#mediastreams-as-media-elements
>>> ).  One question is how to deal with the readyState and networkState
>>> attributes of the media element.  The HTML5 spec has a media element
>>> load algorithm which first resolves the URI of the src, and then
>>> attempts to fetch the source.   The current gUM spec says that when the
>>> algorithm reaches the fetch phase, if the resource is a MediaStream, the
>>> algorithm should terminate and set readyState to HAVE_ENOUGH_DATA.  I
>>> think that this is correct in the case of a MediaStream that is
>>> streaming data, but:
>>>
>>> 1. The spec should also say that networkState gets set to NETWORK_IDLE.
>>>
>>> 2. Does it matter if the Tracks in the MediaStream are muted or
>>> disabled?  My guess is that it doesn’t – the output will just be silence
>>> or black frames, but we should clarify this.  (By the way, the current
>>> spec says that the output of a muted Track is silence or black frames,
>>> but doesn’t say what the output is for a disabled Track.  Shouldn’t it
>>> be the same?)
>>>
>>> 3. What happens if the MediaStream that is fetched has ended = true?
>>>   Should we silently continue to use the dead stream and let HTML5
>>> figure out what to do, or should we raise an error?  In the latter case,
>>> the HTML5 spec defines a MediaError code, MEDIA_ERR_ABORTED, which we might
>>> be able to use.  It is defined as “The fetching process for the media
>>> resource
>>> <http://www.w3.org/TR/2012/CR-html5-20121217/embedded-content-0.html#media-resource>
>>> was aborted by the user agent at the user's request.”  Isn’t that sort
>>> of what happens when a local MediaStream is ended?
>>>
>>> 4. Do we want to say anything about remote MediaStreams?  In the case of
>>> a local MediaStream, NETWORK_IDLE makes sense for the networkState,
>>> because there is no network traffic.  But for a remote stream the
>>> NETWORK_LOADING state might be relevant.  On the other hand, the Media
>>> Capture spec seems implicitly to deal with local streams (created by
>>> gUM).  If we want to explicitly allow remote streams, we have to explain
>>> how they are created, etc.  I suppose we could say that streams can be
>>> remote, but the method of creating such a stream is outside the scope of
>>> this spec.  But then we’d at least have to say how the UA determines if
>>> a MediaStream is local or remote.
>>>
>>> 5. What do we say if a MediaStream with no Tracks is passed to a media
>>> element (i.e., in the fetch phase of the algorithm)?  Do we treat this
>>> as if the media element had fetched unplayable data?  There is a
>>> MEDIA_ERR_SRC_NOT_SUPPORTED that we could use in this case.  Or is
>>> it another MEDIA_ERR_ABORTED?  The fetch algorithm checks for the
>>> presence of audio and video tracks at a certain point, and any Tracks
>>> added after that won’t be detected (until load() is called again).
>>>
>>> I also have questions about how direct assignment should work, but I
>>> will send them in a separate email.
>>>
>>> -Jim
>>
>
>
Received on Thursday, 30 May 2013 05:24:54 UTC