- From: Rich Tibbett <richt@opera.com>
- Date: Thu, 23 Aug 2012 17:45:43 +0200
- To: Jim Barnett <Jim.Barnett@genesyslab.com>
- CC: Harald Alvestrand <harald@alvestrand.no>, public-media-capture@w3.org
Jim Barnett wrote:
> Rich,
> One use case for real-time access to media data is speech recognition.
> We would like to be able to use media obtained through getUserMedia to
> talk to an ASR system. It would be nice if we could just set up a
> PeerConnection to the ASR system, but ASR engines don't handle UDP very
> well (they can handle delays, but not lost packets.) So either we need
> to be able to set up a PeerConnection using TCP, or we need to give the
> app access to the audio in real time (and let it set up the TCP to the
> ASR engine.)

How is this not possible with the following existing pipeline:

  MediaStream -> HTMLAudioElement -> Web Audio API [1] -> WebSockets -> ASR Service?

By going through the Web Audio API [1] via an <audio> element to obtain
ongoing AudioBuffer data from a MediaStream object, and then sending that
data on to a third-party ASR engine via e.g. a WebSocket connection, you
could achieve the same thing (a rough sketch of this pipeline appears at
the end of this message). There may be other existing pipelines that
could be used here too.

- Rich

[1] https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#AudioBuffer-section

> - Jim
>
> -----Original Message-----
> From: Rich Tibbett [mailto:richt@opera.com]
> Sent: Thursday, August 23, 2012 10:56 AM
> To: Harald Alvestrand
> Cc: public-media-capture@w3.org
> Subject: Re: Describing recording by means of the Media Source interface
>
> Harald Alvestrand wrote:
>> I'm scanning the Media Source interface, and seeing how it describes
>> data formats for the buffers it uses.
>>
>> It seems to me logical to describe the recording interface in such a
>> way that:
>>
>> If there exists a video stream v, a media source msrc and a media
>> stream ms, and (conceptually) msProducesData(buffer) is called every
>> time data is available at the recording interface, then the following
>> code:
>>
>> // Setup
>> v.src = window.URL.createObjectURL(msrc);
>> buffer = msrc.addSourceBuffer(mimetype);
>> // So far unknown setup for the recorder interface
>>
>> // playback
>> msProducesData(data) {
>>   buffer.append(data);
>> }
>>
>> should produce the same display (possibly somewhat delayed due to
>> buffering) as:
>>
>> v.src = window.URL.createObjectURL(ms);
>>
>> The media source definition is available here:
>>
>> http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html
>>
>> It seems to me that if we can make sure this actually works, we'll
>> have achieved a little consistency across the media handling platform.
>
> I've been trying to figure out exactly the purpose of having access to
> _real-time_ buffer data via any type of MediaStream Recording API. It is
> fairly clear that byte-level access to recorded data could be solved
> with existing interfaces, albeit not in real time while the media is
> being recorded to a file, but only once a file has been recorded in its
> entirety and returned to the web app.
>
> If we could simply start recording of a MediaStream with e.g. .start(),
> then stop it at some arbitrary point, thereby returning a File object
> [1], then we could pass that object through the existing FileReader API
> [2] to chunk it and apply anything we wish at the byte level after the
> recording has been completed.
>
> MediaStream content that is currently being recorded via a MediaRecorder
> API could be simultaneously displayed to the user in the browser via a
> <video> or <audio> tag, so ongoing playback of a stream being recorded
> seems to be a problem that is already solved in most respects.
>
> If we could simply return a File object once recording has been stopped,
> then we'd have saved an exceptional amount of complexity in MediaStream
> recording (by not having to implement media buffers for ongoing
> recording data, and by not having to rely on the draft Stream API
> proposal, which offers a lot of the functionality already available in
> FileReader - albeit in real time).
>
> We wouldn't lose any of the ability to subsequently apply modifications
> at the byte level (via FileReader) - just that we wouldn't have
> real-time access to ongoing media recording data.
>
> I could live with this - unless there are some compelling use cases for
> reading ongoing MediaStream data in real time, as opposed to simply
> being able to read that data once it has already been collected into a
> recording in its entirety.
>
> Any use cases brought forward here requiring real-time access to ongoing
> recorded byte-data would be welcome. Otherwise, I'm in favor of greatly
> reducing the complexity involved with recording a MediaStream.
>
> [1] http://www.w3.org/TR/FileAPI/#dfn-file
>
> [2] http://www.w3.org/TR/FileAPI/#dfn-filereader
>
> --
> Rich Tibbett (richt)
> CORE Platform Architect - Opera Software ASA
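For concreteness, here's a rough, untested sketch of the pipeline
suggested above. The endpoint URL, the mono/raw-float framing and the
buffer size are placeholder assumptions - a real ASR service would
dictate its own protocol - and current builds still prefix several of
these interfaces (webkitGetUserMedia, webkitAudioContext,
createJavaScriptNode instead of createScriptProcessor):

  // MediaStream -> <audio> -> Web Audio API -> WebSocket -> ASR service
  navigator.getUserMedia({ audio: true }, function (stream) {
    var audioEl = new Audio();
    audioEl.src = window.URL.createObjectURL(stream);
    audioEl.play();

    var ctx = new AudioContext();
    var source = ctx.createMediaElementSource(audioEl);

    // 4096-sample buffers, mono in and out
    var processor = ctx.createScriptProcessor(4096, 1, 1);

    var ws = new WebSocket('wss://asr.example.com/stream'); // placeholder URL
    ws.binaryType = 'arraybuffer';

    processor.onaudioprocess = function (e) {
      if (ws.readyState !== WebSocket.OPEN) return;
      // Copy the samples out - the input buffer is reused between callbacks
      var samples = new Float32Array(e.inputBuffer.getChannelData(0));
      ws.send(samples.buffer); // raw 32-bit float PCM, over TCP
    };

    source.connect(processor);
    processor.connect(ctx.destination); // keeps the processing graph pulling
  }, function (err) {
    console.error('getUserMedia failed:', err);
  });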
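And the record-then-read flow described in my earlier message quoted
above could look roughly like the following. The recorder shape here
(the constructor and the stop() callback yielding a File) is purely
hypothetical - no recorder interface has been settled - and only
File.slice [1] and FileReader [2] are existing interfaces:

  var recorder = new MediaRecorder(stream); // hypothetical recorder shape
  recorder.start();

  // ... some time later ...
  recorder.stop(function (file) { // hypothetical: yields a File [1]
    var CHUNK = 64 * 1024; // arbitrary chunk size
    var offset = 0;

    (function readNext() {
      if (offset >= file.size) return;
      var reader = new FileReader(); // [2]
      reader.onload = function () {
        var bytes = new Uint8Array(reader.result);
        // ... byte-level processing of this chunk goes here ...
        offset += CHUNK;
        readNext();
      };
      reader.readAsArrayBuffer(file.slice(offset, offset + CHUNK));
    })();
  });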
Received on Thursday, 23 August 2012 15:46:19 UTC