[whatwg] File API Streaming Blobs

Sorry about the very slow response; I've been on leave, and am now
catching up on my email.

On Wed, Jun 22, 2011 at 11:54 AM, Arun Ranganathan <arun at mozilla.com> wrote:
> Greetings Adam,
>
>> Ian, I wish I knew that earlier when I originally posted the idea;
>> there was lots of discussion and good ideas, but then it suddenly
>> dropped off the face of the earth. Essentially I am forwarding this
>> suggestion to public-webapps at w3.org on the basis that apparently most
>> discussion of File API specs happens there, and would like to know how
>> to move forward with this suggestion.
>>
>> The original suggestion and following comments are on the whatwg list
>> archive, starting with
>>
>> <http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029973.html>
>>
>> Summing up, the problem with the current implementation of Blobs is
>> that once a URI has been generated for them, by design changes are no
>> longer reflected in the object URL. In a streaming scenario, this is
>> not what is needed; rather, a long-lived Blob that can be appended to
>> is needed and 'streamed' to other parts of the browser, e.g. the
>> <video> or <audio> element.
>> The original use case was: make an application which will download
>> media files from a server and cache them locally, as well as playing
>> them without making the user wait for the entire file to be
>> downloaded, converted to a blob, then saved and played. However, such
>> an API covers many other use cases, such as on-the-fly, on-device
>> decryption of streamed media content (i.e. live streams without end,
>> or large static files that would be wasteful to download completely
>> when only the first couple of seconds need to be buffered and
>> decrypted before playback can begin).
>>
>> Some suggestions were to modify or create a new type of Blob, the
>> StreamingBlob which can be changed without its object url changing and
>> appended to as new data is downloaded or decoded, and using a similar
>> process to how large files may start to be decoded/played by a browser
>> before they are fully downloaded. Others suggested using a
>> pull API on the Blob so browsers can request new data
>> asynchronously, such as in
>>
>> <http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029998.html>
>>
>> One problem a browser may face, however, is what to do with URLs
>> that are opened twice, and whether the object URL should start from
>> the beginning (which would be needed for decoding encrypted, on-demand
>> audio) or start from the end (similar to `tail`, for live streaming
>> events that need decryption, etc.).
>>
>> Thanks,
>> P.S. Sorry if I've not done this the right way by forwarding like
>> this, I'm not usually active on mailing lists.
>>
>>
>
> I actually think moving to a streaming mode for file reads in general is
> desirable, but I'm not entirely sure extending Blobs is the way to go for
> *that* use case, which honestly is the main use case I'm interested in. We
> may improve upon ideas after this API goes to Last Call for streaming file
> reads; hopefully we'll do a better job than other non-JavaScript APIs out
> there :) [1]. Blob objects as they are currently specified live "in memory"
> and represent "in memory" File objects as well. A change to the underlying
> file isn't captured in the Blob snapshot; moreover, if the file moves or is
> no longer present at time of read, an error event is fired while processing
> a read operation. The object URL may be dereferenced, but will result in a
> 404.
>
> The Streaming API explored by WHATWG uses the Object URL scheme for
> videoconferencing use cases [2], and so the scheme itself is suitable for
> "resources" that are more dynamic than memory-resident Blob objects.
> Segment-plays/segment dereferencing in general can be handled through media
> fragments; the scheme can naturally be accompanied by fragment identifiers.
>
> I agree that it may be desirable to extend Blobs to do a few other things in
> general, maybe independent of better file reads. You've Cc'd the right
> listserv :) I'd be interested in what Eric has to say, since BlobBuilder
> evolves under his watch.

Having reviewed the threads, I'm not absolutely sure that we want to
add this stuff to Blob.  It seems like streaming is quite a bit
different than a lot of the problems people want to solve with Blobs,
and we may end up with a bit of a mess if we mash them together.
BlobBuilder does seem a decent match as a StreamBuilder, though.
Since Blobs are specifically non-mutable, it sounds like what you're
looking for is more like createObjectURL(blobBuilder) than
createObjectURL(blobBuilder.getBlob()).
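
To make that distinction concrete, here is a minimal sketch in plain
JavaScript. Every name in it (SnapshotBlob, SketchBuilder,
registerSketchURL, sizeBehind) is hypothetical and stands in for the real
Blob, BlobBuilder and createObjectURL; it only models the idea that a URL
minted from a snapshot never sees later appends, while a URL minted from
the builder itself would.

```javascript
// Hypothetical stand-ins only; none of these are real browser APIs.

// Models an immutable Blob: the bytes are copied at construction time.
class SnapshotBlob {
  constructor(chunks) {
    // Copy every byte so later appends to the builder cannot change this object.
    this.bytes = Uint8Array.from(chunks.flatMap((c) => Array.from(c)));
  }
  get size() {
    return this.bytes.length;
  }
}

// Models a BlobBuilder-like object that keeps growing.
class SketchBuilder {
  constructor() {
    this.chunks = [];
  }
  append(bytes) {
    this.chunks.push(Uint8Array.from(bytes));
  }
  getBlob() {
    return new SnapshotBlob(this.chunks);
  }
  get size() {
    return this.chunks.reduce((total, c) => total + c.length, 0);
  }
}

// Models an object-URL registry: an opaque key mapped to the registered object.
const registry = new Map();
let nextId = 0;
function registerSketchURL(target) {
  const url = `sketch:${nextId++}`;
  registry.set(url, target);
  return url;
}
function sizeBehind(url) {
  return registry.get(url).size;
}

const builder = new SketchBuilder();
builder.append([1, 2, 3]);
const snapshotURL = registerSketchURL(builder.getBlob()); // frozen at 3 bytes
const liveURL = registerSketchURL(builder);               // follows the builder
builder.append([4, 5]);
```

After the second append, sizeBehind(snapshotURL) still reports the three
bytes captured by getBlob(), while sizeBehind(liveURL) reports all five.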

From the threads and from my head, here are some questions:

1) Would reading from a stream always start at the beginning, or would
it start at the "current" point [e.g. in a live video stream]?
2) Would this have to support infinite streams?
3) Would we be expected to keep around data from the very beginning of
a stream, even if e.g. it's a live broadcast and you're now watching
hour 7?  If not, who controls the buffer size and what's the API for
data lifetime?
4) Should there be a way to get a stream from XHR directly?  [It's
probably not sufficient for all uses or necessary for any.]
5) Do we want a pull API such as Glenn suggests at
[http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029998.html],
instead of using BlobBuilder at all?
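
As a way of pinning down questions 1, 3 and 5, here is a rough sketch of
one possible answer, not a proposal. SketchStream, openReader and pull are
made-up names, and the retention limit is counted in chunks purely for
simplicity: the caller picks the limit, each reader chooses whether to
start at the oldest retained data or at the live edge, and readers pull
data rather than having it pushed at them.

```javascript
// Hypothetical sketch; none of these names come from any spec.
class SketchStream {
  constructor(maxChunks) {
    this.maxChunks = maxChunks; // retention limit chosen by the caller (question 3)
    this.chunks = [];           // retained chunks, oldest first
    this.firstIndex = 0;        // absolute index of chunks[0]
  }

  // Absolute index one past the newest retained chunk.
  get endIndex() {
    return this.firstIndex + this.chunks.length;
  }

  append(chunk) {
    this.chunks.push(chunk);
    if (this.chunks.length > this.maxChunks) {
      this.chunks.shift(); // discard the oldest chunk
      this.firstIndex += 1;
    }
  }

  // from === "start": begin at the oldest retained chunk.
  // from === "live": only see chunks appended after this call (question 1).
  openReader(from) {
    let next = from === "live" ? this.endIndex : this.firstIndex;
    return {
      // Pull-style read (question 5): the next chunk, or null if none is ready.
      pull: () => {
        if (next < this.firstIndex) {
          next = this.firstIndex; // skip over chunks that were discarded
        }
        if (next >= this.endIndex) {
          return null;
        }
        const chunk = this.chunks[next - this.firstIndex];
        next += 1;
        return chunk;
      },
    };
  }
}

const stream = new SketchStream(2);
stream.append("a");
stream.append("b");
const fromStart = stream.openReader("start");
const live = stream.openReader("live");
stream.append("c"); // exceeds the limit of 2, so "a" is discarded
```

In this sketch a reader that falls behind silently skips discarded data;
whether that should instead be an error is exactly the kind of thing
question 3 leaves open.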

I haven't fully absorbed the MediaStream API, but perhaps it would be
more natural to make a connector in that API rather than modifying
Blob?

> -- A*
>
> [1]
> http://download.oracle.com/javase/1.4.2/docs/api/java/io/FileInputStream.html
> [2] http://www.whatwg.org/specs/web-apps/current-work/#stream-api
>

Received on Monday, 8 August 2011 13:17:20 UTC