Re: XMLHttpRequest.responseBlob from Darin Fisher on 2010-04-29 (public-webapps@w3.org from April to June 2010)

From: Darin Fisher <darin@chromium.org>
Date: Thu, 29 Apr 2010 15:04:55 -0700
To: Eric Uhrhane <ericu@google.com>
Cc: Michael Nordman <michaeln@google.com>, Jonas Sicking <jonas@sicking.cc>, Web Applications Working Group WG <public-webapps@w3.org>
Message-ID: <z2zbd8f24d21004291504g1eab65b0hf915bd60d971f21b@mail.gmail.com>
On Wed, Apr 28, 2010 at 2:30 PM, Eric Uhrhane <ericu@google.com> wrote:

> On Wed, Apr 28, 2010 at 12:45 PM, Darin Fisher <darin@chromium.org> wrote:
> > On Wed, Apr 28, 2010 at 11:57 AM, Michael Nordman <michaeln@google.com>
> > wrote:
> >>
> >>
> >> On Wed, Apr 28, 2010 at 11:21 AM, Jonas Sicking <jonas@sicking.cc>
> >> wrote:
> >>>
> >>> Ugh, sent this originally to just Darin. Resending to the list.
> >>>
> >>> On Wed, Apr 28, 2010 at 10:11 AM, Darin Fisher <darin@chromium.org>
> >>> wrote:
> >>> > On Tue, Apr 27, 2010 at 2:04 PM, Jonas Sicking <jonas@sicking.cc>
> >>> > wrote:
> >>> >>
> >>> >> On Tue, Apr 27, 2010 at 1:59 PM, Darin Fisher <darin@chromium.org>
> >>> >> wrote:
> >>> >> > On Tue, Apr 27, 2010 at 1:33 PM, Jonas Sicking <jonas@sicking.cc>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> On Tue, Apr 27, 2010 at 1:26 PM, Darin Fisher
> >>> >> >> <darin@chromium.org> wrote:
> >>> >> >> >> It would be nice to be able to allow streaming such that every
> >>> >> >> >> time a progress event is fired only the newly downloaded data
> >>> >> >> >> is available. The UA is then free to throw away that data once
> >>> >> >> >> the event is done firing. This would be useful in the cases
> >>> >> >> >> when the page is able to do incremental parsing of the
> >>> >> >> >> resulting document.
> >>> >> >> >>
> >>> >> >> >> If we add a 'load mode' flag on XMLHttpRequest, which can't be
> >>> >> >> >> modified after send() is called, then streaming to a Blob
> >>> >> >> >> could simply be another enum value for such a flag.
> >>> >> >> >>
> >>> >> >> >> There is still the problem of how the actual blob works. I.e.
> >>> >> >> >> does .responseBlob return a new blob every time more data is
> >>> >> >> >> returned? Or should the same Blob be constantly modified? If
> >>> >> >> >> modified, what happens to any in-progress reads when the file
> >>> >> >> >> is modified? Or do you just make the Blob available once the
> >>> >> >> >> whole resource has been downloaded?
> >>> >> >> >>
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > This is why I suggested using FileWriter.  FileWriter already
> >>> >> >> > has to deal with most of the problems you mentioned above,
> >>> >> >>
> >>> >> >> Actually, as far as I can tell FileWriter is write-only so it
> >>> >> >> doesn't deal with any of the problems above.
> >>> >> >
> >>> >> > When you use createWriter, you are creating a FileWriter to an
> >>> >> > existing File. The user could attempt to create a FileReader to
> >>> >> > the very same File while a FileWriter is open to it.
> >>> >> > It is true that for <input type=saveas> there is no way to get at
> >>> >> > the underlying File object.  That is perhaps a good thing for the
> >>> >> > use case of downloading to a location specified by the user.
> >>> >>
> >>> >> Ah. But as far as I can tell (and remember), it's still fairly
> >>> >> undefined what happens when the OS file under a File/Blob object is
> >>> >> mutated.
> >>> >>
> >>> >> / Jonas
> >>> >
> >>> > Agreed.  I don't see it as a big problem.  Do you?  The application
> >>> > developer is in control.  They get to specify the output file (via
> >>> > FileWriter) that XHR sends its output to, and they get to know when
> >>> > XHR is done writing.  So, the application developer can avoid
> >>> > reading from the file until XHR is done writing.
> >>>
> >>> Well, it seems like a bigger deal here since the file is being
> >>> constantly modified as we're downloading data into it, no? So for
> >>> example if you grab a File object after the first progress event, what
> >>> does that File object contain after the second? Does it contain the
> >>> whole file, including the newly downloaded data? Or does it contain
> >>> only the data after the first progress event? Or is the File object
> >>> now invalid and can't be used?
> >>
> >> What Gears did about that was to provide a 'snapshot' of the
> >> downloaded data each time responseBlob was called, with
> >> the 'snapshot' being consistent with the progress events
> >> having been seen by the caller. The 'snapshot' would remain
> >> valid until discarded by the caller. Each snapshot just provided
> >> a view onto the same data which maybe was in memory or
> >> maybe had spilled over to disk unbeknownst to the caller.
> >>
> >>>
> >>> I'm also still unsure that a FileWriter is what you want generally. If
> >>> you're just downloading temporary data, but data that happens to be so
> >>> large that you don't want to keep it in memory, you don't want to
> >>> bother the user asking for a location for that temporary file. Nor do
> >>> you want that file to be around once the user leaves the page.
> >>
> >>
> >> I think the point about not requiring the caller to manage the 'file'
> >> is important.
> >>
> >>>
> >>> Sure, if the use case is actually downloading and saving a file for
> >>> the user to use, rather than for the page to use, then a FileWriter
> >>> seems like it would work. I.e. if you want something like
> >>> "Content-Disposition: attachment", but where you can specify request
> >>> headers. Is that the use case?
> >>
> >> Mods to xhr to access the response more opaquely are a fairly general
> >> feature request. One specific use case is to download a resource via xhr
> >> and then save the results in a sandboxed file system. So "for the page
> >> to use".
> >
> > ^^^ That is the use case I'm primarily interested in.
>
> I think there are a couple of important use cases here, and FileWriter
> really only works for one of them.  It would work fine for a sandboxed
> filesystem, as you say.  However, if you just want to get a chunk of
> binary data from the server, and don't want to manage its lifetime [or
> don't have permission to use the filesystem API, or are on a browser
> that doesn't support it], this won't work.
>

My thinking was that we would still have the responseBody getter that
makes available a ByteArray object.



>
> If we just present a File or Blob to the user, they can get access to
> the data without worrying about where it's stored, whether it's in
> memory or on disk, and without having to clean it up or get any kind
> of permission.  If they want to copy it into their sandboxed
> filesystem, they can do that using the filesystem API.
>

That copy step seems suboptimal for large files.  Can we eliminate it?

-Darin



>
> > I think it would be beneficial if downloading to disk was not
> > rate-limited by routing chunks through JS.
>
> +1, although the streaming API might be a nice addition.
>
> > I don't care as much about downloading to a user specified location, but
> > I think that's an interesting use case as well.  XHR gives the app more
> > flexibility (custom headers, cross-origin, etc.), and FileWriter allows
> > the app to "save as" URLs that do not have a C-D header that forces a
> > download.
> >
> >>
> >> The notion of having a streaming interface on xhr is interesting. That
> >> with a BlobBuilder capability could work. If a streaming xhr mode
> >> provided new data in the form of 'blobs' where each blob was just the
> >> newly received data, the caller could use a BlobBuilder instance to
> >> concatenate the set of received data blobs. And then take
> >> blobBuilder.getBlob() and do what they will with it.
> >>   xhr.ondatareceived = function (data) { builder.appendBlob(data); }
> >
> > ^^^ I like that proposal for streaming.
> > -Darin
>
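
For readers following the streaming idea quoted above (each event delivers
only the newly received data, and a builder concatenates the pieces), here is
a rough sketch of that pattern. It is only an illustration under stated
assumptions: ChunkBuilder and FakeStreamingRequest are hypothetical stand-ins
made up for this example, not part of XMLHttpRequest or any File API draft,
and Uint8Array is used in place of Blob so the sketch runs without a browser.
Only the ondatareceived handler name comes from the thread.

```javascript
// Hypothetical stand-in for the proposed BlobBuilder: it collects chunks
// and concatenates them on demand.
class ChunkBuilder {
  constructor() {
    this.chunks = [];
    this.length = 0;
  }

  // Remember one newly received chunk (a Uint8Array).
  append(chunk) {
    this.chunks.push(chunk);
    this.length += chunk.length;
  }

  // Return one contiguous copy of everything appended so far.
  getBytes() {
    const out = new Uint8Array(this.length);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    return out;
  }
}

// Hypothetical stand-in for an XHR in the proposed streaming mode: each
// deliver() call hands the handler only the newly received bytes and keeps
// no copy of its own.
class FakeStreamingRequest {
  constructor() {
    this.ondatareceived = null;
  }

  deliver(chunk) {
    if (typeof this.ondatareceived === "function") {
      this.ondatareceived(chunk);
    }
  }
}

const builder = new ChunkBuilder();
const request = new FakeStreamingRequest();

// Same shape as the handler suggested in the thread.
request.ondatareceived = function (data) { builder.append(data); };

// Simulate two network deliveries.
request.deliver(Uint8Array.of(1, 2, 3));
request.deliver(Uint8Array.of(4, 5));

const result = builder.getBytes();
console.log(Array.from(result)); // the five bytes 1..5, in arrival order
```

In this shape the request object never holds the full response; only the
caller's builder does, which is what would let the UA discard each chunk once
the event has fired.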
Received on Thursday, 29 April 2010 22:05:35 UTC