- From: Eric Uhrhane <ericu@google.com>
- Date: Thu, 29 Apr 2010 15:46:42 -0700
- To: Darin Fisher <darin@chromium.org>
- Cc: Michael Nordman <michaeln@google.com>, Jonas Sicking <jonas@sicking.cc>, Web Applications Working Group WG <public-webapps@w3.org>
On Thu, Apr 29, 2010 at 3:35 PM, Darin Fisher <darin@chromium.org> wrote:
> On Thu, Apr 29, 2010 at 3:24 PM, Eric Uhrhane <ericu@google.com> wrote:
>>
>> On Thu, Apr 29, 2010 at 3:04 PM, Darin Fisher <darin@chromium.org> wrote:
>> >
>> > On Wed, Apr 28, 2010 at 2:30 PM, Eric Uhrhane <ericu@google.com> wrote:
>> >>
>> >> On Wed, Apr 28, 2010 at 12:45 PM, Darin Fisher <darin@chromium.org> wrote:
>> >> > On Wed, Apr 28, 2010 at 11:57 AM, Michael Nordman <michaeln@google.com> wrote:
>> >> >>
>> >> >> On Wed, Apr 28, 2010 at 11:21 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>> >> >>>
>> >> >>> Ugh, sent this originally to just Darin. Resending to the list.
>> >> >>>
>> >> >>> On Wed, Apr 28, 2010 at 10:11 AM, Darin Fisher <darin@chromium.org> wrote:
>> >> >>> > On Tue, Apr 27, 2010 at 2:04 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> >> >>> >>
>> >> >>> >> On Tue, Apr 27, 2010 at 1:59 PM, Darin Fisher <darin@chromium.org> wrote:
>> >> >>> >> > On Tue, Apr 27, 2010 at 1:33 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> >> >>> >> >>
>> >> >>> >> >> On Tue, Apr 27, 2010 at 1:26 PM, Darin Fisher <darin@chromium.org> wrote:
>> >> >>> >> >> >> It would be nice to be able to allow streaming such that every time a
>> >> >>> >> >> >> progress event is fired only the newly downloaded data is available.
>> >> >>> >> >> >> The UA is then free to throw away that data once the event is done
>> >> >>> >> >> >> firing. This would be useful in the cases when the page is able to do
>> >> >>> >> >> >> incremental parsing of the resulting document.
>> >> >>> >> >> >>
>> >> >>> >> >> >> If we add a 'load mode' flag on XMLHttpRequest, which can't be modified
>> >> >>> >> >> >> after send() is called, then streaming to a Blob could simply be
>> >> >>> >> >> >> another enum value for such a flag.
>> >> >>> >> >> >>
>> >> >>> >> >> >> There is still the problem of how the actual blob works. I.e. does
>> >> >>> >> >> >> .responseBlob return a new blob every time more data is returned? Or
>> >> >>> >> >> >> should the same Blob be constantly modifying? If modifying, what
>> >> >>> >> >> >> happens to any in-progress reads when the file is modified? Or do you
>> >> >>> >> >> >> just make the Blob available once the whole resource has been
>> >> >>> >> >> >> downloaded?
>> >> >>> >> >> >>
>> >> >>> >> >> >
>> >> >>> >> >> > This is why I suggested using FileWriter. FileWriter already has to
>> >> >>> >> >> > deal with most of the problems you mentioned above,
>> >> >>> >> >>
>> >> >>> >> >> Actually, as far as I can tell FileWriter is write-only so it doesn't
>> >> >>> >> >> deal with any of the problems above.
>> >> >>> >> >
>> >> >>> >> > When you use createWriter, you are creating a FileWriter to an existing
>> >> >>> >> > File.
>> >> >>> >> > The user could attempt to create a FileReader to the very same File while
>> >> >>> >> > a FileWriter is open to it.
>> >> >>> >> > It is true that for <input type=saveas> there is no way to get at the
>> >> >>> >> > underlying File object.
>> >> >>> >> > That is perhaps a good thing for the use case of downloading to a
>> >> >>> >> > location specified by the user.
>> >> >>> >>
>> >> >>> >> Ah. But as far as I can tell (and remember), it's still fairly undefined
>> >> >>> >> what happens when the OS file under a File/Blob object is mutated.
>> >> >>> >>
>> >> >>> >> / Jonas
>> >> >>> >
>> >> >>> > Agreed. I don't see it as a big problem. Do you? The application developer
>> >> >>> > is in control. They get to specify the output file (via FileWriter) that
>> >> >>> > XHR sends its output to, and they get to know when XHR is done writing.
>> >> >>> > So, the application developer can avoid reading from the file until XHR is
>> >> >>> > done writing.
>> >> >>>
>> >> >>> Well, it seems like a bigger deal here since the file is being constantly
>> >> >>> modified as we're downloading data into it, no? So for example if you grab a
>> >> >>> File object after the first progress event, what does that File object
>> >> >>> contain after the second? Does it contain the whole file, including the
>> >> >>> newly downloaded data? Or does it contain only the data after the first
>> >> >>> progress event? Or is the File object now invalid and can't be used?
>> >> >>
>> >> >> What gears did about that was to provide a 'snapshot' of the downloaded data
>> >> >> each time responseBlob was called, with the 'snapshot' being consistent with
>> >> >> the progress events having been seen by the caller. The 'snapshot' would
>> >> >> remain valid until discarded by the caller. Each snapshot just provided a
>> >> >> view onto the same data which maybe was in memory or maybe had spilled over
>> >> >> to disk unbeknownst to the caller.
>> >> >>
>> >> >>>
>> >> >>> I'm also still unsure that a FileWriter is what you want generally.
>> >> >>> If you're just downloading temporary data, but data that happens to be so
>> >> >>> large that you don't want to keep it in memory, you don't want to bother the
>> >> >>> user asking for a location for that temporary file. Nor do you want that
>> >> >>> file to be around once the user leaves the page.
>> >> >>
>> >> >> I think the point about not requiring the caller to manage the 'file' are
>> >> >> important.
>> >> >>
>> >> >>>
>> >> >>> Sure, if the use case is actually downloading and saving a file for the user
>> >> >>> to use, rather than for the page to use, then a FileWriter seems like it
>> >> >>> would work. I.e. if you want something like "Content-Disposition:
>> >> >>> attachment", but where you can specify request headers. Is that the use
>> >> >>> case?
>> >> >>
>> >> >> Mods to xhr to access the response more opaquely is a fairly general feature
>> >> >> request. One specific use case is to download a resource via xhr and then
>> >> >> save the results in a sandboxed file system. So "for the page to use".
>> >> >
>> >> > ^^^ That is the use case I'm primarily interested in.
>> >>
>> >> I think there are a couple of important use cases here, and FileWriter really
>> >> only works for one of them. It would work fine for a sandboxed filesystem, as
>> >> you say. However, if you just want to get a chunk of binary data from the
>> >> server, and don't want to manage its lifetime [or don't have permission to use
>> >> the filesystem API, or are on a browser that doesn't support it], this won't
>> >> work.
>> >
>> > My thinking was that we would still have the responseBody getter that makes
>> > available a ByteArray object.
>> >
>> >>
>> >> If we just present a File or Blob to the user, they can get access to the data
>> >> without worrying about where it's stored, whether it's in memory or on disk,
>> >> and without having to clean it up or get any kind of permission. If they want
>> >> to copy it into their sandboxed filesystem, they can do that using the
>> >> filesystem API.
>> >
>> > That copy step seems suboptimal for large files. Can we eliminate it?
>>
>> In case 1, the developer just wants the data, and doesn't want to manage it or
>> use the sandboxed FileSystem [1]. It's stored [temporarily] in some place
>> controlled by the browser, that the app can't freely browse.
>> In case 2, the developer wants to keep the data around in the FileSystem
>> indefinitely. Once it's there, it can be opened at will. We want to get it
>> there without an extra copy.
>>
>> Giving a FileWriter [2] to XHR doesn't handle case 1, since while that will
>> store the data for you, it doesn't give you read access. The closest read-write
>> primitive is FileEntry from FileSystem. If you grab a FileEntry from your
>> sandbox and give it to your XHR, that would work for case 2. For case 1, we'd
>> need something like mkTemp [3] that would create a FileEntry pointing at the
>> downloaded file. That seems a bit kludgy.
>>
>> If XHR has a File property that you can ask for, you could either supply it a
>> FileEntry before sending [in which case the File you got back would be that
>> FileEntry] or it could give you a new File that points into the browser cache
>> if you don't. How does that sound?
>>
>> Eric
>>
>> [1] http://dev.w3.org/2009/dap/file-system/file-dir-sys.html
>> [2] http://dev.w3.org/2009/dap/file-system/file-writer.html
>> [3] http://unixhelp.ed.ac.uk/CGI/man-cgi?mktemp
>
> It just seems to me that use case #1 can just be regarded as a specialization
> of use case #2.
> Given the File API's requestTemporaryFilesystem, it would be easy for the
> application to request that the file be stored in temporary space.
> They don't have to manage their temporary space, right?
> The details of using the sandboxed filesystem for use case #1 could be hidden
> within a JS library thereby making it an easy solution to deploy.

Yeah, that would work. It's a little more cumbersome, but as you say,
a library could clean that right up. Whether that's going to make a
mess or not depends on the specific implementation of
requestTemporaryFilesystem. If it always gives back the same
filesystem, such that it can be used for caching [the likely case],
then that library is going to be dropping files into a namespace the
developer's using. But a smart library will do that neatly and in a
way that is easy to clean up.

>> >> > I think it would be beneficial if downloading to disk was not rate-limited
>> >> > by routing chunks through JS.
>> >>
>> >> +1, although the streaming API might be a nice addition.
>> >>
>> >> > I don't care as much about downloading to a user specified location, but I
>> >> > think that's an interesting use case as well. XHR gives the app more
>> >> > flexibility (custom headers, cross-origin, etc.), and FileWriter allows the
>> >> > app to "save as" URLs that do not have a C-D header that forces a download.
>> >> >
>> >> >>
>> >> >> The notion of having a streaming interface on xhr is interesting. That with
>> >> >> a BlobBuilder capability could work. If a streaming xhr mode provided new
>> >> >> data in the form of 'blobs' where each blob was just the newly received
>> >> >> data, the caller could use a BlobBuilder instance to concatenate the set of
>> >> >> received data blobs. And then take blobBuilder.getBlob() and do what they
>> >> >> will with it.
>> >> >> xhr.ondatareceived = function (data) {
>> >> >>   builder.appendBlob(data);
>> >> >> }
>> >> >
>> >> > ^^^ I like that proposal for streaming.
>> >> > -Darin
>> >
>
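
For illustration only, here is a rough sketch of the streaming idea quoted above: each callback sees just the newly received data, and the caller concatenates the pieces itself. Nothing in it is a real browser API. `FakeStreamingRequest` and `ChunkBuilder` are hypothetical stand-ins for the proposed streaming XHR mode and for BlobBuilder, and plain `Uint8Array` chunks stand in for blobs so that the sketch runs synchronously.

```javascript
// Hypothetical stand-in for the BlobBuilder discussed in the thread:
// it collects chunks and concatenates them on demand.
class ChunkBuilder {
  constructor() {
    this.chunks = [];
    this.length = 0;
  }

  // Plays the role of the proposed builder.appendBlob(data).
  append(chunk) {
    this.chunks.push(chunk);
    this.length += chunk.length;
  }

  // Plays the role of the proposed blobBuilder.getBlob().
  getBytes() {
    const out = new Uint8Array(this.length);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    return out;
  }
}

// Hypothetical request object (not a real XMLHttpRequest). It hands each
// newly received chunk to ondatareceived exactly once and then drops it,
// so the caller is responsible for keeping whatever it needs.
class FakeStreamingRequest {
  constructor(chunks) {
    this.pending = chunks;
    this.ondatareceived = null;
    this.onload = null;
  }

  send() {
    for (const chunk of this.pending) {
      if (this.ondatareceived) this.ondatareceived(chunk);
    }
    this.pending = [];
    if (this.onload) this.onload();
  }
}

const encoder = new TextEncoder();
const builder = new ChunkBuilder();
const request = new FakeStreamingRequest([
  encoder.encode("hello, "),
  encoder.encode("world"),
]);

let result = null;
request.ondatareceived = function (data) {
  builder.append(data);
};
request.onload = function () {
  result = new TextDecoder().decode(builder.getBytes());
};
request.send();
console.log(result); // prints "hello, world"
```

A caller that can parse incrementally would skip the builder and process each chunk inside `ondatareceived`, which is the case where the UA could discard the data once the event has fired.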
Received on Thursday, 29 April 2010 22:47:30 UTC