- From: Maciej Stachowiak <mjs@apple.com>
- Date: Sun, 11 May 2008 18:46:31 -0700
- To: Aaron Boodman <aa@google.com>
- Cc: Chris Prince <cprince@google.com>, "Web API WG (public)" <public-webapi@w3.org>, Ian Hickson <ian@hixie.ch>
On May 11, 2008, at 6:01 PM, Aaron Boodman wrote:

> On Sun, May 11, 2008 at 5:46 PM, Maciej Stachowiak <mjs@apple.com> wrote:
>> Well, that depends on how good the OS buffer cache is at prefetching.
>> But in general, there would be some disk access.
>
> It seems better if the read API is just async for this case to prevent
> the problem.

It can't entirely prevent the problem. If you read a big enough chunk, it
will cause swapping, which hits the disk just as much as file reads.
Possibly more, because real file access will trigger OS prefetch
heuristics for linear access.

>>> I see what you mean for canvas, but not so much for XHR. It seems like
>>> a valid use case to want to be able to use XHR to download very large
>>> files. In that case, the thing you get back seems like it should have
>>> an async API for reading.
>>
>> Hmm? If you get the data over the network it goes into RAM. Why would
>> you want an async API to in-memory data? Or are you suggesting XHR
>> should be changed to spool its data to disk? I do not think that is
>> practical to do for all requests, so this would have to be a special
>> API mode for responses that are expected to be too big to fit in memory.
>
> Whether XHR spools to disk is an implementation detail, right? Right
> now XHR is not practical to use for downloading large files because
> the only way to access the result is as a string. Also because of
> this, XHR implementations don't bother spooling to disk. But if this
> API were added, then XHR implementations could be modified to start
> spooling to disk if the response got large. If the caller requests
> responseText, then the implementation just does the best it can to
> read the whole thing into a string and reply. But if the caller uses
> responseBlob (or whatever we call it) then it becomes practical to,
> for example, download movie files, modify them, then re-upload them.

That sounds reasonable for very large files like movies.
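The spooling behavior described above could be sketched roughly as follows. The `ResponseBuffer` class, its `threshold`, and the `spooledToDisk` flag are purely illustrative names for this sketch, not part of any proposed API:

```typescript
// Sketch: buffer response bytes in memory, and switch to disk-backed
// storage once the accumulated size crosses a threshold. A real XHR
// implementation would write the buffered chunks to a temp file at
// that point; here we only flip a flag to illustrate the decision.
class ResponseBuffer {
  private chunks: Uint8Array[] = [];
  private size = 0;
  spooledToDisk = false;

  constructor(private threshold: number) {}

  append(chunk: Uint8Array): void {
    this.chunks.push(chunk);
    this.size += chunk.length;
    if (!this.spooledToDisk && this.size > this.threshold) {
      // real implementation: move this.chunks to a temp file here
      this.spooledToDisk = true;
    }
  }

  get byteLength(): number {
    return this.size;
  }
}
```

Under this scheme, responseText remains a best-effort synchronous read of whatever has accumulated, while a responseBlob-style accessor would hand back a handle to the (possibly disk-backed) data without forcing it all into a string.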
However, audio and image files are similar in size to the kinds of text or
XML resources that are currently processed synchronously. In such cases
they are likely to remain in memory.

In general it is sounding like it might be desirable to have at least two
kinds of objects for representing binary data:

1) An in-memory, mutable representation with synchronous access. There
   should also be a copying API, possibly copy-on-write for the backing
   store.

2) A possibly disk-backed representation that offers only asynchronous
   read (possibly delivering the data in the form of representation #1).

Both representations could be used with APIs that can accept binary data;
in most cases such APIs only take strings currently. The name of
representation #2 may wish to tie it to being a file, since for anything
already in memory you'd want representation #1. Perhaps they could be
called ByteArray and File respectively.

Open question: can a File be stored in a SQL database? If so, does the
database store the data or a reference (such as a path or Mac OS X Alias)?

Regards,
Maciej
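The two representations proposed in the message above could be sketched as follows. The class names (`ByteArray`, `FileHandle`), the eager `copy()`, and the memory-backed "file" are illustrative assumptions for this sketch only; a real engine could implement copy-on-write and genuine disk I/O behind the same surface:

```typescript
// 1) In-memory, mutable, synchronous access, with a copying API.
//    copy() is eager here; an engine could defer it (copy-on-write).
class ByteArray {
  private bytes: Uint8Array;

  constructor(size: number) {
    this.bytes = new Uint8Array(size);
  }

  get(i: number): number { return this.bytes[i]; }
  set(i: number, v: number): void { this.bytes[i] = v; }
  get length(): number { return this.bytes.length; }

  copy(): ByteArray {
    const c = new ByteArray(this.bytes.length);
    c.bytes.set(this.bytes); // copy-on-write would defer this
    return c;
  }
}

// 2) Possibly disk-backed; only asynchronous reads, delivering the
//    data as a ByteArray (representation #1). Backed by memory here
//    purely for the demo; a real one would schedule disk I/O.
class FileHandle {
  constructor(private backing: Uint8Array) {}

  read(offset: number, length: number): Promise<ByteArray> {
    return new Promise(resolve => {
      const out = new ByteArray(length);
      for (let i = 0; i < length; i++) {
        out.set(i, this.backing[offset + i]);
      }
      resolve(out);
    });
  }
}
```

The split mirrors the discussion: synchronous mutation is only safe on data guaranteed to be in RAM, while anything that might live on disk must hand its bytes back asynchronously.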
Received on Monday, 12 May 2008 01:47:11 UTC