Re: [IndexedDB] Proposal for async API changes

On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow <jorlow@chromium.org>
> wrote:
> > I'm not sure I like the idea of offering sync cursors either, since the
> > UA will either need to load everything into memory before starting or
> > risk blocking on disk IO for large data sets.  But, at the same time,
> > I'm concerned about the overhead of firing one event per value with
> > async cursors.  That's why I was suggesting an interface where the
> > common case (the data is in memory) is handled synchronously, but the
> > uncommon case (where we'd block if we had to respond synchronously)
> > still has to be handled, since we guarantee that the first read is
> > always forced to be asynchronous.
> > Like I said, I'm not super happy with what I proposed, but I think some
> > hybrid async/sync interface is really what we need.  Have you guys spent
> > any time thinking about something like this?  How dead-set are you on
> > synchronous cursors?
>
> The idea is that synchronous cursors load all the required data into
> memory, yes. I think it would help authors a lot to be able to load
> small chunks of data into memory and read and write them
> synchronously. Dealing with asynchronous operations constantly is
> certainly possible, but it's a bit of a pain for authors.
>
> I don't think we should obsess too much about not keeping things in
> memory; we already have things like canvas and the DOM, which add up
> to non-trivial amounts of memory.
>
> Just because data is loaded from a database doesn't mean it's huge.
>
> I do note that you're not as concerned about getAll(), which actually
> has worse memory characteristics than synchronous cursors, since you
> need to create the full JS object graph in memory.
>
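
To make sure we're all picturing the same thing, here's a rough sketch of
the hybrid interface I was describing above.  The names (|next()|,
|onavailable|) are purely illustrative, not a concrete proposal:

  var cursor = objectStore.openCursor(someRange);
  cursor.onavailable = function () {
    // Fires whenever data wasn't already in memory and has since been
    // read from disk, including the first batch, which is guaranteed
    // to be delivered asynchronously.
    pump();
  };
  function pump() {
    var value;
    // next() hands back a value synchronously when it's already in
    // memory, and undefined when reading would block on disk IO; in
    // that case onavailable fires later and we resume here.
    while ((value = cursor.next()) !== undefined) {
      process(value);
    }
  }

The point is that, after the first (forced-async) hop, a loop over data
that's already in memory never has to yield back to the event loop.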

I've been thinking about this off and on since the original proposal was
made, and I still don't feel right about getAll() or synchronous cursors.
You make some good points about there already being plenty of ways to
overwhelm RAM with web APIs, but is there anywhere else we make it quite
this easy?  You're right that just because it's a database doesn't mean it
needs to be huge, but oftentimes databases do get quite big.  And if a
developer doesn't test their app against the upper end of what users might
realistically store, this seems like a recipe for problems.

Here's a concrete example: structured clone allows you to store image data.
Let's say I'm building an image hosting site and I cache all the images
along with their thumbnails locally in an IndexedDB entity store.  Let's
say each thumbnail is a trivial amount of data, but each image is 1MB.  I
have an album with 1000 images.  I do
|var photos = albumIndex.getAllObjects(albumName);| and then iterate over
that to get the thumbnails.  But I've just loaded about 1GB of stuff into
RAM (assuming no additional inefficiency/blowup).  I suppose JavaScript
engines could build mechanisms to fetch this stuff lazily (as they could
even with a synchronous cursor), but that would take time/effort and would
introduce lag in the page while fetching the additional data from disk.
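
For contrast, here's roughly what that same page could do with a plain
async cursor.  I'm approximating the method names and event plumbing from
the current proposal, and |showThumbnail| is just a stand-in:

  var request = albumIndex.openCursor(albumName);
  request.onsuccess = function (event) {
    var cursor = event.result;  // exact event shape is approximate
    if (!cursor) {
      return;  // no more records in this album
    }
    // Only this one record is materialized in memory; the 1MB image
    // stays on disk until something actually asks for it.
    showThumbnail(cursor.value.thumbnail);
    cursor.continue();  // asynchronously fetch the next record
  };

That's one event per record, which is exactly the overhead I was
complaining about earlier, but at least memory use stays flat no matter
how big the album gets.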


I'm not completely against the idea of getAll()/sync cursors, but I do
think they should be decoupled from this proposed API.  I would also
suggest that we reconsider them only after at least one implementation has
normal cursors working and there's been some real experimentation with
them.  Until then, we're basing most of our arguments on intuition and
assumptions.

J

Received on Wednesday, 9 June 2010 14:43:39 UTC