- From: Jeremy Orlow <jorlow@chromium.org>
- Date: Thu, 10 Jun 2010 13:39:46 +0100
- To: Mikeal Rogers <mikeal.rogers@gmail.com>
- Cc: Webapps WG <public-webapps@w3.org>
- Message-ID: <AANLkTinmg4HBo4Wxg_-cPKaebGJmEQHpJRYR03_U-pNJ@mail.gmail.com>
Splitting into its own thread since this isn't really connected to the new
Async interface and that thread is already pretty big.

On Wed, Jun 9, 2010 at 10:36 PM, Mikeal Rogers <mikeal.rogers@gmail.com> wrote:

> I've been looking through the current spec and all the proposed changes.
>
> Great work. I'm going to be building a CouchDB compatible API on top of
> IndexedDB that can support peer-to-peer replication without other CouchDB
> instances.
>
> One of the things that will entail is a by-sequence index for all the
> changes in a given "database" (in my case a database will be scoped to
> more than one ObjectStore). In order to accomplish this I'll need to keep
> the last known sequence around so that each new write can create a new
> entry in the by-sequence index. The problem is that if another tab/window
> writes to the database, it'll increment that sequence and I won't be
> notified, so I would have to start every transaction with a check on the
> sequence index for the last sequence, which seems like a lot of extra
> cursor calls.
>

It would be a lot of extra calls, but I'm a bit hesitant to add much more
API surface area to v1, and the fallback plan doesn't seem too unreasonable.
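For concreteness, here's roughly what that fallback would look like: open a
reverse cursor over the by-sequence store to find the last sequence, then
write the new entry at the next one. This is only a sketch against the
async draft; the transaction call, the 'prev' direction, and the
"by-sequence" store name are illustrative, not settled API.

    // Sketch of the fallback: one extra cursor round-trip per write.
    // Assumes a "by-sequence" object store keyed by sequence number.
    function writeWithSequence(db, value, callback) {
      var tx = db.transaction(['by-sequence'], 'readwrite');
      var seqStore = tx.objectStore('by-sequence');

      // Walk in from the high end of the key space to read the last
      // known sequence number.
      var req = seqStore.openCursor(null, 'prev');
      req.onsuccess = function (event) {
        var cursor = event.target.result;
        var lastSeq = cursor ? cursor.key : 0;

        // Record this change at the next sequence number.
        seqStore.put(value, lastSeq + 1);
      };

      tx.oncomplete = function () { callback(null); };
      tx.onerror = function () { callback(tx.error); };
    }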
> What I really need is an event listener on an ObjectStore that fires
> after a transaction is committed to the store but before the next
> transaction is run, and that gives me information about the commits to
> the ObjectStore.
>
> Thoughts?
>

To do this, we could specify an IndexedDatabaseRequest.ontransactioncommitted
event that would be guaranteed to fire after every commit and before we
started the next transaction. I think that'd meet your needs and not add too
much additional surface area... What do others think?
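As a strawman, usage might look like the sketch below. Where the event
hangs and what it carries are entirely up for discussion; the
objectStoreNames payload and the refreshLastKnownSequence helper are made
up here purely for illustration.

    // Hypothetical: keep the page's last known sequence fresh as other
    // tabs/windows commit, instead of re-reading it every transaction.
    db.ontransactioncommitted = function (commitEvent) {
      // Assumed payload: an array of names of the object stores the
      // committed transaction touched. Nothing about this shape is
      // specified yet.
      if (commitEvent.objectStoreNames.indexOf('by-sequence') !== -1) {
        refreshLastKnownSequence(db); // hypothetical app-level helper
      }
    };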
> -Mikeal
>
> On Wed, Jun 9, 2010 at 11:40 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> > On Wed, Jun 9, 2010 at 7:25 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >>
> >> On Wed, Jun 9, 2010 at 7:42 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> >> > On Tue, May 18, 2010 at 8:34 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> >> >>
> >> >> On Tue, May 18, 2010 at 12:10 PM, Jeremy Orlow <jorlow@chromium.org>
> >> >> wrote:
> >> >> > I'm not sure I like the idea of offering sync cursors either, since
> >> >> > the UA will either need to load everything into memory before
> >> >> > starting or risk blocking on disk IO for large data sets. Thus I'm
> >> >> > not sure I support the idea of synchronous cursors. But, at the
> >> >> > same time, I'm concerned about the overhead of firing one event per
> >> >> > value with async cursors. Which is why I was suggesting an
> >> >> > interface where the common case (the data is in memory) is done
> >> >> > synchronously but the uncommon case (we'd block if we had to
> >> >> > respond synchronously) has to be handled, since we guarantee that
> >> >> > the first time will be forced to be asynchronous.
> >> >> > Like I said, I'm not super happy with what I proposed, but I think
> >> >> > some hybrid async/sync interface is really what we need. Have you
> >> >> > guys spent any time thinking about something like this? How
> >> >> > dead-set are you on synchronous cursors?
> >> >>
> >> >> The idea is that synchronous cursors load all the required data into
> >> >> memory, yes. I think it would help authors a lot to be able to load
> >> >> small chunks of data into memory and read and write to it
> >> >> synchronously. Dealing with asynchronous operations constantly is
> >> >> certainly possible, but a bit of a pain for authors.
> >> >>
> >> >> I don't think we should obsess too much about not keeping things in
> >> >> memory; we already have things like canvas and the DOM, which add up
> >> >> to non-trivial amounts of memory.
> >> >>
> >> >> Just because data is loaded from a database doesn't mean it's huge.
> >> >>
> >> >> I do note that you're not as concerned about getAll(), which actually
> >> >> has worse memory characteristics than synchronous cursors, since you
> >> >> need to create the full JS object graph in memory.
> >> >
> >> > I've been thinking about this off and on since the original proposal
> >> > was made, and I just don't feel right about getAll() or synchronous
> >> > cursors. You make some good points about there already being many
> >> > ways to overwhelm RAM with web APIs, but is there any place we make
> >> > it so easy? You're right that just because it's a database doesn't
> >> > mean it needs to be huge, but oftentimes they can get quite big. And
> >> > if a developer doesn't spend time making sure they test their app
> >> > with the upper ends of what users may possibly see, it just seems
> >> > like this is a recipe for problems.
> >> > Here's a concrete example: structured clone allows you to store
> >> > image data. Let's say I'm building an image hosting site and that I
> >> > cache all the images along with their thumbnails locally in an
> >> > IndexedDB entity store. Let's say each thumbnail is a trivial
> >> > amount, but each image is 1MB. I have an album with 1000 images. I
> >> > do |var photos = albumIndex.getAllObjects(albumName);| and then
> >> > iterate over that to get the thumbnails. But I've just loaded over
> >> > 1GB of stuff into RAM (assuming no additional inefficiency/blowup).
> >> > I suppose it's possible JavaScript engines could build mechanisms to
> >> > fetch this stuff lazily (like you could even with a synchronous
> >> > cursor), but that will take time/effort and introduce lag in the
> >> > page (while fetching additional info from disk).
> >> >
> >> > I'm not completely against the idea of getAll/sync cursors, but I do
> >> > think they should be de-coupled from this proposed API. I would also
> >> > suggest that we re-consider them only after at least one
> >> > implementation has normal cursors working and there's been some
> >> > experimentation with it. Until then, we're basing most of our
> >> > arguments on intuition and assumptions.
> >>
> >> I'm not married to the concept of sync cursors. However, I pretty
> >> strongly feel that getAll is something we need. If we just allow
> >> cursors for getting multiple results, I think we'll see an extremely
> >> common pattern of people using a cursor to loop through a result set
> >> and put values into an array.
> >>
> >> Yes, it can be misused, but I don't see a reason why people wouldn't
> >> misuse a cursor just as much. If they don't think about the fact that
> >> a range contains lots of data when using getAll, why would they think
> >> about it when using cursors?
> >
> > Once again, I feel like there is a lot of speculation (more than
> > normal) happening here. I'd prefer we take the Async API without the
> > sync cursors or getAll and give the rest of the API some time to bake
> > before considering it again. Ideally by then we'd have at least one or
> > two early adopters that can give their perspective on the issue.
> >
> > J
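P.S. For concreteness on the getAll point: the cursor-accumulation pattern
Jonas predicts ends up holding the same ~1GB in memory as the getAll call
in my album example; it just gets there one event at a time. A rough
sketch (the cursor and key-range calls follow the draft and are
illustrative; loadAlbum is a made-up helper):

    // Accumulating a whole album through a cursor: same end state as
    // getAll(), with one event fired per value along the way.
    function loadAlbum(albumIndex, albumName, callback) {
      var photos = [];
      var req = albumIndex.openCursor(IDBKeyRange.only(albumName));
      req.onsuccess = function (event) {
        var cursor = event.target.result;
        if (cursor) {
          photos.push(cursor.value); // each value may carry ~1MB of image data
          cursor.continue();         // 1000 images -> ~1GB held in photos[]
        } else {
          callback(photos);          // everything is now resident in memory
        }
      };
    }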
Received on Thursday, 10 June 2010 12:40:38 UTC