Re: IndexedDB: Thoughts on implementing IndexedDB from Joshua Bell on 2013-07-30 (public-webapps@w3.org from July to September 2013)

From: Joshua Bell <jsbell@google.com>
Date: Tue, 30 Jul 2013 15:13:59 -0700
To: Austin William Wright <aaa@bzfx.net>
Cc: "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <CAD649j5x3WZoHwjT0TBE1hQUrAouNFyoMkLHO2vxv0vSLc1_4Q@mail.gmail.com>

And now replying to the non-nits:


On Tue, Jul 30, 2013 at 1:30 AM, Austin William Wright <aaa@bzfx.net> wrote:

> I've been meaning to implement IndexedDB in some fashion for a while.
> Earlier this month, shortly after the call for implementations, I realized
> I should be getting on that. I've been working on an in-memory ECMAScript
> implementation with fast data structures and the like. I also intend to
> experiment with new features like new types of indexes (hash tables that
> can't be iterated, and index values calculated by expression/function,
> which appears to have been discussed elsewhere).
>
> I've had a few thoughts, mostly about language:
>
> (1) Is there no way to specify an arbitrary nested path? I want to do
> something like ['menus', x] where `x` is some token which may be anything,
> like an empty string or a string with a period in it. This is especially
> important if there are structures like {"http://example.com/URI":
> "value"} in documents, which is especially common in JSON-LD. From what I
> can tell, IndexedDB essentially makes it impossible to index JSON-LD
> documents.
>
> It appears the current behavior instead allows you to index by multiple
> keys, but it's not immediately obvious this is the rationale.
>
> How *would* one include a property whose key includes a period? This seems
> to be asking for security problems, if authors need to implement an
> escaping scheme for their keys, either when constructing a key path or when
> constructing objects. Database names can be anything, why not key names?
>
>
The key path mechanism (and by extension, the index mechanism) definitely
doesn't support every use case. It is focused on the "simple" case where
the structure of the data being stored is under the control of the
developer authoring code against the IDB API. Slap a library in the middle
that's exposing a radically different storage API to authors and that
library is going to need to compute index keys on its own and produce
wrapper objects, or some such.
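
To make the "compute index keys and produce wrapper objects" idea concrete, here is a minimal sketch of what such a library layer might do. It assumes the library is handed an explicit array of property names, so a name may be empty or contain a period. The names `resolvePath`, `wrapForIndex`, `indexKey` and `doc` are hypothetical and not part of the IDB API, and only string and number keys are handled for brevity.

```typescript
// Hypothetical helper layer; none of these names come from the Indexed DB spec.
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };

interface Wrapped {
  indexKey?: string | number; // omitted when the path does not yield a usable key
  doc: Json;                  // the caller's document, stored unchanged
}

// Walk an explicit list of property names. Because the names are never joined
// into a dotted string, a name such as "http://example.com/URI" or "" is fine.
function resolvePath(doc: Json, path: string[]): Json | undefined {
  let current: Json | undefined = doc;
  for (const name of path) {
    if (current === null || typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    if (!Object.prototype.hasOwnProperty.call(current, name)) {
      return undefined;
    }
    current = current[name];
  }
  return current;
}

// Simplification: only strings and non-NaN numbers are treated as keys here.
function wrapForIndex(doc: Json, path: string[]): Wrapped {
  const value = resolvePath(doc, path);
  if (typeof value === "string") {
    return { indexKey: value, doc };
  }
  if (typeof value === "number" && !Number.isNaN(value)) {
    return { indexKey: value, doc };
  }
  return { doc };
}
```

With wrappers like these, the store's index could use the plain key path "indexKey", so the document's own property names never need escaping. A wrapper without `indexKey` would simply not appear in that index.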

One of the ideas that's been talked about for "v2" is extensible indexing,
allowing the index key to be computed by a script function.


> (3) I had trouble deciphering the exact effect of multiple open
> transactions on one another. I eventually realized the definition of
> IDBTransactionMode describes the behavior.
>
> Still, however, this document appears to talk in terms of what is "written
> to the database". But this isn't well defined. If something is written to
> the database, wouldn't it affect what is read in a readonly transaction?
> (No.)
>
> And the language seems inconsistent. The language for `abort` says that
> changes to the database must be "rolled back" (as if every operation writes
> to storage), but the language for `Steps for committing a transaction`
> specifies it is at that time the data is written (as if all write
> operations up to this point are kept in memory). There's not strictly a
> contradiction here, but perhaps more neutral language could be used.
>
>
Agreed, this could be improved. (Practically speaking, I expect that would
happen if we end up with implementation differences that require refining
the language in a future iteration.)


> (5) I found the language for iterating and creating a Cursor hard to
> understand, being nested in multiple layers of algorithms; specifically,
> where an IDBCursor instance is actually exposed to the user. But now it
> makes sense, and I don't really see how it might be improved. An
> (informative) example on iterating a cursor may be helpful.
>
>
I recently added one towards the start of the spec ("The following example
looks up all books in the database by author using an index and a cursor")
- is that what you were thinking? Is it just a matter of spec organization?
I think at some point in the spec history the examples were more integrated
into the text.
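
For readers without the spec to hand, cursor iteration over an index has roughly the following shape. This is a sketch, not the spec's example verbatim. The store name "books", the index name "by_author" and the helper `forEachBookByAuthor` are assumptions modeled on the example described above. The `*Like` interfaces are a deliberately narrowed, locally declared approximation of the real interfaces, so the sketch type-checks outside a browser.

```typescript
// Narrowed, locally declared shapes (assumptions, not the real IDB typings).
interface CursorLike {
  value: unknown;
  continue(): void;
}
interface CursorRequestLike {
  result: CursorLike | null;
  onsuccess: ((this: any, ev: any) => any) | null;
}
interface DbLike {
  transaction(storeName: string, mode: "readonly"): {
    objectStore(name: string): {
      index(name: string): { openCursor(query: string): CursorRequestLike };
    };
  };
}

// Hypothetical helper: visit every record whose index key equals `author`.
function forEachBookByAuthor(
  db: DbLike,
  author: string,
  onBook: (book: unknown) => void,
  onDone: () => void,
): void {
  const tx = db.transaction("books", "readonly");
  const index = tx.objectStore("books").index("by_author");
  // Passing a key as the query restricts the cursor to that index key.
  const request = index.openCursor(author);
  request.onsuccess = () => {
    // This is the point where the cursor object is exposed to script.
    const cursor = request.result;
    if (cursor) {
      onBook(cursor.value);
      cursor.continue(); // the same success handler runs again for the next record
    } else {
      onDone(); // a null result means the cursor is exhausted
    }
  };
}
```

The single success handler runs once per record and once more with a null result, which is why the loop is written as a callback rather than a `for` statement.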


> (6) The document refers to the HTML5 Structured Clone Algorithm. It's a
> bit concerning that it has to refer to ECMAScript algorithms defined in a
> specification that defines a markup language. I don't think referring to a
> markup language should be necessary (I don't intend on using my
> implementation in an (X)HTML environment, just straight XML if anything at
> all), though perhaps this is just a modularity problem with the HTML5 draft
> (or rather, lack thereof).
>

Agreed that it seems like an odd place for it in the abstract, but the HTML
spec defines much of the behavior of the browser environment beyond the
markup language. Hixie and Anne are doing some spec refactoring work;
perhaps some day it will be more modular. Indexed DB is very much designed
to be an API for scripts running in Web browsers, though.


>
> Finally, is there a good test suite? I can't seem to find anything in the
> way of regression tests. I'll perhaps publish my own, if not.
>
>
> Austin Wright.
>


More tests welcome! The W3C has a test repo that Art has linked to in a
fork of this thread. Blink's tests are here:
http://src.chromium.org/viewvc/blink/trunk/LayoutTests/storage/indexeddb/
Received on Tuesday, 30 July 2013 22:14:27 UTC