Re: Replacing WebSQL with a Relational Data Model. from Keean Schupke on 2010-10-26 (public-webapps@w3.org from October to December 2010)

From: Keean Schupke <keean@fry-it.com>
Date: Tue, 26 Oct 2010 22:41:40 +0100
To: Jeremy Orlow <jorlow@chromium.org>
Cc: "Tab Atkins Jr." <jackalmage@gmail.com>, nathan@webr3.org, public-webapps@w3.org, Jonas Sicking <jonas@sicking.cc>, Arthur Barstow <art.barstow@nokia.com>
Message-ID: <AANLkTikx+EdPs9HreRk=Y1WHz-GyTf=ZB97XanwL2OYc@mail.gmail.com>
Fractal Tree is infact a trade-mark of Tokutek, but here is the original
research before they came up with the name:

http://people.csail.mit.edu/jfineman/sbtree.pdf

Where they refer to them as "Shuttle Trees" or Cache-oblivious streaming
b-trees.

Cache oblivious data structures and algorithms are an interesting area, as
it can be shown that what we think are the fastest algorithms (quick-sort,
b-tree etc...) are not due to the action of the cache. Cache aware
algorithms try and optimise for a specific cache size. Cache oblivious
algorithms optimise for _all_ cache sizes. The cache-oblivious sort (that is
faster on real hardware than a quick-sort) is called a "funnel sort".


Cheers,
Keean.


On 26 October 2010 21:06, Jeremy Orlow <jorlow@chromium.org> wrote:

> On Tue, Oct 26, 2010 at 8:04 PM, Keean Schupke <keean@fry-it.com> wrote:
>
>> Hi,
>>
>>
>>> Likewise, personally all data I use is either in graph or k/v format, and
>>> have found (like many others), that primary speed gains come from changing
>>> the underlying data model rather than trying to map forwards and backwards
>>> between objects, graphs and tabular data in a relational data model, via
>>> SQL.
>>>
>>>
>>
>> If we see HTML5 as a replacement for native mobile apps, and desktop apps,
>> then we have to look at what applications require in terms of data storage
>> not websites. SQL is very widely used, and that must tell us something.
>> Nearly all our apps use it.
>>
>> Lets think about what is currently done on the server. Lots of server apps
>> use a relational database. With HTML5 we would like to move as much of that
>> work as possible to the client - so that it uses the client's CPU not the
>> servers CPU. The HTML5 app will keep a local cache of some of the server
>> database for offline use. For example a calendar app that keeps the next
>> months appointments in the client, and allows new meetings to be created
>> offline. There may be many ways you wish to seach and cross reference the
>> information, for example you may have a contacts table and an appointments
>> table. You may use a join to allow the contacts phone number to be displayed
>> in each appointment. Searching, sorting, grouping, all things you will want
>> to do.
>>
>
> On the other hand, what's the first thing most app developers do that use
> SQL as a backend?  They use some library that wraps SQL and makes it more
> object oriented.  IndexedDB skips this step.  There are definitely pros and
> cons to both models.
>
> The problem is the proposed data model is a subset of the functionality
>> required. The relational model is a complete algebra (hence the term
>> relationally complete). With a non-relationally complete storage API there
>> will always be limitations to what it can do.
>>
>
> You can implement a relational model on top of IndexedDB, so I'm not sure
> what limitations you're talking about except performance.  And this is
> something we fully expect we'll need to work on for some time, but I don't
> see any fundamental reason performance issues won't be solved since we can
> add to the API as we need (while keeping the surface area as small as is
> practical!).
>
> What I would like to see is the relational API standardised (maybe in the
>> next version of IndexedDB, maybe as a separate standard), so there is a
>> relationally complete API available, that browser implementers can implement
>> using a relational storage engine like SQLite directly, or can be emulated
>> by an implementation on top of IndexedDB where necessary.
>>
>
> We will likely implement something, but it'll hopefully be fairly light
> weight.  We shouldn't get too ahead of ourselves in speculating about
> optimizations until people actually start building real world apps against
> IndexedDB though.  I'm hopeful this will start soon as both Mozilla and
> WebKit/Chromium's implementations are becoming closer and closer to fully
> implementing the spec.
>
>
> On Tue, Oct 26, 2010 at 8:54 PM, Tab Atkins Jr. <jackalmage@gmail.com>wrote:
>
>> On Tue, Oct 26, 2010 at 12:04 PM, Keean Schupke <keean@fry-it.com> wrote:
>> > Take Firefox for example, it implements IndexedDB using SQLite
>> apparently.
>> > So implementing a relational API if we have to talk to IndexedDB that
>> means
>> > we have to convert from the relational data model to an object model and
>> > then back to a relational model for SQLite. So what I would like to do
>> is
>> > punch through that excess layer in the middle and have the relational
>> API
>> > talk directly to SQLite in the browser implementation. How could you
>> argue
>> > that having an unnecessary middle layer is a good thing?
>>
>> The SQLite back-end used by Firefox's implementation of IndexedDB (and
>> Chrome's, for the moment) is unnecessary; at least in Chrome's case,
>> we used a SQLite backend only because it was expedient and the code
>> was there.  We'll be changing it to a better backend in the future,
>> and I suspect that Firefox will do the same in time.
>>
>> The middle layer isn't unnecessary, *it's the whole point*.  The
>> back-end shouldn't ever be exposed directly - you don't want your code
>> to break if we drop the SQLite backend and switch to a direct
>> b-tree-based backend.
>>
>
> ...or a fractal tree.  :-P  (Still would love pointers to open
> source implementations.)
>
> J
>
Received on Tuesday, 26 October 2010 21:42:13 UTC