Re: Replacing WebSQL with a Relational Data Model. from Keean Schupke on 2010-10-26 (public-webapps@w3.org from October to December 2010)

From: Keean Schupke <keean@fry-it.com>
Date: Tue, 26 Oct 2010 22:46:24 +0100
To: Jeremy Orlow <jorlow@chromium.org>
Cc: "Tab Atkins Jr." <jackalmage@gmail.com>, nathan@webr3.org, public-webapps@w3.org, Jonas Sicking <jonas@sicking.cc>, Arthur Barstow <art.barstow@nokia.com>
Message-ID: <AANLkTikgwmBvWRukQ3G92k_J26HgDM+fJ3kzLw2-vJ5i@mail.gmail.com>
Actually the "fractal tree" is the COLA structure from the second part of
the paper. The description sounds right for the algorithm, and the O(log N /
B) cost per insertion also matches.


Cheers,
Keean.


On 26 October 2010 22:41, Keean Schupke <keean@fry-it.com> wrote:

> Fractal Tree is infact a trade-mark of Tokutek, but here is the original
> research before they came up with the name:
>
> http://people.csail.mit.edu/jfineman/sbtree.pdf
>
> Where they refer to them as "Shuttle Trees" or Cache-oblivious streaming
> b-trees.
>
> Cache oblivious data structures and algorithms are an interesting area, as
> it can be shown that what we think are the fastest algorithms (quick-sort,
> b-tree etc...) are not due to the action of the cache. Cache aware
> algorithms try and optimise for a specific cache size. Cache oblivious
> algorithms optimise for _all_ cache sizes. The cache-oblivious sort (that is
> faster on real hardware than a quick-sort) is called a "funnel sort".
>
>
> Cheers,
> Keean.
>
>
> On 26 October 2010 21:06, Jeremy Orlow <jorlow@chromium.org> wrote:
>
>> On Tue, Oct 26, 2010 at 8:04 PM, Keean Schupke <keean@fry-it.com> wrote:
>>
>>> Hi,
>>>
>>>
>>>> Likewise, personally all data I use is either in graph or k/v format,
>>>> and have found (like many others), that primary speed gains come from
>>>> changing the underlying data model rather than trying to map forwards and
>>>> backwards between objects, graphs and tabular data in a relational data
>>>> model, via SQL.
>>>>
>>>>
>>>
>>> If we see HTML5 as a replacement for native mobile apps, and desktop
>>> apps, then we have to look at what applications require in terms of data
>>> storage not websites. SQL is very widely used, and that must tell us
>>> something. Nearly all our apps use it.
>>>
>>> Lets think about what is currently done on the server. Lots of server
>>> apps use a relational database. With HTML5 we would like to move as much of
>>> that work as possible to the client - so that it uses the client's CPU not
>>> the servers CPU. The HTML5 app will keep a local cache of some of the server
>>> database for offline use. For example a calendar app that keeps the next
>>> months appointments in the client, and allows new meetings to be created
>>> offline. There may be many ways you wish to seach and cross reference the
>>> information, for example you may have a contacts table and an appointments
>>> table. You may use a join to allow the contacts phone number to be displayed
>>> in each appointment. Searching, sorting, grouping, all things you will want
>>> to do.
>>>
>>
>> On the other hand, what's the first thing most app developers do that use
>> SQL as a backend?  They use some library that wraps SQL and makes it more
>> object oriented.  IndexedDB skips this step.  There are definitely pros and
>> cons to both models.
>>
>> The problem is the proposed data model is a subset of the functionality
>>> required. The relational model is a complete algebra (hence the term
>>> relationally complete). With a non-relationally complete storage API there
>>> will always be limitations to what it can do.
>>>
>>
>> You can implement a relational model on top of IndexedDB, so I'm not sure
>> what limitations you're talking about except performance.  And this is
>> something we fully expect we'll need to work on for some time, but I don't
>> see any fundamental reason performance issues won't be solved since we can
>> add to the API as we need (while keeping the surface area as small as is
>> practical!).
>>
>> What I would like to see is the relational API standardised (maybe in the
>>> next version of IndexedDB, maybe as a separate standard), so there is a
>>> relationally complete API available, that browser implementers can implement
>>> using a relational storage engine like SQLite directly, or can be emulated
>>> by an implementation on top of IndexedDB where necessary.
>>>
>>
>> We will likely implement something, but it'll hopefully be fairly light
>> weight.  We shouldn't get too ahead of ourselves in speculating about
>> optimizations until people actually start building real world apps against
>> IndexedDB though.  I'm hopeful this will start soon as both Mozilla and
>> WebKit/Chromium's implementations are becoming closer and closer to fully
>> implementing the spec.
>>
>>
>> On Tue, Oct 26, 2010 at 8:54 PM, Tab Atkins Jr. <jackalmage@gmail.com>wrote:
>>
>>> On Tue, Oct 26, 2010 at 12:04 PM, Keean Schupke <keean@fry-it.com>
>>> wrote:
>>> > Take Firefox for example, it implements IndexedDB using SQLite
>>> apparently.
>>> > So implementing a relational API if we have to talk to IndexedDB that
>>> means
>>> > we have to convert from the relational data model to an object model
>>> and
>>> > then back to a relational model for SQLite. So what I would like to do
>>> is
>>> > punch through that excess layer in the middle and have the relational
>>> API
>>> > talk directly to SQLite in the browser implementation. How could you
>>> argue
>>> > that having an unnecessary middle layer is a good thing?
>>>
>>> The SQLite back-end used by Firefox's implementation of IndexedDB (and
>>> Chrome's, for the moment) is unnecessary; at least in Chrome's case,
>>> we used a SQLite backend only because it was expedient and the code
>>> was there.  We'll be changing it to a better backend in the future,
>>> and I suspect that Firefox will do the same in time.
>>>
>>> The middle layer isn't unnecessary, *it's the whole point*.  The
>>> back-end shouldn't ever be exposed directly - you don't want your code
>>> to break if we drop the SQLite backend and switch to a direct
>>> b-tree-based backend.
>>>
>>
>> ...or a fractal tree.  :-P  (Still would love pointers to open
>> source implementations.)
>>
>> J
>>
>
>
Received on Tuesday, 26 October 2010 21:46:57 UTC