Re: Replacing WebSQL with a Relational Data Model. from Keean Schupke on 2010-10-26 (public-webapps@w3.org from October to December 2010)

From: Keean Schupke <keean@fry-it.com>
Date: Tue, 26 Oct 2010 13:35:16 +0100
To: Jeremy Orlow <jorlow@chromium.org>
Cc: Jonas Sicking <jonas@sicking.cc>, public-webapps@w3.org
Message-ID: <AANLkTik1ccrjCvvaM=nuzAjOvNOhnc3ZjHVmZUMg5fzh@mail.gmail.com>
>rather than a standard or anything that needs support from browser vendors.

An implementation of the relational algebra would be ideal for
standardisation as it is:

- complete
- mathematically sound
- has a minimal API (the minimum number of API objects to implement a
complete system).
- can be formally defined, with definitions derived from the definition of
relational algebra.

I believe the best use of this API will be as a replacement for WebSQL,
allowing a database engine like SQLite or MySQL to be used to provide the
functionality. Such use does require standardisation and implementation in
the browsers. IndexedDB is not a replacement for WebSQL at the moment, and
even with joins, I don't think it will compete with a dedicated database for
performance (when dealing with real world queries with joins, compound
indexes etc...)

I am interested in implementing a relational layer on top of IndexedDB for
educational purposes and to help get something standardised, but I don't
think anyone would consider a small relational DB layer in JavaScript a
serious alternative to a decent relational database backend.

What exactly have the MySQL / SQLite / PostgreSQL / MS SQL Server / Oracle
developers been doing all these years? Do we seriously expect to compete
with something written in JavaScript?


Cheers,
Keean.


On 26 October 2010 11:45, Jeremy Orlow <jorlow@chromium.org> wrote:

> Great!  This is exactly the type of thing we were hoping to see happen on
> top of IndexedDB.  :-)
>
> For the record, I don't think the performance comparisons will be super
> useful until browser vendors have time to work on basic performance
> optimizations of their engines and until some effort is put into query
> optimization in the library (or at least giving authors a way to give hints
> on how to optimize it), but as time goes on, such comparisons should be
> quite useful.
>
> As for the standards/support side of your question: such a library would
> actually be much more like a library (like JQuery).  Standards will come
> into play if we pull bits of your API into the standards, much like how CSS
> selectors were quite popular in libraries and eventually got pulled into the
> web platform itself.
>
> When we are ready to add joins and such to the API, we'll definitely be
> looking at what implementations out there are popular.  And even before
> that, any code using IndexedDB will be very helpful for testing and
> optimizing browsers' IndexedDB backends.
>
> J
>
> On Tue, Oct 26, 2010 at 11:16 AM, Keean Schupke <keean@fry-it.com> wrote:
>
>> Hi,
>>
>> I am beginning to think that a basic implementation on top of IndexedDB
>> may be workable (no query optimiser). The advantage would be that once the
>> API becomes standardised, browser implementers may choose to have an
>> alternate backend using SQLite or another RDBMS.
>>
>> So what I would like to propose producing a JavaScript API for the
>> relational algebra. The API will be split into a frontend (query tree
>> construction) which is visible to the user, and a backend that is invisible
>> (either executing the query using IndexedDB, or WebSQL). This would be
>> interesting for several reasons, it would allow direct performance
>> comparisons between IndexedDB and an RDBMS for relational queries, it would
>> provide a framework for implementing a query optimiser for IndexedDB, and
>> would allow testing the IndexedDB relational operators against a known
>> database engine. It would also provide two implementations for
>> standardising.
>>
>> The aim would be to standardise the API, allowing any implementation of
>> the backend.
>>
>>
>> Would this be something that I could get support for?
>>
>>
>> Cheers,
>> Keean.
>>
>>
>> On 26 October 2010 10:51, Jeremy Orlow <jorlow@chromium.org> wrote:
>>
>>> On Tue, Oct 26, 2010 at 8:47 AM, Keean Schupke <keean@fry-it.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> On 26 October 2010 08:26, Mikeal Rogers <mikeal.rogers@gmail.com>
>>>>  wrote:
>>>>
>>>>> On Tue, Oct 26, 2010 at 12:02 AM, Keean Schupke <keean@fry-it.com>
>>>>>  wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> It will be a lot faster with SQLite as the backend.
>>>>>>
>>>>>
>>>>> BTree's are old hat in the database space. Modern storage engines are
>>>>>> using Fractal-Trees for a 50-80 times speedup in write performance. The
>>>>>> browser should not be trying to compete with experienced database coders.
>>>>>>
>>>>>
>>>>> SQLite uses a btree :\
>>>>>
>>>>> *most* databases still use btrees for all their primary indexes.
>>>>>
>>>>
>>>> See TokuDB for MySQL, its possible for an implementer to choose this as
>>>> a backend - might give a competitive advantage.
>>>>
>>>> My point is more about the years of research, and constant development
>>>> taking place in database apps. Maybe SQLite will implement a Fractal-Tree
>>>> index soon.
>>>>
>>>
>>>  Is this just speculation or is it actually planned?
>>>
>>> Why couldn't an IndexedDB implementation be based on fractal trees?  I
>>> don't see any reason they couldn't.
>>>
>>>
>>>> Do the browser implementers really want to be implementing all this DB
>>>> code to stay up to date. Why so keen to create work? Why can't we just
>>>> re-use the available storage engines with a nice API?
>>>>
>>>
>>> Because we need multiple independent implementations of an engine.  We
>>> need to spec everything from the syntax of the language (be it yours, SQL,
>>> or something else) down to how queries are optimized.  The latter part is
>>> very worth noting because this is the primary reason SQL code is not easily
>>> portable between engines.  Sure it'll run, but it won't run fast.  Given how
>>> long SQL's been around and the fact that different engines optimize very
>>> differently, I don't think you can practically argue this isn't an issue.
>>>  And the fact that we need multiple implementations means that someone will
>>> have to create work, even if we base some standard on an existing
>>> implementation.
>>>
>>>  The relational storage model I am proposing will return a database
>>>>>> table as a list of objects, where each object represents a row, and the
>>>>>> object has a set of named properties representing each column.
>>>>>>
>>>>>> Part of the power of the Relational Data Model is that it abstracts
>>>>>> data into columns and tables, and this is precisely what we want.
>>>>>>
>>>>>
>>>>> Who is "we"?
>>>>>
>>>>
>>>> We in this case is the company I work for (see my introductory email)
>>>> and some some of our customers that we have worked with.
>>>>
>>>>
>>>>> The relational data model doesn't fit the web very well which is why
>>>>> IDB was developed.
>>>>>
>>>>
>>>> The relational model is great and used in native mobile apps. iPhone and
>>>> Android both provide SQLite and use it heavily. I think if you see how neat
>>>> the API is, you might change your mind.
>>>>
>>>
>>> The issue is not some sort of mobile vs. web split.  If anything the
>>> split has to do with using a fairly dynamic language vs. fairly static ones.
>>>
>>> As for the API, it might be worth your time to explain it further and
>>> write some examples of real code that'd use the API if you wish to change
>>> minds.
>>>
>>>
>>>>  Most web developers I talk to do not want a relational database in the
>>>>> browser but they do want something better than what is currently there.
>>>>>
>>>>
>>>> We would have been happy with WebSQL, but I can see the problems with
>>>> standardising it.
>>>>
>>>
>>> Are you sure you do?  Most of the problems I see with your arguments and
>>> your proposals are the same as the problems with WebSQL.
>>>
>>>
>>>>  If your relational API isn't fast enough built on top of IDB then you
>>>>> should post the performance metrics and efforts can be made to improve the
>>>>> performance.
>>>>>
>>>>
>>>> Have you seen the ammount of code in SQLite? I don't think you
>>>> understand the ammount of work involved in implementing a decent relational
>>>> database engine.
>>>>
>>>
>>> ...and so you're advocating that we write _another_ database engine
>>> that's far more complex than the one we're currently writing?
>>>
>>>
>>>
>>> On Tue, Oct 26, 2010 at 8:20 AM, Keean Schupke <keean@fry-it.com> wrote:
>>>
>>>> > can say with almost certainty that we're not going to add yet another
>>>> storage mechanism to the web platform any time soon, though.  :-)
>>>>
>>>> I am sorry to hear that. SQLite has been a major success on the mobile
>>>> platforms, and most now support a form of SQL database (even J2ME).
>>>> Implementing a RDB on top of IndexedDB will almost certainly be slow. It
>>>> would also be a lot of work (there are years of research and improvements
>>>> that have gone into the query optimisers of most available databases, why
>>>> should all this work be replicated).
>>>>
>>>
>>> Things can always be optimized.  If you find things inherently slow with
>>> the design, then we can look at adding bits to the API.  If you find that
>>> implementations are slow, the best way to make them faster is to make a
>>> killer app and/or benchmark that challenges the browser vendors to compete.
>>>
>>> One of the fundamental purposes of open source and open standards is not
>>>> to re-invent the wheel. And rewriting SQLite in JavaScript would almost
>>>> certainly be a long and difficult task. Surely it would be much better to
>>>> have a standardised RDB API (using something like I proposed) which would be
>>>> a thin API layer and let browser implementers link to SQLite or another
>>>> RDBMS.
>>>>
>>>
>>> I explained above why simply using some existing database engine is not
>>> an option for a W3C standard.
>>>
>>>
>>>
>>> On Tue, Oct 26, 2010 at 8:02 AM, Keean Schupke <keean@fry-it.com> wrote:
>>>
>>> Hi,
>>>>
>>>> It will be a lot faster with SQLite as the backend. Mobile apps depend
>>>> on access to the SQLite engine, and although it _could_ be implemented on
>>>> top of IndexedDB, there is no way its going to be fast enough...
>>>>
>>>
>>> It may be slower now, but there's no inherent reason for it to be.  We're
>>> open to adding to the API (even large chunks of API like a join language!).
>>>  And even without new bits of API, JavaScript keeps getting faster.
>>>
>>> To be honest, I'm quite surprised that you keep holding SQLite up as an
>>> example of something fast.  Many of the teams I've talked to have found that
>>> WebSQLDatabase on top of SQLite scales very poorly and it not adequate for
>>> their uses.
>>>
>>>
>>>> And the thought of writing a decent query optimiser is a bit daunting.
>>>>
>>>> The benefit of making it a standard, is that browser implementers can do
>>>> what they want, implement in JavaScript on top of IndexedDB, or pass through
>>>> to a proper database.
>>>>
>>>
>>> I don't understand this.
>>>
>>>
>>>> BTree's are old hat in the database space. Modern storage engines are
>>>> using Fractal-Trees for a 50-80 times speedup in write performance. The
>>>> browser should not be trying to compete with experienced database coders.
>>>>
>>>
>>> Are there any open source libraries implementing this?  I'd love to use
>>> one as the backing engine for WebKit's IndexedDB.  I have no interest in
>>> writing more code than I have to.
>>>
>>>
>>>> The relational storage model I am proposing will return a database table
>>>> as a list of objects, where each object represents a row, and the object has
>>>> a set of named properties representing each column.
>>>>
>>>
>>> Would joined tables be part of the object as well?  If so, this would be
>>> kind of cool.
>>>
>>>
>>>> Part of the power of the Relational Data Model is that it abstracts data
>>>> into columns and tables, and this is precisely what we want.
>>>>
>>>
>>> It's not what most JavaScript developers seem to want, though.
>>>
>>>
>>>
>>> On Tue, Oct 26, 2010 at 10:03 AM, Keean Schupke <keean@fry-it.com>wrote:
>>>
>>>> I would prefer to fit in with what everyone is already doing. There
>>>> would be no point in starting a new standard if none of the browser
>>>> implementers are interested in this. I would also prefer to have the
>>>> cooperation of an experiences W3 standards editor if this is necessary.
>>>
>>>
>>> For what it's worth, IndexedDB was originally written by someone without
>>> any W3 standards experience.  Once the idea caught on, we started banging it
>>> into proper shape.
>>>
>>> I agree with Jonas that the best first step is implementing the API in
>>> JavaScript though.  I'd say prototyping and advanced query optimizer matters
>>> much less than prototyping a good API developers would be using, since
>>> performance can always be addressed.
>>>
>>> J
>>>
>>
>>
>
Received on Tuesday, 26 October 2010 12:35:52 UTC