Re: Replacing WebSQL with a Relational Data Model. from Jeremy Orlow on 2010-10-26 (public-webapps@w3.org from October to December 2010)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Tue, 26 Oct 2010 11:45:56 +0100
To: Keean Schupke <keean@fry-it.com>
Cc: Jonas Sicking <jonas@sicking.cc>, public-webapps@w3.org
Message-ID: <AANLkTi=fR12LhtkmcquBdh9Ua_=_bPGh77WAGcQSbcFo@mail.gmail.com>
Great!  This is exactly the type of thing we were hoping to see happen on
top of IndexedDB.  :-)

For the record, I don't think the performance comparisons will be super
useful until browser vendors have time to work on basic performance
optimizations of their engines and until some effort is put into query
optimization in the library (or at least giving authors a way to give hints
on how to optimize it), but as time goes on, such comparisons should be
quite useful.

As for the standards/support side of your question: such a library would
actually be much more like a library (like JQuery) rather than a standard or
anything that needs support from browser vendors.  Standards will come into
play if we pull bits of your API into the standards, much like how CSS
selectors were quite popular in libraries and eventually got pulled into the
web platform itself.

When we are ready to add joins and such to the API, we'll definitely be
looking at what implementations out there are popular.  And even before
that, any code using IndexedDB will be very helpful for testing and
optimizing browsers' IndexedDB backends.

J

On Tue, Oct 26, 2010 at 11:16 AM, Keean Schupke <keean@fry-it.com> wrote:

> Hi,
>
> I am beginning to think that a basic implementation on top of IndexedDB may
> be workable (no query optimiser). The advantage would be that once the API
> becomes standardised, browser implementers may choose to have an alternate
> backend using SQLite or another RDBMS.
>
> So what I would like to propose producing a JavaScript API for the
> relational algebra. The API will be split into a frontend (query tree
> construction) which is visible to the user, and a backend that is invisible
> (either executing the query using IndexedDB, or WebSQL). This would be
> interesting for several reasons, it would allow direct performance
> comparisons between IndexedDB and an RDBMS for relational queries, it would
> provide a framework for implementing a query optimiser for IndexedDB, and
> would allow testing the IndexedDB relational operators against a known
> database engine. It would also provide two implementations for
> standardising.
>
> The aim would be to standardise the API, allowing any implementation of the
> backend.
>
>
> Would this be something that I could get support for?
>
>
> Cheers,
> Keean.
>
>
> On 26 October 2010 10:51, Jeremy Orlow <jorlow@chromium.org> wrote:
>
>> On Tue, Oct 26, 2010 at 8:47 AM, Keean Schupke <keean@fry-it.com> wrote:
>>
>>> Hi,
>>>
>>> On 26 October 2010 08:26, Mikeal Rogers <mikeal.rogers@gmail.com> wrote:
>>>
>>>> On Tue, Oct 26, 2010 at 12:02 AM, Keean Schupke <keean@fry-it.com>
>>>>  wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> It will be a lot faster with SQLite as the backend.
>>>>>
>>>>
>>>> BTree's are old hat in the database space. Modern storage engines are
>>>>> using Fractal-Trees for a 50-80 times speedup in write performance. The
>>>>> browser should not be trying to compete with experienced database coders.
>>>>>
>>>>
>>>> SQLite uses a btree :\
>>>>
>>>> *most* databases still use btrees for all their primary indexes.
>>>>
>>>
>>> See TokuDB for MySQL, its possible for an implementer to choose this as a
>>> backend - might give a competitive advantage.
>>>
>>> My point is more about the years of research, and constant development
>>> taking place in database apps. Maybe SQLite will implement a Fractal-Tree
>>> index soon.
>>>
>>
>>  Is this just speculation or is it actually planned?
>>
>> Why couldn't an IndexedDB implementation be based on fractal trees?  I
>> don't see any reason they couldn't.
>>
>>
>>> Do the browser implementers really want to be implementing all this DB
>>> code to stay up to date. Why so keen to create work? Why can't we just
>>> re-use the available storage engines with a nice API?
>>>
>>
>> Because we need multiple independent implementations of an engine.  We
>> need to spec everything from the syntax of the language (be it yours, SQL,
>> or something else) down to how queries are optimized.  The latter part is
>> very worth noting because this is the primary reason SQL code is not easily
>> portable between engines.  Sure it'll run, but it won't run fast.  Given how
>> long SQL's been around and the fact that different engines optimize very
>> differently, I don't think you can practically argue this isn't an issue.
>>  And the fact that we need multiple implementations means that someone will
>> have to create work, even if we base some standard on an existing
>> implementation.
>>
>>  The relational storage model I am proposing will return a database table
>>>>> as a list of objects, where each object represents a row, and the object has
>>>>> a set of named properties representing each column.
>>>>>
>>>>> Part of the power of the Relational Data Model is that it abstracts
>>>>> data into columns and tables, and this is precisely what we want.
>>>>>
>>>>
>>>> Who is "we"?
>>>>
>>>
>>> We in this case is the company I work for (see my introductory email) and
>>> some some of our customers that we have worked with.
>>>
>>>
>>>> The relational data model doesn't fit the web very well which is why IDB
>>>> was developed.
>>>>
>>>
>>> The relational model is great and used in native mobile apps. iPhone and
>>> Android both provide SQLite and use it heavily. I think if you see how neat
>>> the API is, you might change your mind.
>>>
>>
>> The issue is not some sort of mobile vs. web split.  If anything the split
>> has to do with using a fairly dynamic language vs. fairly static ones.
>>
>> As for the API, it might be worth your time to explain it further and
>> write some examples of real code that'd use the API if you wish to change
>> minds.
>>
>>
>>>  Most web developers I talk to do not want a relational database in the
>>>> browser but they do want something better than what is currently there.
>>>>
>>>
>>> We would have been happy with WebSQL, but I can see the problems with
>>> standardising it.
>>>
>>
>> Are you sure you do?  Most of the problems I see with your arguments and
>> your proposals are the same as the problems with WebSQL.
>>
>>
>>>  If your relational API isn't fast enough built on top of IDB then you
>>>> should post the performance metrics and efforts can be made to improve the
>>>> performance.
>>>>
>>>
>>> Have you seen the ammount of code in SQLite? I don't think you understand
>>> the ammount of work involved in implementing a decent relational database
>>> engine.
>>>
>>
>> ...and so you're advocating that we write _another_ database engine that's
>> far more complex than the one we're currently writing?
>>
>>
>>
>> On Tue, Oct 26, 2010 at 8:20 AM, Keean Schupke <keean@fry-it.com> wrote:
>>
>>> > can say with almost certainty that we're not going to add yet another
>>> storage mechanism to the web platform any time soon, though.  :-)
>>>
>>> I am sorry to hear that. SQLite has been a major success on the mobile
>>> platforms, and most now support a form of SQL database (even J2ME).
>>> Implementing a RDB on top of IndexedDB will almost certainly be slow. It
>>> would also be a lot of work (there are years of research and improvements
>>> that have gone into the query optimisers of most available databases, why
>>> should all this work be replicated).
>>>
>>
>> Things can always be optimized.  If you find things inherently slow with
>> the design, then we can look at adding bits to the API.  If you find that
>> implementations are slow, the best way to make them faster is to make a
>> killer app and/or benchmark that challenges the browser vendors to compete.
>>
>> One of the fundamental purposes of open source and open standards is not
>>> to re-invent the wheel. And rewriting SQLite in JavaScript would almost
>>> certainly be a long and difficult task. Surely it would be much better to
>>> have a standardised RDB API (using something like I proposed) which would be
>>> a thin API layer and let browser implementers link to SQLite or another
>>> RDBMS.
>>>
>>
>> I explained above why simply using some existing database engine is not an
>> option for a W3C standard.
>>
>>
>>
>> On Tue, Oct 26, 2010 at 8:02 AM, Keean Schupke <keean@fry-it.com> wrote:
>>
>> Hi,
>>>
>>> It will be a lot faster with SQLite as the backend. Mobile apps depend on
>>> access to the SQLite engine, and although it _could_ be implemented on top
>>> of IndexedDB, there is no way its going to be fast enough...
>>>
>>
>> It may be slower now, but there's no inherent reason for it to be.  We're
>> open to adding to the API (even large chunks of API like a join language!).
>>  And even without new bits of API, JavaScript keeps getting faster.
>>
>> To be honest, I'm quite surprised that you keep holding SQLite up as an
>> example of something fast.  Many of the teams I've talked to have found that
>> WebSQLDatabase on top of SQLite scales very poorly and it not adequate for
>> their uses.
>>
>>
>>> And the thought of writing a decent query optimiser is a bit daunting.
>>>
>>> The benefit of making it a standard, is that browser implementers can do
>>> what they want, implement in JavaScript on top of IndexedDB, or pass through
>>> to a proper database.
>>>
>>
>> I don't understand this.
>>
>>
>>> BTree's are old hat in the database space. Modern storage engines are
>>> using Fractal-Trees for a 50-80 times speedup in write performance. The
>>> browser should not be trying to compete with experienced database coders.
>>>
>>
>> Are there any open source libraries implementing this?  I'd love to use
>> one as the backing engine for WebKit's IndexedDB.  I have no interest in
>> writing more code than I have to.
>>
>>
>>> The relational storage model I am proposing will return a database table
>>> as a list of objects, where each object represents a row, and the object has
>>> a set of named properties representing each column.
>>>
>>
>> Would joined tables be part of the object as well?  If so, this would be
>> kind of cool.
>>
>>
>>> Part of the power of the Relational Data Model is that it abstracts data
>>> into columns and tables, and this is precisely what we want.
>>>
>>
>> It's not what most JavaScript developers seem to want, though.
>>
>>
>>
>> On Tue, Oct 26, 2010 at 10:03 AM, Keean Schupke <keean@fry-it.com> wrote:
>>
>>> I would prefer to fit in with what everyone is already doing. There would
>>> be no point in starting a new standard if none of the browser implementers
>>> are interested in this. I would also prefer to have the cooperation of an
>>> experiences W3 standards editor if this is necessary.
>>
>>
>> For what it's worth, IndexedDB was originally written by someone without
>> any W3 standards experience.  Once the idea caught on, we started banging it
>> into proper shape.
>>
>> I agree with Jonas that the best first step is implementing the API in
>> JavaScript though.  I'd say prototyping and advanced query optimizer matters
>> much less than prototyping a good API developers would be using, since
>> performance can always be addressed.
>>
>> J
>>
>
>
Received on Tuesday, 26 October 2010 10:46:56 UTC