Re: Replacing WebSQL with a Relational Data Model. from Keean Schupke on 2010-10-26 (public-webapps@w3.org from October to December 2010)

From: Keean Schupke <keean@fry-it.com>
Date: Tue, 26 Oct 2010 13:52:17 +0100
To: Jeremy Orlow <jorlow@chromium.org>
Cc: Jonas Sicking <jonas@sicking.cc>, public-webapps@w3.org
Message-ID: <AANLkTi=HVdx2a95XGWbWi=TLtMFpgDHBMzkZ9_wi_r7j@mail.gmail.com>
Hi,

Just want to pick up on this point:

>down to how queries are optimized.  The latter part is very worth noting
because this is the primary reason SQL
>code is not easily portable between engines.  Sure it'll run, but it won't
run fast.  Given how long SQL's been
>around and the fact that different engines optimize very differently, I
don't think you can practically argue this
>isn't an issue.

I disagree, and I will argue the point. I don't see an argument for
including optimisation strategies in a standard here. All DB engines make a
good job of optimising most queries. I have written SQL for Oracle,
Microsoft SQL Server, PostgreSQL, MySQL, and SQLite, and this has never been
a problem. More likely bad performance is due to a bad schema and a bad
design.

I saw an interesting article about how Digg could have saved itself the
trouble of migrating from MySQL to Cassandra, if it had used the correct
compound index on one table in its database. The two separate index approach
resulted in 5000 seeks per query, down to only 5 with the compound index.
This kind of structural optimisation applies the same to all SQL databases,
and has nothing to do with the query optimiser.


Cheers,
Keean.


On 26 October 2010 10:51, Jeremy Orlow <jorlow@chromium.org> wrote:

> On Tue, Oct 26, 2010 at 8:47 AM, Keean Schupke <keean@fry-it.com> wrote:
>
>> Hi,
>>
>> On 26 October 2010 08:26, Mikeal Rogers <mikeal.rogers@gmail.com> wrote:
>>
>>> On Tue, Oct 26, 2010 at 12:02 AM, Keean Schupke <keean@fry-it.com>
>>>  wrote:
>>>
>>>> Hi,
>>>>
>>>> It will be a lot faster with SQLite as the backend.
>>>>
>>>
>>> BTree's are old hat in the database space. Modern storage engines are
>>>> using Fractal-Trees for a 50-80 times speedup in write performance. The
>>>> browser should not be trying to compete with experienced database coders.
>>>>
>>>
>>> SQLite uses a btree :\
>>>
>>> *most* databases still use btrees for all their primary indexes.
>>>
>>
>> See TokuDB for MySQL, its possible for an implementer to choose this as a
>> backend - might give a competitive advantage.
>>
>> My point is more about the years of research, and constant development
>> taking place in database apps. Maybe SQLite will implement a Fractal-Tree
>> index soon.
>>
>
> Is this just speculation or is it actually planned?
>
> Why couldn't an IndexedDB implementation be based on fractal trees?  I
> don't see any reason they couldn't.
>
>
>> Do the browser implementers really want to be implementing all this DB
>> code to stay up to date. Why so keen to create work? Why can't we just
>> re-use the available storage engines with a nice API?
>>
>
> Because we need multiple independent implementations of an engine.  We need
> to spec everything from the syntax of the language (be it yours, SQL, or
> something else) down to how queries are optimized.  The latter part is very
> worth noting because this is the primary reason SQL code is not easily
> portable between engines.  Sure it'll run, but it won't run fast.  Given how
> long SQL's been around and the fact that different engines optimize very
> differently, I don't think you can practically argue this isn't an issue.
>  And the fact that we need multiple implementations means that someone will
> have to create work, even if we base some standard on an existing
> implementation.
>
>  The relational storage model I am proposing will return a database table
>>>> as a list of objects, where each object represents a row, and the object has
>>>> a set of named properties representing each column.
>>>>
>>>> Part of the power of the Relational Data Model is that it abstracts data
>>>> into columns and tables, and this is precisely what we want.
>>>>
>>>
>>> Who is "we"?
>>>
>>
>> We in this case is the company I work for (see my introductory email) and
>> some some of our customers that we have worked with.
>>
>>
>>> The relational data model doesn't fit the web very well which is why IDB
>>> was developed.
>>>
>>
>> The relational model is great and used in native mobile apps. iPhone and
>> Android both provide SQLite and use it heavily. I think if you see how neat
>> the API is, you might change your mind.
>>
>
> The issue is not some sort of mobile vs. web split.  If anything the split
> has to do with using a fairly dynamic language vs. fairly static ones.
>
> As for the API, it might be worth your time to explain it further and write
> some examples of real code that'd use the API if you wish to change minds.
>
>
>>  Most web developers I talk to do not want a relational database in the
>>> browser but they do want something better than what is currently there.
>>>
>>
>> We would have been happy with WebSQL, but I can see the problems with
>> standardising it.
>>
>
> Are you sure you do?  Most of the problems I see with your arguments and
> your proposals are the same as the problems with WebSQL.
>
>
>>  If your relational API isn't fast enough built on top of IDB then you
>>> should post the performance metrics and efforts can be made to improve the
>>> performance.
>>>
>>
>> Have you seen the ammount of code in SQLite? I don't think you understand
>> the ammount of work involved in implementing a decent relational database
>> engine.
>>
>
> ...and so you're advocating that we write _another_ database engine that's
> far more complex than the one we're currently writing?
>
>
>
> On Tue, Oct 26, 2010 at 8:20 AM, Keean Schupke <keean@fry-it.com> wrote:
>
>> > can say with almost certainty that we're not going to add yet another
>> storage mechanism to the web platform any time soon, though.  :-)
>>
>> I am sorry to hear that. SQLite has been a major success on the mobile
>> platforms, and most now support a form of SQL database (even J2ME).
>> Implementing a RDB on top of IndexedDB will almost certainly be slow. It
>> would also be a lot of work (there are years of research and improvements
>> that have gone into the query optimisers of most available databases, why
>> should all this work be replicated).
>>
>
> Things can always be optimized.  If you find things inherently slow with
> the design, then we can look at adding bits to the API.  If you find that
> implementations are slow, the best way to make them faster is to make a
> killer app and/or benchmark that challenges the browser vendors to compete.
>
> One of the fundamental purposes of open source and open standards is not to
>> re-invent the wheel. And rewriting SQLite in JavaScript would almost
>> certainly be a long and difficult task. Surely it would be much better to
>> have a standardised RDB API (using something like I proposed) which would be
>> a thin API layer and let browser implementers link to SQLite or another
>> RDBMS.
>>
>
> I explained above why simply using some existing database engine is not an
> option for a W3C standard.
>
>
>
> On Tue, Oct 26, 2010 at 8:02 AM, Keean Schupke <keean@fry-it.com> wrote:
>
> Hi,
>>
>> It will be a lot faster with SQLite as the backend. Mobile apps depend on
>> access to the SQLite engine, and although it _could_ be implemented on top
>> of IndexedDB, there is no way its going to be fast enough...
>>
>
> It may be slower now, but there's no inherent reason for it to be.  We're
> open to adding to the API (even large chunks of API like a join language!).
>  And even without new bits of API, JavaScript keeps getting faster.
>
> To be honest, I'm quite surprised that you keep holding SQLite up as an
> example of something fast.  Many of the teams I've talked to have found that
> WebSQLDatabase on top of SQLite scales very poorly and it not adequate for
> their uses.
>
>
>> And the thought of writing a decent query optimiser is a bit daunting.
>>
>> The benefit of making it a standard, is that browser implementers can do
>> what they want, implement in JavaScript on top of IndexedDB, or pass through
>> to a proper database.
>>
>
> I don't understand this.
>
>
>> BTree's are old hat in the database space. Modern storage engines are
>> using Fractal-Trees for a 50-80 times speedup in write performance. The
>> browser should not be trying to compete with experienced database coders.
>>
>
> Are there any open source libraries implementing this?  I'd love to use one
> as the backing engine for WebKit's IndexedDB.  I have no interest in writing
> more code than I have to.
>
>
>> The relational storage model I am proposing will return a database table
>> as a list of objects, where each object represents a row, and the object has
>> a set of named properties representing each column.
>>
>
> Would joined tables be part of the object as well?  If so, this would be
> kind of cool.
>
>
>> Part of the power of the Relational Data Model is that it abstracts data
>> into columns and tables, and this is precisely what we want.
>>
>
> It's not what most JavaScript developers seem to want, though.
>
>
>
> On Tue, Oct 26, 2010 at 10:03 AM, Keean Schupke <keean@fry-it.com> wrote:
>
>> I would prefer to fit in with what everyone is already doing. There would
>> be no point in starting a new standard if none of the browser implementers
>> are interested in this. I would also prefer to have the cooperation of an
>> experiences W3 standards editor if this is necessary.
>
>
> For what it's worth, IndexedDB was originally written by someone without
> any W3 standards experience.  Once the idea caught on, we started banging it
> into proper shape.
>
> I agree with Jonas that the best first step is implementing the API in
> JavaScript though.  I'd say prototyping and advanced query optimizer matters
> much less than prototyping a good API developers would be using, since
> performance can always be addressed.
>
> J
>
Received on Tuesday, 26 October 2010 12:52:57 UTC