Re: Replacing WebSQL with a Relational Data Model. from Jeremy Orlow on 2010-10-27 (public-webapps@w3.org from October to December 2010)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Wed, 27 Oct 2010 10:57:31 +0100
To: Nathan Kitchen <w3c@nathankitchen.com>
Cc: nathan@webr3.org, public-webapps@w3.org, Jonas Sicking <jonas@sicking.cc>, Keean Schupke <keean@fry-it.com>, Arthur Barstow <art.barstow@nokia.com>
Message-ID: <AANLkTimN-xkx0YFubATe1=yirXmzv_a5O47B9mgBc9yL@mail.gmail.com>

On Wed, Oct 27, 2010 at 8:58 AM, Nathan Kitchen <w3c@nathankitchen.com> wrote:

> [featurecreep]
> There is one more thing that I would like to see in the spec though:
> full-text indexing/inverted index support. It's all well and good having a
> standardized interface to build CRUD operations on top of, but I'd be
> surprised if anyone is going to be able to build performant text search on
> top of an API without *some level* of native support for it.
> [/featurecreep]
>

Being able to do full text searches is very important to Google. We talked
about it even a year ago. The problem is that IndexedDB is not yet understood
well enough to know what needs to be added to enable this use case (if
anything, though I think the chances of nothing being needed are slim).
What we really need is someone to try prototyping full text search and
present a report on what's inherently slow. We can then strategically add
to the API. I'm trying to get a team within Google to spend some time
looking at this.

On Wed, Oct 27, 2010 at 9:04 AM, Keean Schupke <keean@fry-it.com> wrote:

> Just some thoughts that occurred to me this morning:
>
>> I guess the issue of exactly how you cache data on the client side varies
>> based on the application, however in all but a few cases it appears to me
>> that the optimal solution is to store the data in a structure/format which
>> the application actually uses. A common example may be storing a User Object
>> (complete with nested objects) rather than several tables, each with a
>> component part in it.
>>
>
> This is interesting. When dealing with data I find it easier for the
> application to work with the data in a relational format. With a
> relationally complete language everything you can do in your normal
> programming language you can do in the relational language (except doing
> things like the transitive closure). The relational language has the
> advantage that you can do all this without needing loops. From a computer
> language point of view asking to join two relations is higher level than
> looping through and looking up links (because you are specifying what you
> want, not how to do it). This is like the difference between programming in
> an imperative language and in Prolog.
>
> Once you have learned a relational language like SQL and internalised it,
> it is much easier to manipulate the data by specifying the what, and not
> having to bother about the how. So with SQL you try and do all the processing
> in SQL.
>

I suspect you have more SQL experience than 99% of developers using the web
platform. Those who haven't done a lot of SQL in their lives typically
find it hard to work with, and even some of those who have would rather
not.

>> This is a very good example! You may well be interested in having a quick
>> look at the "Motivating Writing" section of [6] which outlines just such an
>> application, and that document together with [7] covers a slightly different
>> approach to getting that contact's phone number displayed alongside each
>> appointment, using Links rather than Joins.
>
>
> Is this fast enough? Power users have 5000 contacts. Just getting HTML5 to
> produce a scrolling list that long that is fast enough to fling properly on
> a mobile device is a struggle.
>

This is typically addressed with the flyweight pattern
(http://en.wikipedia.org/wiki/Flyweight_pattern), whether or not the backing
data is loaded into memory. But it's worth noting that the type of workload
Google eventually hopes to support is storing tens of thousands of emails
and providing as full an on-line experience as possible. This is beyond the
limit where SQLite starts to choke in our experience.
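
As a rough illustration of the flyweight idea applied to a long contact
list, here is a minimal sketch. It assumes a small, fixed pool of row
objects that is rebound to whichever contacts are currently visible, so a
5000-entry list never needs 5000 rows. The names Contact, RowView and
FlyweightList are hypothetical, and real code would bind DOM nodes rather
than plain strings.

```typescript
// Hypothetical data shape for one contact.
interface Contact {
  name: string;
  phone: string;
}

// A reusable row: holds only what is currently displayed.
class RowView {
  label = "";

  bind(contact: Contact): void {
    this.label = `${contact.name}: ${contact.phone}`;
  }
}

// Keeps a fixed pool of rows and rebinds them as the list scrolls.
class FlyweightList {
  private readonly contacts: Contact[];
  private readonly pool: RowView[];

  constructor(contacts: Contact[], visibleRows: number) {
    this.contacts = contacts;
    this.pool = Array.from({ length: visibleRows }, () => new RowView());
  }

  // Rebind pooled rows to the window starting at firstIndex.
  // Returns only the rows that are in use for that window.
  scrollTo(firstIndex: number): RowView[] {
    const start = Math.max(0, Math.min(firstIndex, this.contacts.length));
    const count = Math.min(this.pool.length, this.contacts.length - start);
    const rows = this.pool.slice(0, count);
    rows.forEach((row, i) => row.bind(this.contacts[start + i]));
    return rows;
  }
}
```

The same row objects are returned on every call to scrollTo, so the cost
of scrolling depends on the number of visible rows and not on the size of
the backing data.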

On Wed, Oct 27, 2010 at 9:50 AM, Keean Schupke <keean@fry-it.com> wrote:

> On that point it should be possible to build an efficient text search on
> top of IndexedDB. You need a "word" index that links to multiple documents.
> Matching documents are found by taking the intersection of the sets of
> documents found for each word in the query (for an unstructured query). As
> such you would put the documents in localStorage,

Why would LocalStorage be involved at all?  Just keep the data in an
ObjectStore.
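
For concreteness, the word index Keean describes might look roughly like
the sketch below. It is only an assumption-laden illustration: a Map
stands in for an object store keyed by word, the tokenizer is deliberately
naive (lowercase ASCII letters and digits only), and the names WordIndex,
tokenize, addDocument and search are hypothetical rather than anything in
the IndexedDB draft.

```typescript
type DocId = number;

// In-memory stand-in for an object store keyed by word,
// whose value is the set of documents containing that word.
type WordIndex = Map<string, Set<DocId>>;

// Split text into lowercase word tokens (naive, ASCII only).
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Record every word of a document in the word index.
function addDocument(index: WordIndex, id: DocId, text: string): void {
  for (const word of tokenize(text)) {
    let ids = index.get(word);
    if (ids === undefined) {
      ids = new Set<DocId>();
      index.set(word, ids);
    }
    ids.add(id);
  }
}

// Unstructured query: intersect the document sets of every query word.
function search(index: WordIndex, query: string): DocId[] {
  const words = tokenize(query);
  if (words.length === 0) return [];
  const sets = words.map((w) => index.get(w) || new Set<DocId>());
  // Start from the smallest set so the intersection shrinks quickly.
  sets.sort((a, b) => a.size - b.size);
  let result = Array.from(sets[0]);
  for (const s of sets.slice(1)) {
    result = result.filter((id) => s.has(id));
  }
  return result.sort((a, b) => a - b);
}

// Small demonstration with made-up documents.
const demoIndex: WordIndex = new Map();
addDocument(demoIndex, 1, "Full text search on the client");
addDocument(demoIndex, 2, "Relational data model");
addDocument(demoIndex, 3, "Prototype of full text indexing");
const demoHits = search(demoIndex, "full text");
```

A prototype along these lines, backed by real object stores, is the kind
of thing that could show where the API is inherently slow.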

J

Received on Wednesday, 27 October 2010 09:58:33 UTC