Re: [IndexedDB] Current editor's draft

Hi,

I would like to propose that we update the current spec to reflect all
the changes we have agreement on. We can then iteratively review and
make edits as soon as the remaining issues are solved.  Concretely, I
would like to check in a fix for

http://www.w3.org/Bugs/Public/show_bug.cgi?id=9975

with the following three exceptions which, based on the feedback in this
thread, require more discussion:

- leave in support for dynamic transactions but add a separate API for
them, as suggested by Jonas earlier in this thread.
- leave in the explicit transaction commit
- leave in nested transactions

The changes in 9975 have been debated for more than two months now, so
I feel it's about time to update the specification so that it's in
line with what we're actually discussing.

Thanks,
Andrei

On Wed, Jul 14, 2010 at 8:10 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
> On Wed, Jul 14, 2010 at 3:52 AM, Pablo Castro <Pablo.Castro@microsoft.com>
> wrote:
>>
>> From: public-webapps-request@w3.org [mailto:public-webapps-request@w3.org]
>> On Behalf Of Andrei Popescu
>> Sent: Monday, July 12, 2010 5:23 AM
>>
>> Sorry I disappeared for a while. Catching up with this discussion was an
>> interesting exercise...
>
> Yes, indeed.  :-)
>
>>
>> there is no particular message in this thread I can respond to, so I
>> thought I'd just reply to the last one.
>
> Probably a good idea.  I was trying to respond hixie style--which is harder
> than it looks on stuff like this.
>
>>
>> Overall I think the new proposal is shaping up well and is being effective
>> in simplifying scenarios. I do have a few suggestions and questions for
>> things I'm not sure I see all the way.
>>
>> READ_ONLY vs READ_WRITE as defaults for transactions:
>> To be perfectly honest, I think this discussion went really deep over an
>> issue that won't be a huge deal for most people. My perspective, trying to
>> avoid performance or usage frequency speculation, is around what's easier to
>> detect. Concurrency issues are hard to see. On the other hand, whenever we
>> can throw an exception and give explicit guidance, that unblocks people right
>> away. For this case I suspect it's best to default to READ_ONLY, because if
>> someone doesn't read or think about it and just uses the stuff and tries to
>> change something they'll get a clear error message saying "if you want to
>> change stuff, use READ_WRITE please". The error is not data- or
>> context-dependent, so it'll fail on first try at most once per developer and
>> once they fix it they'll know for all future cases.
>
> Couldn't have said it better myself.
>
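[Illustrative sketch, not part of the original messages.] The argument above for a READ_ONLY default can be pictured with a small in-memory model. Everything here (SketchTransaction, the Mode type, the error text) is a hypothetical stand-in chosen for illustration and is not the API in the editor's draft:

```typescript
// Hypothetical in-memory model; none of these names come from the IndexedDB draft.
type Mode = "READ_ONLY" | "READ_WRITE";

class SketchTransaction {
  private store: Map<string, unknown>;
  private mode: Mode;

  // The mode defaults to READ_ONLY, as argued for in the quoted text.
  constructor(store: Map<string, unknown>, mode: Mode = "READ_ONLY") {
    this.store = store;
    this.mode = mode;
  }

  get(key: string): unknown {
    return this.store.get(key);
  }

  put(key: string, value: unknown): void {
    if (this.mode !== "READ_WRITE") {
      // Fails on the first write attempt, regardless of data or timing.
      throw new Error("Transaction is READ_ONLY; open it as READ_WRITE to change data.");
    }
    this.store.set(key, value);
  }
}
```

The point of the model is that the failure is deterministic: a write through a default transaction always throws, so the mistake surfaces the first time the code runs rather than as an intermittent concurrency problem.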
>>
>> Dynamic transactions:
>> I see that most folks would like to see these going away. While I like the
>> predictability and simplifications that we're able to make by using static
>> scopes for transactions, I worry that we'll close the door for two
>> scenarios: background tasks and query processors. Background tasks such as
>> synchronization and post-processing of content would seem to be almost
>> impossible with the static scope approach, mostly due to the granularity of
>> the scope specification (whole stores). Are we okay with saying that you
>> can't for example sync something in the background (e.g. in a worker) while
>> your app is still working? Am I missing something that would enable this
>> class of scenarios? Query processors are also tricky because you usually
>> take the query specification in some form after the transaction started
>> (especially if you want to execute multiple queries with later queries
>> depending on the outcome of the previous ones). The background tasks issue
>> in particular looks pretty painful to me if we don't have a way to achieve
>> it without freezing the application while it happens.
>
> Well, the application should never freeze in terms of the UI locking up, but
> in what you described I could see it taking a while for data to show up on
> the screen.  This is something that can be fixed by doing smaller updates on
> the background thread, sending a message to the background thread that it
> should abort for now, doing all database access on the background thread,
> etc.
>
> One point that I never saw made in the thread that I think is really
> important is that dynamic transactions can make concurrency worse in some
> cases.  For example, with dynamic transactions you can get into live-lock
> situations.  Also, using Pablo's example, you could easily get into a
> situation where the long running transaction on the worker keeps hitting
> serialization issues and thus it's never able to make progress.
>
> I do see that there are use cases where having dynamic transactions would be
> much nicer, but the amount of non-determinism they add (including to
> performance) has me pretty worried.  I pretty firmly believe we should look
> into adding them in v2 and remove them for now.  If we do leave them in, it
> should definitely be in its own method to make it quite clear that the
> semantics are more complex.
>
>>
>> Implicit commit:
>> Does this really work? I need to play with sample app code more, it may
>> just be that I'm old-fashioned. For example, if I'm downloading a bunch of
>> data from somewhere and pushing rows into the store within a transaction,
>> wouldn't it be reasonable to do the whole thing in a transaction? In that
>> case I'm likely to have to unwind while I wait for the next callback from
>> XMLHttpRequest with the next chunk of data. I understand that avoiding it
>> results in nicer patterns (e.g. db.objectStores("foo").get(123).onsuccess =
>> ...), but in practice I'm not sure if that will hold given that you still
>> need error callbacks and such.
>
> I believe your example of doing XHRs in the middle of a transaction is
> something we were explicitly trying to avoid making possible.  In this case,
> you should do all of your XHRs first and then do your transaction.  If you
> need to read from the ObjectStore, do an XHR, and then write to the
> ObjectStore, you can implement it with 2 transactions and have the second
> one verify the data has not changed before doing the actual work.
>
> Allowing things like XHRs in the middle of an operation will encourage
> really long running transactions that will be really bad for concurrency and
> make the transaction system much less elegant than it currently is.
>
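[Illustrative sketch, not part of the original messages.] The two-transaction pattern suggested above (read, do the XHR with no transaction open, then write only if the data has not changed) might look roughly like this. SketchStore, its version counter, and the fetchUpdate callback are all assumptions made for the example; the thread does not say how the second transaction should verify the data:

```typescript
// Hypothetical in-memory stand-in for an object store; not the IndexedDB API.
interface Versioned<T> {
  value: T;
  version: number;
}

class SketchStore<T> {
  private rows = new Map<string, Versioned<T>>();

  // Models one short transaction that reads a single row (returns a copy).
  read(key: string): Versioned<T> | undefined {
    const row = this.rows.get(key);
    return row ? { ...row } : undefined;
  }

  // Models one short transaction: write only if the row is still at the
  // version we saw earlier. A missing row counts as version 0.
  writeIfUnchanged(key: string, expectedVersion: number, value: T): boolean {
    const current = this.rows.get(key)?.version ?? 0;
    if (current !== expectedVersion) return false; // changed in the meantime
    this.rows.set(key, { value, version: current + 1 });
    return true;
  }
}

async function readFetchWrite<T>(
  store: SketchStore<T>,
  key: string,
  fetchUpdate: (old: T | undefined) => Promise<T>, // stands in for the XHR
): Promise<boolean> {
  const seen = store.read(key); // transaction 1
  const next = await fetchUpdate(seen?.value); // network work, no transaction open
  return store.writeIfUnchanged(key, seen?.version ?? 0, next); // transaction 2
}
```

A caller that gets false back would re-read and retry, or give up. The trade-off is that neither transaction stays open across the network round trip, at the cost of handling the "data changed" case explicitly.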
>
>>
>> Nested transactions:
>> Not sure why we're considering this an advanced scenario. To be clear
>> about what the feature means to me: make it legal to start a transaction
>> when one is already in progress, and the nested one is effectively a no-op,
>> just refcounts the transaction, so you need equal amounts of commit()'s,
>> implicit or explicit, and an abort() cancels all nested transactions. The
>> purpose of this is to allow composition, where a piece of code that needs a
>> transaction can start one locally, independently of whether the caller
>> already had one going.
>
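[Illustrative sketch, not part of the original messages.] The refcounting reading of nested transactions described above can be modeled in a few lines. RefCountedTransaction and its method names are hypothetical, and the sketch deliberately ignores the lock-subset and static-versus-dynamic questions raised in the reply that follows:

```typescript
// Hypothetical model of the refcounting described above; not a spec API.
class RefCountedTransaction {
  private depth = 0;
  private aborted = false;
  committed = false;

  begin(): void {
    if (this.aborted || this.committed) throw new Error("transaction already finished");
    this.depth += 1; // a nested begin is a no-op apart from the count
  }

  commit(): void {
    if (this.aborted || this.committed) throw new Error("transaction already finished");
    if (this.depth === 0) throw new Error("commit() without a matching begin()");
    this.depth -= 1;
    if (this.depth === 0) this.committed = true; // only the outermost commit takes effect
  }

  abort(): void {
    this.depth = 0;
    this.aborted = true; // cancels every nesting level at once
  }

  get isAborted(): boolean {
    return this.aborted;
  }
}
```

With this model, a helper can call begin() and commit() around its own work whether or not its caller already started the transaction, which is the composition case the quoted paragraph has in mind.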
> I believe it's actually a bit more tricky than what you said.  For example,
> if we only support static transactions, will we require that any nested
> transaction only request a subset of the locks the outer one took?  What if
> we try to start a dynamic transaction inside of a static one?  Etc.  But I
> agree it's not _that_ tricky and I'm also not convinced it's an "advanced"
> feature.
>
> I'd suggest we take it out for now and look at re-adding it when the basics
> of the async API are more solidified.  I hope we can get it into v1, but we
> have too much in the air right now as is.
>
>> Schema versioning:
>> It's unfortunate that we need to have explicit elements in the page for
>> the versioning protocol to work, but the fact that we can have a reliable
>> mechanism for pages to coordinate a version bump is really nice. For folks
>> that don't know about this the first time they build it, an explicit error
>> message on the schema change timeout can explain where to start. I do think
>> that there may be a need for non-breaking changes to the schema to happen
>> without a "version dance". For example, query processors regularly create
>> temporary tables during sorts and such. Those shouldn't require any
>> coordination (maybe we allow non-versioned additions, or we just introduce
>> temporary, unnamed tables that evaporate on commit() or database
>> close()...).
>
> I agree we should have a way to do non-breaking changes to the schema at some
> point, but I believe it can wait till v2 at this point.  Temporary
> objectStores seem to be the leading reason why people want this now, so
> maybe we should consider adding them to the spec now.  That said, I'm still
> not convinced that there are many use cases where one needs them.
> Everything you can do with a temporary objectStore you should be able to do
> in memory as well.  And thus the only reason to add them is if we're handling
> enough data that some will spill to disk.  And I'm not convinced this will
> be a very mainstream scenario.  Especially since one should be able to do
> merge joins in many cases.
>
> I feel strongly that what Jonas has proposed is what we should do for v1.  I
> think he's explained the reasoning behind the API pretty well in the thread.
>
> Other points:
>
> *_NO_DUPLICATES:
> I'm still not convinced we need this in v1.  It will help performance in
> some cases, but it adds more API surface area than immediately meets the
> eye.  If we do decide to have it in v1, we need to resolve the issues Jonas
> brought up.  Ideally we would do this on the thread Jonas started
> ("[IndexedDB] .value of no-duplicate cursors").
>
> Pre-loaded cursors + getAll:
> I'm glad we've decided to take these out for the time being.
>
> J

Received on Wednesday, 14 July 2010 12:21:10 UTC