Re: [IndexedDB] IDBObjectStore.delete should accept a KeyRange from Jonas Sicking on 2011-11-07 (public-webapps@w3.org from October to December 2011)

From: Jonas Sicking <jonas@sicking.cc>
Date: Sun, 6 Nov 2011 16:13:34 -0800
To: Israel Hilerio <israelh@microsoft.com>
Cc: Joshua Bell <jsbell@chromium.org>, "public-webapps@w3.org" <public-webapps@w3.org>, Adam Herchenroether <aherchen@microsoft.com>, Victor Ngo <vicngo@microsoft.com>, Jim Wordelman <jaword@microsoft.com>
Message-ID: <CA+c2ei8NHzPqOAeWDxzL7mfxqjRvUE1tW3woxdo6XWQrkMcU2A@mail.gmail.com>
On Fri, Oct 28, 2011 at 9:55 AM, Jonas Sicking <jonas@sicking.cc> wrote:
> On Fri, Oct 28, 2011 at 9:29 AM, Israel Hilerio <israelh@microsoft.com> wrote:
>> On Thursday, October 27, 2011 6:00 PM, Jonas Sicking wrote:
>>> On Thu, Oct 27, 2011 at 9:33 AM, Israel Hilerio <israelh@microsoft.com>
>>> wrote:
>>> > On Wednesday, October 26, 2011 6:38 PM, Jonas Sicking wrote:
>>> >> On Wed, Oct 26, 2011 at 11:31 AM, Israel Hilerio
>>> >> <israelh@microsoft.com>
>>> >> wrote:
>>> >> > On Wednesday, October 26, 2011 9:35 AM, Joshua Bell wrote:
>>> >> >>On Tue, Oct 25, 2011 at 4:50 PM, Israel Hilerio
>>> >> >><israelh@microsoft.com>
>>> >> wrote:
>>> >> >>>On Monday, October 24, 2011 7:40 PM, Jonas Sicking wrote:
>>> >> >>>>
>>> >> >>>> While I was there it did occur to me that the fact that the
>>> >> >>>> .delete function "returns" (through request.result in the async
>>> >> >>>> API) true/false depending on if any records were removed or not
>>> >> >>>> might be
>>> >> bad for performance.
>>> >> >>>>
>>> >> >>>> I suspect that this highly depends on the implementation and
>>> >> >>>> that in some implementations knowing if records were deleted
>>> >> >>>> will be free and in others it will be as costly as a .count()
>>> >> >>>> and then a .delete(). In yet others it could depend on if a
>>> >> >>>> range, rather than a key, was used, or if the objectStore has
>>> >> >>>> indexes which might need
>>> >> updating.
>>> >> >>>>
>>> >> >>>> Ultimately I don't have a strong preference either way, though
>>> >> >>>> it seems unfortunate to slow down implementations for what
>>> >> >>>> likely is a
>>> >> rare use case.
>>> >> >>>>
>>> >> >>>> Let me know what you think.
>>> >> >>>>
>>> >> >>>> / Jonas
>>> >> >>>>
>>> >> >
>>> >> >>>To clarify, removing the return value from the sync call would
>>> >> >>>change its
>>> >> return signature to void.  In this case, >>successfully returning
>>> >> from the IDBObjectStore.delete call would mean that the information
>>> >> was successfully
>>> >> >>deleted, correct?  If the information was not successfully deleted,
>>> >> >>would we
>>> >> throw an exception?
>>> >> >>>
>>> >> >>>In the async case, we would keep the same return value of
>>> >> >>>IDBRequest for
>>> >> IDBObjectStore.delete.  The only change is >>that the request.result
>>> >> would be null, correct?  If no information is deleted or if part of
>>> >> the keyRange data is deleted, >>should we throw an error event?  It seems
>>> reasonable to me.
>>> >> >
>>> >> >>When you write "If no information is deleted ... should we throw an
>>> >> >>error
>>> >> event?" do you mean (1) there was no >matching key so the delete was
>>> >> a no- op, or (2) there was a matching key but an internal error
>>> >> occurred preventing the >delete? I ask because the second clause, "if
>>> >> part of the keyRange data is deleted, should we throw an error
>>> >> event?" >doesn't make sense to me in interpretation (1) since I'd expect
>>> sparse ranges in many cases.
>>> >> >
>>> >> > I was originally referring to (1) and (2).  However, after
>>> >> > discussing this with
>>> >> a couple of folks we believe that the better approach would be to:
>>> >> > * continue to return true or false in the result.  This will take
>>> >> > care of (1) and
>>> >> the successful deletion of all records.
>>> >> > * publish an error event if (2).  What I meant by (2) is that if
>>> >> > there was a
>>> >> successful set of matches that were able to be returned by the
>>> >> keyRange, we should guarantee the deletion of all the matches or none.
>>> >> >
>>> >> > However, (2) brings up a bigger issue.  We are basically saying
>>> >> > that if we
>>> >> support deletion of keyRanges we are guaranteeing that the batch
>>> >> operation will all happen or none of it will happen.  This implies
>>> >> some type of inner- transaction associated only with the delete
>>> >> operation, which could also be rolledback as part of the
>>> >> outer-transaction.  Otherwise, you could potentially preventDefault
>>> >> on any record that failed to be deleted and have your database in some
>>> type of inconsistent state.  Was that the intent?
>>> >>
>>> >> This is already the case. For example when a inserting a value into
>>> >> an object store the implementation might need to go update several
>>> >> indexes. Updating one of these indexes might result in the violation
>>> >> of a 'unique' constraint at which point all changes to all indexes as
>>> >> well as the change to the object store must be rolled back. However
>>> >> no other changes done as part of the transaction should be rolled
>>> >> back (unless the resulting "error" event isn't canceled).
>>> >>
>>> >> This is required in step 7 of the "Steps for asynchronously executing
>>> >> a request" (though I now see that it's missing in the "Steps for
>>> >> synchronously executing a request").
>>> >>
>>> >> dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#steps-for-
>>> >> asynchronously-executing-a-request
>>> >>
>>> >> In the firefox implementation we create a "mini transaction" for each
>>> >> database request and if any part of the request fail we just roll
>>> >> back the "mini transaction".
>>> >
>>> > You are correct!  I forgot that we also do something similar.
>>> > So if we fail to remove any one record from the keyRange set, should we
>>> throw an InvalidStateError, UnknownError, other?
>>>
>>> I think for failed reads/writes like that we should use UnknownError.
>>> InvalidStateError indicates a fault on the side of the web page which isn't the
>>> case here.
>>>
>>> During our internal security review of IndexedDB we came to the conclusion
>>> that for IO errors we generally will not want to use more specific errors than
>>> UnknownError for fear of exposing sensitive information about the user's
>>> environment. At least unless there are clear use cases for it.
>>>
>>> >> >>In the async case, interpretation (1) matches Chrome's current behavior:
>>> >> success w/ null result if something was >deleted, error if there was
>>> >> nothing to delete. But I was about to land a patch to match the spec:
>>> >> success w/ true/false, >so this thread is timely.
>>> >> >
>>> >> > Our current implementation matches the spec, so we are okay keeping
>>> >> > it the
>>> >> way it is in the spec.  In either case, we need to figure out what to
>>> >> do when the partial deletion of keyRange records takes place.
>>> >>
>>> >> So you are not worried that this might make it impossible to do
>>> >> further optimizations in the future?
>>> >>
>>> >> Ultimately I'm ok either way. I don't have enough database experience
>>> >> to judge if we might loose out on important performance optimizations
>>> >> in the future. And I definitely think it's the case that if we make
>>> >> it return null then we'll force some people to do a .count() and then
>>> >> .delete() which is almost always slower.
>>> >>
>>> >> / Jonas
>>> >>
>>> >
>>> > We don't see this being a perf issue for our implementation.  While it could
>>> be an issue for some implementations, based on our experience we don't see
>>> this being a big perf hit.
>>>
>>> Ok, if no-one has performance concerns about this (which somewhat
>>> surprises me), then let's leave the spec as-is. I.e. returning true/false
>>> indicating if any records were deleted.
>>>
>>> Or should we return a count indicating how many records were deleted?
>>>
>>> / Jonas
>>
>> We believe count could potentially introduce a perf hit on the delete operation.  We see how this could have more dependencies on the backend DB.  While I can see how some users could care about the number of records that got deleted, our users/devs are more concerned about whether the operation was successful or not vs. the number of records that got deleted.
>>
>> To that effect, we rather keep it as a Boolean.
>
> Hmm.. why would getting an exact count when deleting a range be more
> expensive than getting an exact count when deleting a single key (the
> latter being required in order to return true/false as currently
> specified)?
>
> This isn't so much an argument for returning a count as it is an
> argument for not returning true/false :)

Based on discussions at TPAC, it sounded like microsoft were fine with
dropping the return value here and preferred to return undefined in
all situations for delete. If so, I'll go ahead and change the spec to
that effect.

Let me know if this is an ok interpretation. And also speak up if
anyone disagrees.

/ Jonas
Received on Monday, 7 November 2011 00:20:57 UTC