Re: [IndexedDB] IDBObjectStore.delete should accept a KeyRange from Jonas Sicking on 2011-10-28 (public-webapps@w3.org from October to December 2011)

From: Jonas Sicking <jonas@sicking.cc>
Date: Fri, 28 Oct 2011 09:55:45 -0700
To: Israel Hilerio <israelh@microsoft.com>
Cc: Joshua Bell <jsbell@chromium.org>, "public-webapps@w3.org" <public-webapps@w3.org>, Adam Herchenroether <aherchen@microsoft.com>, Victor Ngo <vicngo@microsoft.com>, Jim Wordelman <jaword@microsoft.com>
Message-ID: <CA+c2ei_1eMoJSWQ6vtLKLaaUDBeT7Lq=bmVLtGUSwj1mGTotsQ@mail.gmail.com>
On Fri, Oct 28, 2011 at 9:29 AM, Israel Hilerio <israelh@microsoft.com> wrote:
> On Thursday, October 27, 2011 6:00 PM, Jonas Sicking wrote:
>> On Thu, Oct 27, 2011 at 9:33 AM, Israel Hilerio <israelh@microsoft.com>
>> wrote:
>> > On Wednesday, October 26, 2011 6:38 PM, Jonas Sicking wrote:
>> >> On Wed, Oct 26, 2011 at 11:31 AM, Israel Hilerio
>> >> <israelh@microsoft.com>
>> >> wrote:
>> >> > On Wednesday, October 26, 2011 9:35 AM, Joshua Bell wrote:
>> >> >>On Tue, Oct 25, 2011 at 4:50 PM, Israel Hilerio
>> >> >><israelh@microsoft.com>
>> >> wrote:
>> >> >>>On Monday, October 24, 2011 7:40 PM, Jonas Sicking wrote:
>> >> >>>>
>> >> >>>> While I was there it did occur to me that the fact that the
>> >> >>>> .delete function "returns" (through request.result in the async
>> >> >>>> API) true/false depending on if any records were removed or not
>> >> >>>> might be
>> >> bad for performance.
>> >> >>>>
>> >> >>>> I suspect that this highly depends on the implementation and
>> >> >>>> that in some implementations knowing if records were deleted
>> >> >>>> will be free and in others it will be as costly as a .count()
>> >> >>>> and then a .delete(). In yet others it could depend on if a
>> >> >>>> range, rather than a key, was used, or if the objectStore has
>> >> >>>> indexes which might need
>> >> updating.
>> >> >>>>
>> >> >>>> Ultimately I don't have a strong preference either way, though
>> >> >>>> it seems unfortunate to slow down implementations for what
>> >> >>>> likely is a
>> >> rare use case.
>> >> >>>>
>> >> >>>> Let me know what you think.
>> >> >>>>
>> >> >>>> / Jonas
>> >> >>>>
>> >> >
>> >> >>>To clarify, removing the return value from the sync call would
>> >> >>>change its
>> >> return signature to void.  In this case, >>successfully returning
>> >> from the IDBObjectStore.delete call would mean that the information
>> >> was successfully
>> >> >>deleted, correct?  If the information was not successfully deleted,
>> >> >>would we
>> >> throw an exception?
>> >> >>>
>> >> >>>In the async case, we would keep the same return value of
>> >> >>>IDBRequest for
>> >> IDBObjectStore.delete.  The only change is >>that the request.result
>> >> would be null, correct?  If no information is deleted or if part of
>> >> the keyRange data is deleted, >>should we throw an error event?  It seems
>> reasonable to me.
>> >> >
>> >> >>When you write "If no information is deleted ... should we throw an
>> >> >>error
>> >> event?" do you mean (1) there was no >matching key so the delete was
>> >> a no- op, or (2) there was a matching key but an internal error
>> >> occurred preventing the >delete? I ask because the second clause, "if
>> >> part of the keyRange data is deleted, should we throw an error
>> >> event?" >doesn't make sense to me in interpretation (1) since I'd expect
>> sparse ranges in many cases.
>> >> >
>> >> > I was originally referring to (1) and (2).  However, after
>> >> > discussing this with
>> >> a couple of folks we believe that the better approach would be to:
>> >> > * continue to return true or false in the result.  This will take
>> >> > care of (1) and
>> >> the successful deletion of all records.
>> >> > * publish an error event if (2).  What I meant by (2) is that if
>> >> > there was a
>> >> successful set of matches that were able to be returned by the
>> >> keyRange, we should guarantee the deletion of all the matches or none.
>> >> >
>> >> > However, (2) brings up a bigger issue.  We are basically saying
>> >> > that if we
>> >> support deletion of keyRanges we are guaranteeing that the batch
>> >> operation will all happen or none of it will happen.  This implies
>> >> some type of inner- transaction associated only with the delete
>> >> operation, which could also be rolledback as part of the
>> >> outer-transaction.  Otherwise, you could potentially preventDefault
>> >> on any record that failed to be deleted and have your database in some
>> type of inconsistent state.  Was that the intent?
>> >>
>> >> This is already the case. For example when a inserting a value into
>> >> an object store the implementation might need to go update several
>> >> indexes. Updating one of these indexes might result in the violation
>> >> of a 'unique' constraint at which point all changes to all indexes as
>> >> well as the change to the object store must be rolled back. However
>> >> no other changes done as part of the transaction should be rolled
>> >> back (unless the resulting "error" event isn't canceled).
>> >>
>> >> This is required in step 7 of the "Steps for asynchronously executing
>> >> a request" (though I now see that it's missing in the "Steps for
>> >> synchronously executing a request").
>> >>
>> >> dvcs.w3.org/hg/IndexedDB/raw-file/tip/Overview.html#steps-for-
>> >> asynchronously-executing-a-request
>> >>
>> >> In the firefox implementation we create a "mini transaction" for each
>> >> database request and if any part of the request fail we just roll
>> >> back the "mini transaction".
>> >
>> > You are correct!  I forgot that we also do something similar.
>> > So if we fail to remove any one record from the keyRange set, should we
>> throw an InvalidStateError, UnknownError, other?
>>
>> I think for failed reads/writes like that we should use UnknownError.
>> InvalidStateError indicates a fault on the side of the web page which isn't the
>> case here.
>>
>> During our internal security review of IndexedDB we came to the conclusion
>> that for IO errors we generally will not want to use more specific errors than
>> UnknownError for fear of exposing sensitive information about the user's
>> environment. At least unless there are clear use cases for it.
>>
>> >> >>In the async case, interpretation (1) matches Chrome's current behavior:
>> >> success w/ null result if something was >deleted, error if there was
>> >> nothing to delete. But I was about to land a patch to match the spec:
>> >> success w/ true/false, >so this thread is timely.
>> >> >
>> >> > Our current implementation matches the spec, so we are okay keeping
>> >> > it the
>> >> way it is in the spec.  In either case, we need to figure out what to
>> >> do when the partial deletion of keyRange records takes place.
>> >>
>> >> So you are not worried that this might make it impossible to do
>> >> further optimizations in the future?
>> >>
>> >> Ultimately I'm ok either way. I don't have enough database experience
>> >> to judge if we might loose out on important performance optimizations
>> >> in the future. And I definitely think it's the case that if we make
>> >> it return null then we'll force some people to do a .count() and then
>> >> .delete() which is almost always slower.
>> >>
>> >> / Jonas
>> >>
>> >
>> > We don't see this being a perf issue for our implementation.  While it could
>> be an issue for some implementations, based on our experience we don't see
>> this being a big perf hit.
>>
>> Ok, if no-one has performance concerns about this (which somewhat
>> surprises me), then let's leave the spec as-is. I.e. returning true/false
>> indicating if any records were deleted.
>>
>> Or should we return a count indicating how many records were deleted?
>>
>> / Jonas
>
> We believe count could potentially introduce a perf hit on the delete operation.  We see how this could have more dependencies on the backend DB.  While I can see how some users could care about the number of records that got deleted, our users/devs are more concerned about whether the operation was successful or not vs. the number of records that got deleted.
>
> To that effect, we rather keep it as a Boolean.

Hmm.. why would getting an exact count when deleting a range be more
expensive than getting an exact count when deleting a single key (the
latter being required in order to return true/false as currently
specified)?

This isn't so much an argument for returning a count as it is an
argument for not returning true/false :)

/ Jonas
Received on Friday, 28 October 2011 16:56:45 UTC