- From: Jonas Sicking <jonas@sicking.cc>
- Date: Thu, 1 Jul 2010 14:26:20 -0700
- To: ben turner <bent@mozilla.com>
- Cc: Jeremy Orlow <jorlow@chromium.org>, Webapps WG <public-webapps@w3.org>
Replying to get Ben's email to the list as it seems to be stuck in moderation...

On Thu, Jul 1, 2010 at 1:38 PM, ben turner <bent@mozilla.com> wrote:
> I would also point out that throwing exceptions at the call site makes debugging much easier in my opinion. Our error events currently don't include information like filename and line number where the failing request was generated (though I think we should add that eventually). Exceptions are much easier to track down, diagnose, and fix.
>
> -Ben
>
> On Thu, Jul 1, 2010 at 2:03 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>> On Wed, Jun 30, 2010 at 8:16 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
>>> I've thought about this more and have some additional doubts inline.
>>> On Thu, Jul 1, 2010 at 11:55 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>>
>>>> On Wed, Jun 30, 2010 at 6:42 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
>>>> > On Thu, Jul 1, 2010 at 11:17 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>> >>
>>>> >> Hi All,
>>>> >>
>>>> >> Currently the IndexedDB specification is silent on what should happen if IDBObjectStore.add, IDBObjectStore.put, IDBObjectStore.remove, IDBCursor.update or IDBCursor.remove() is called from a READ_ONLY transaction. There are two possible ways we can handle this:
>>>> >>
>>>> >> 1. We can throw an exception.
>>>> >> 2. We can return an IDBRequest object and asynchronously fire an 'error' event on this object.
>>>> >>
>>>> >> The advantage of 1 is that we pretty much know that this was an error due to a bug in the web page,
>>>
>>> I don't see why this is compelling. Many of the other errors that you still propose we fire via the callback are due to bugs in the page.
>>>
>>>>
>>>> >> and we can always know this synchronously without having to consult the database.
>>>
>>> So? Just because we can doesn't mean we should.
>>>
>>>>
>>>> >> Throwing an error means that all the existing infrastructure for error handling will automatically kick in. For example, any higher-level try/catch constructs will have an opportunity to catch the error. Implementations generally report uncaught exceptions to an error log. The browser will fire an 'error' event on the window which the page can use for further logging. Firing an error event, on the other hand, does not allow the browser to automatically log the error in a console, as the page hasn't yet gotten a chance to handle it.
>>>
>>> Sure, but this doesn't help the majority of error conditions in IndexedDB. It also ignores the cost of handling errors in 2 different ways.
>>
>> I'm arguing that people generally won't want to check for all types of errors, such as writing in a READ_ONLY transaction. Compare to the Node.appendChild function, which throws for a number of conditions, yet I've never seen anyone put a "try" around it and check for HIERARCHY_REQUEST_ERR.
>>
>> Additionally, anything you do can likely in theory throw exceptions. For example, if you specify the wrong number of arguments or call a function that doesn't exist, this will result in an exception. However, you rarely see people wrap "try" around calls to handle this type of bug.
>>
>> Another example is the line:
>>
>> db.objectStore("myObjectStore").get(53)
>>
>> which can result in errors both being thrown and being reported as an error event. If the objectStore "myObjectStore" doesn't exist, then the .objectStore() call will return null and the JS engine will throw due to trying to call a function on a null value.
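To make the two channels concrete, here is a minimal sketch of the call above, assuming the draft API shape used in this thread (a synchronous db.objectStore() accessor and a returned IDBRequest with an 'error' event); the names follow this discussion, not necessarily the final spec:

    var request;
    try {
      // Synchronous channel: if "myObjectStore" does not exist, .objectStore()
      // returns null and the JS engine throws when .get() is called on it.
      request = db.objectStore("myObjectStore").get(53);
    } catch (ex) {
      console.error("bug at the call site:", ex);
    }
    if (request) {
      // Asynchronous channel: failures discovered while consulting the
      // database come back later as an 'error' event on the request.
      request.onerror = function (event) {
        console.error("request failed:", event);
      };
    }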
>> Error checking is mostly useful when there are ways you can actually handle the error. I think generally if someone calls .put during a READ_ONLY transaction, they simply have a bug in their program and there is little they can do in terms of "handling" that, short of simply logging it.
>>
>>>> >> The advantage of 2 is that this is consistent with other error conditions, such as writing duplicate keys, disk errors during writing the database to disk, internal errors in the database, etc.
>>>
>>> The other problem is that users then need 2 sets of error handling routines for each call. Given how difficult it is to get web developers to do any error checking, requiring 2 types of checks seems like a big downside.
>>
>> Actually, this is somewhat faulty math. With exceptions you can put a try/catch high up in a call stack to catch errors from multiple different locations. This will catch all errors thrown from all function calls inside the try. Compare that to error events, which have to be manually installed at every single request site. From this point of view, the more errors we move to being reported as exceptions, the easier we're making it for developers to catch more errors.
>>
>> JavaScript has an error reporting mechanism, exceptions. We should try to make good use of it.
>>
>>>> >> While consistency, and only needing to check for errors one way, are certainly good arguments, I would argue that people won't need to check for calling-add-on-read-only-transactions. For properly written code it's not an error that will occur, and thus there is no need to check for it. In fact, you probably are generally better off letting the exception bubble all the way up and get logged or caught by generic error handlers.
>>>
>>> These are awfully bold assumptions. Simply not catching the error is not an option for many web applications or libraries. At the very least, they'll need to add finally statements to handle such cases.
>>
>> How would one handle the case of .put being called on a READ_ONLY transaction?
>>
>>>> >> Additionally, the structured clone algorithm defines that an exception should synchronously be thrown if the object is malformed, for example if it consists of a cyclic graph. So .add/.put/.update can already throw under certain circumstances.
>>>> >>
>>>> >> Also compare this to a different API strategy where objectStores and cursors returned from READ_ONLY transactions don't have mutating functions. In that case, if someone tried to call .put(), that also would result in an exception from the JS interpreter stating that you're calling a function that doesn't exist.
>>>> >>
>>>> >> So I would argue that we should throw for at least all transaction violations, i.e. whenever you try to perform an action not allowed by the current transaction. This would also cover the case of calling createObjectStore/removeObjectStore/createIndex/removeIndex during a non-setVersion-transaction.
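As a sketch of what that would look like to a page author: one try/catch high in the call stack is enough to catch a READ_ONLY violation thrown synchronously, with no per-request onerror handler. markSeen() is a hypothetical app function, and the transaction() signature and READ_ONLY constant are approximations of the draft API being discussed here:

    function markSeen(db, key) {
      var trans = db.transaction(["myStore"], IDBTransaction.READ_ONLY);
      // Bug: attempting a write inside a READ_ONLY transaction.
      trans.objectStore("myStore").put({id: key, seen: true});
    }

    try {
      markSeen(db, 53);    // the bad .put() throws synchronously...
    } catch (ex) {
      console.error(ex);   // ...and a single try/catch higher up catches it
    }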
>>>> >> There is also another case where we synchronously know that an error will be reported. We could throw when IDBCursor.update() is called when the underlying object store uses in-line keys and the property at the key path does not match the key in this cursor's position. In this case we similarly immediately know that there is an error without having to consult the database. We also generally can be sure that there is a bug in the web page which would benefit from being reported like other bugs are.
>>>> >>
>>>> >> And like stated above, IDBCursor.update() can already throw if the passed-in object can't be structurally cloned.
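A sketch of both situations, assuming "cursor" is an IDBCursor positioned on the record with key 42 in a store created with the in-line key path "id"; the cyclic-graph case follows the structured clone behaviour described in this thread:

    function tryUpdates(cursor) {
      // Fine: the property at the key path matches the cursor's key.
      cursor.update({id: 42, name: "Ann"});

      try {
        // Mismatch at the key path: detectable without consulting the
        // database, so it is a candidate for a synchronous throw.
        cursor.update({id: 7, name: "Ann"});
      } catch (ex) {
        console.error(ex);
      }

      var cyclic = {};
      cyclic.self = cyclic;     // a value this thread describes structured clone as rejecting
      try {
        cursor.update(cyclic);  // already throws synchronously per the HTML5 draft cited here
      } catch (ex) {
        console.error(ex);
      }
    }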
>>>> >> Jeremy previously asked if there was a test we could use to clearly/intuitively break error conditions into two groups: ones that cause exceptions to be thrown, and ones that cause error events to be fired. I would say that errors that do not depend on what data is in the database, but rather are clearly due to errors at the call site, should throw an exception.
>>>> >
>>>> > This would limit us in the future in terms of schema changes. The current async interface defers starting the transaction until the first call that accesses/modifies data (which are all async). If we ever allow a schema change to happen without disconnecting all clients, it'd be possible that the objectStore could be deleted between when the call is made and when the transaction is actually allowed to start.
>>>>
>>>> I'm not quite following here. Even if we in the future allow objectStores to be deleted while there are transactions open against them, then .add/.put would still know if we're inside a READ_ONLY or READ_WRITE transaction, no? And so could still throw an error if we're in a READ_ONLY transaction.
>>>>
>>>> By the test defined above, .put would in that situation have to fire an error event, rather than throw, if properly called on a READ_WRITE transaction, but where the objectStore had been deleted. This is because we would have to check with the database if the objectStore was deleted and thus would fail the "do not depend on what data is in the database" check.
>>>
>>> I don't see how the existence of an objectStore can be considered "data". Sure, an objectStore _contains_ data and that data would no longer be accessible, but I don't see how the existence can be considered anything other than meta data.
>>
>> It's a matter of definition I guess. So change the rule to be:
>>
>> I would say that errors that do not depend on what information is in the database, but rather are clearly due to errors at the call site, should throw an exception.
>>
>> If we're changing the APIs to allow for objectStores to be removed at any time by another process, then the information of which objectStores actually exist would only live in the database. (Which is why I think that such an API would be a bad idea, but I realize of course that that isn't your point and that this is just a hypothetical example.)
>>
>>>> > This also will limit what can be done on a background thread. For example, an implementation couldn't do serialization of the object on a background thread (yes, if you did this, you'd need to make sure the main thread didn't modify it until it finished serializing).
>>>>
>>>> For what it's worth, HTML5 already defines that the structured clone algorithm throws an exception, so that's not really something introduced by me in this thread.
>>>
>>> Is there any technical reason this has to throw rather than raising the error through the callback? I don't see one. We already made the change to WebKit so that we can opt out of throwing on a serialization error.
>>
>> Both Worker.postMessage and History.pushState are defined to throw exceptions when the structured clone algorithm fails, despite both likely communicating with other threads or processes and thus both potentially benefiting from a theoretical off-main-thread serialization implementation. I have not heard of anyone complaining that this is a problem so far.
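A sketch of the two precedents cited above. A function value is used as something the structured clone algorithm cannot copy (the thread's cyclic-graph example is described as behaving the same way under the draft of the time), and "worker.js" is just a placeholder script name:

    var uncloneable = { callback: function () {} };   // functions cannot be structured-cloned

    var worker = new Worker("worker.js");
    try {
      worker.postMessage(uncloneable);   // the clone fails, so this throws synchronously
    } catch (ex) {
      console.error("postMessage refused the value:", ex);
    }

    try {
      history.pushState(uncloneable, "", "/somewhere");   // likewise throws synchronously
    } catch (ex) {
      console.error("pushState refused the value:", ex);
    }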
>> Another reason to throw for an invalid object is that storing an invalid value is something I could actually imagine being possible to handle. For example, code like:
>>
>> try {
>>   trans.objectStore("myStore").put(object);
>> }
>> catch (ex) {
>>   var backup = object.extraInfo;
>>   object.extraInfo = null;
>>   try {
>>     trans.objectStore("myStore").put(object);
>>   }
>>   finally {
>>     object.extraInfo = backup;
>>   }
>> }
>>
>> This would be much harder to implement using error events while keeping the semantics exactly the same. For example, you'd have to hope that 'object' doesn't change between the initial call to put and the error event firing.
>>
>>>> I also think that the problem you describe in parentheses effectively prevents you from doing background serialization, at least without first copying so much data that you could also perform the constraint checks at the same time.
>>>
>>> Some implementations might be able to copy on write efficiently. You could also put a lock around the data, lock it before main thread execution begins again, and have the main thread take the lock before modifying the data. This was also just one example of processing that could be done in the background. And even if none of them are very compelling, keep in mind that we're backing ourselves into a corner in terms of what we can add in the future (without breaking this heuristic).
>>
>> Both of these sound highly theoretical. I have not heard of any efficient scripting engine implementing either copy-on-write or grab-lock-on-write. It seems like a lot of overhead to check, for each write operation, whether you need to execute copy or lock semantics. I'd like to hear from a JS engine implementer that this is practically feasible before worrying too much about it.
>>
>> Also note that you'll still have to traverse the whole graph reachable from the passed-in object in order to mark that these objects should have copy-on-write or grab-lock-on-write semantics. So while doing that, it's fairly little overhead to also verify that the object can be structurally cloned, without spending the time actually serializing it.
>>
>> Yes, we're limiting ourselves a little bit. However, it's a limitation we've accepted in other APIs (Worker.postMessage, History.pushState), and so far this hasn't seemed to be a problem.
>>
>> Additionally, for each API we create, we're always adding limitations on what we can do in the future. For example, by defining that all objectStores can be iterated and have a defined order, we're preventing them from being implemented using hash tables. Of course we should be careful, but so far lazy cloning seems pretty theoretical.
>>
>>>> > Because of these reasons, I'm not too excited about this particular heuristic for when to throw vs fire an error callback. I've thought about it a bit and can't think of anything better though, unfortunately. I think I'm still slightly in favor of routing all errors through onerror callbacks and never throwing from a function that returns an IDBResult, but I think there were some good points brought up by Jonas for why throwing on some errors would make sense.
>>>>
>>>> I think HTML5 already forces us to make .put/.add/.update throw in certain circumstances. And I think the benefits that come with exception handling outweigh the theoretical possibility of performing the structured clone in a background thread.
>>>
>>> As I mentioned earlier in the thread, these were examples of small problems in the current spec that I expect will really come back to bite us in the future as we try to add more complex features.
>>
>> Other than the structured clone example, can you point to any other way that this could really bite us?
>>
>>> And I'm not convinced HTML5 forces us to do anything.
>>
>> I agree that that's true.
>>
>> / Jonas
Received on Thursday, 1 July 2010 21:27:16 UTC