- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Fri, 17 Oct 2008 17:53:16 +0100
- To: HTML WG <public-html@w3.org>
- CC: Ian Hickson <ian@hixie.ch>
Ian Hickson wrote:
> On Fri, 21 Mar 2008, Sunava Dutta wrote:
>>
>> Storage.remainingSpace
>>
>> A straightforward and popular request, this API provides a script to
>> check the remaining persistent storage spec available to it, in bytes.
>> It's a very useful feature to allow pages to manage their store better.
>>
>> * <Open Issue> We currently return bytes but perhaps returning the
>> number of characters is more useful? We'd love to hear thoughts here...
>
> The problem with this feature is that there are a number of ways to store
> data, and thus no way to know exactly how much data can be stored.
>
> For example, if the UA stores data in UTF-8 characters, the number of
> characters left to store will vary based on what characters are to be
> stored. Similarly, if the UA stores data in a compressed fashion, the
> number of bytes will vary based on how compressible the data is.
> [...]
> Thus this API really can't easily work in an interoperable fashion.

This seems like it could be a useful feature if it could be made to
work, so I'll try to propose the idea of a remainingSpacePercentage.

> [...] we don't want to preclude user agents from [...]

One additional thing we don't want to preclude is unlimited storage
space, e.g. a user might say a photo-manipulation web app should be
given as much space as it wants (until disk space runs out and the
browser dies and it corrupts all the configuration files it tries to
save while exiting, or whatever). That can't be handled nicely with
remainingSpace; it could be changed from int to float so it can be
Infinity, but that's a bit yucky.

> Furthermore, we don't want to preclude user agents from dynamically
> increasing the amount of available storage based on user actions, for
> example the UA could automatically increase the storage every time the
> user interacts with the page, or could prompt the user to increase the
> storage when it gets to 80%.

If, at any instant, storing a new value of some particular length will
cause an over-quota exception, then clearly there is a space limit at
that instant, so it's no different to a non-dynamically-sized storage
area. If it won't cause an exception (ignoring rare cases like running
out of physical disk space) regardless of its length, then the storage
can just be considered unlimited. So dynamic sizing doesn't seem like a
new problem, if the static and unlimited cases can already be handled.

I can think of two main use cases:

(1) Indicating to the user how much space is available, like in Gmail's
"You are currently using 153 MB (2%) of your 7204 MB", so they know
whether they need to delete some of their old data. There are four
pieces of information that might be relevant: the bytes(/characters/etc)
used, the bytes(/etc) remaining, the total bytes(/etc) available, and
the percentage used. The most useful for humans is the percentage - I
have no idea how many bytes a typical email is, so I wouldn't be helped
much by "You have 618KB remaining", but if I see I'm only using 38% of
the space after a few months then I know I don't need to worry yet.

(2) Automatically cleaning up old/temporary data (e.g. caches) when
running out of space, to recover space for new data. That cleanup could
happen as late as possible (i.e. just as you're about to store new data
which doesn't fit), in which case the current setItem out-of-space
exception seems adequate - you can wrap setItem in a function that tries
to set, catches the exception, cleans up the cache and then tries again.
Or it could happen at some earlier time, e.g. when the user is idle and
won't mind a bit of a pause while you clean up old data. That behaviour
could be very application-dependent: it's determined by how big the
caches are, how much data will be saved, how much space needs to be made
available, how often the cleanup process will run, etc.
Or it could be quite simple: if free space drops below 5%, expire old
data until there's none left or free space reaches 15%. I don't know
what people would want in practice, so I'll hope the latter is adequate.

I'm sure there must be other cases, but I don't know what they are.
(What were the specific use cases that prompted IE to add
remainingSpace?)

Then, some possible solutions:

~ ~ ~ ~

No API; the UA can just provide UI to view the available/used storage
space for the current domain.

Pros:
* Maximally simplifies API.
* Prevents authors abusing the API and causing non-interoperability
  problems.

Cons:
* Most users won't have any idea how to access that UI. Sites that want
  users to know how much space they're using shouldn't be forced to give
  instructions like "If you are using Firefox, open the Tools menu and
  click the whatever button etc. If you are using IE, open the ..."
  because that's horrid.
* Doesn't help the cache-cleanup use case.

~ ~ ~ ~

As before, but with a <bb type=managestoragequota>.

Pros:
* Same API considerations as before.
* Lets pages make the storage UI discoverable to users (e.g. popping up
  a dialog box saying "you only have space for 10 more emails, _click_
  _here_ to allocate more storage space").

Cons:
* Requires the user to click a link before being able to see their
  quota, which is not a good user experience.
* Doesn't help the cache-cleanup use case.

~ ~ ~ ~

remainingSpace API, exactly like what IE8b2 does:

Calculates space from the sum of key lengths plus value lengths,
measured in UTF-16 code units (i.e. non-BMP characters count as 2).

Pros:
* Allows pages to present some kind of storage space status to the user.
* Allows pages to accurately predict whether enough space will be left
  after storing some more stuff in the future, so they can tidy up
  caches and expire old data until happiness ensues.

Cons:
* Doesn't make it easy for pages to present particularly useful storage
  space status - you can't tell how much space is available in total
  (except by manually summing the lengths of all your stored data), so
  you can't give a usage percentage.
* Doesn't correspond to physical storage space used (e.g. IE8b2's
  nominal ~5M character limit lets me store a million two-character keys
  and take up 60MB of file space because of the overhead in the storage
  format), which is particularly bad for resource-constrained devices
  where there's a hard limit on physical storage space. So I expect many
  browsers would be unwilling to implement it exactly like this.
* Doesn't handle unlimited storage gracefully.

~ ~ ~ ~

remainingSpace API, but with no strict definition so browsers can
measure whatever they want:

Pros:
* Allows browsers to report and limit the physical storage space used
  (regardless of character encoding, compression, etc), instead of only
  being allowed to limit key/value characters.

Cons:
* Will be very different between implementations, so pages are quite
  likely to rely on non-interoperable details (e.g. assuming that if
  remainingSpace >= 100 then they can safely store anything where
  key.length+value.length <= 39 and don't need to check for out-of-space
  exceptions; or assuming that if remainingSpace = 1e6 then they can
  tell the user there's space for about a thousand 1KB images; or
  various other plausible situations).

~ ~ ~ ~

remainingSpacePercentage API:

Returns the approximate percentage of space remaining. If there is an
unlimited quota, it must return 100. Otherwise, it must return an
integer value between 1 and 99 (inclusive). (It intentionally avoids 0,
to prevent people mistakenly thinking they can tell when the storage
area is completely full.) It should return a value that decreases
linearly in the amount of data stored, but isn't required to (e.g. it
could go up when you add new data, because maybe it suddenly compresses
much better than before).
(The name is too long so it should probably be renamed, and maybe it
should be switched from 'remaining' to 'used'.)

Pros:
* Allows pages to present storage space status to the user in whatever
  way they feel is appropriate (e.g. text ("You're using 38% of your
  local storage space"), graphical progress bar, flashing warnings when
  98% full, etc).
* Allows pages to trigger automatic cleanups when the usage exceeds some
  threshold.
* Discourages the abuse cases where pages might depend on non-portable
  implementation details, e.g. it doesn't tell them how many
  bytes/characters/etc are available and it's too imprecise for them to
  try to calculate the total.
* Handles unlimited storage in a way that lets sophisticated pages
  detect it and present it nicely to users, and lets dumb pages ignore
  it entirely and treat it just like limited storage and it'll still act
  sensibly.

Cons:
* Doesn't allow pages to know how many bytes are available, so:
  * They can't tell a technically-knowledgeable user how many bytes are
    available (but the user can still use their browser UI if they
    really want to check).
  * They can't clean up caches in advance of out-of-space errors with
    much accuracy, since they don't know how much space is really
    available.
* Still provides ways for pages to be non-interoperable, e.g. they could
  save a megabyte of data and see how the percentage changes to work out
  the approximate total amount of space available and extrapolate from
  that. But that's a bit obscure, and I can't think of any
  obvious-but-wrong cases that people are likely to write accidentally.

~ ~ ~ ~

Have I missed many significant details and issues here?

-- 
Philip Taylor
pjt47@cam.ac.uk
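
As a rough illustration of the two cleanup strategies from use case (2),
here is a minimal sketch in JavaScript. Everything in it is an
assumption made for illustration: MemoryStorage is a hypothetical
in-memory stand-in for a storage area (so the sketch runs outside a
browser), its quota is counted as key length plus value length in UTF-16
code units as described for IE8b2, the "QuotaExceededError" name and the
"cache:" key prefix are invented, and remainingSpacePercentage exists
only as the proposal in this message, not as a shipped API.

```javascript
// Hypothetical in-memory stand-in for a storage area with a fixed quota.
class MemoryStorage {
  constructor(quota = Infinity) {
    this.quota = quota;      // limit in UTF-16 code units, or Infinity
    this.items = new Map();  // a Map iterates in insertion order (oldest first)
  }
  // Space used: sum of key lengths plus value lengths.
  get used() {
    let total = 0;
    for (const [key, value] of this.items) total += key.length + value.length;
    return total;
  }
  getItem(key) {
    return this.items.has(key) ? this.items.get(key) : null;
  }
  removeItem(key) {
    this.items.delete(key);
  }
  setItem(key, value) {
    value = String(value);
    const old = this.items.has(key) ? key.length + this.items.get(key).length : 0;
    if (this.used - old + key.length + value.length > this.quota) {
      const err = new Error("storage quota exceeded");
      err.name = "QuotaExceededError"; // assumed name, for this sketch only
      throw err;
    }
    this.items.set(key, value);
  }
  // Proposed attribute: 100 if unlimited, otherwise an integer in 1..99.
  get remainingSpacePercentage() {
    if (this.quota === Infinity) return 100;
    const pct = Math.round((100 * (this.quota - this.used)) / this.quota);
    return Math.min(99, Math.max(1, pct));
  }
}

// Hypothetical eviction policy: drop the oldest "cache:" entry.
// Returns false when there is nothing left to evict.
function evictOldestCacheEntry(storage) {
  for (const key of storage.items.keys()) {
    if (key.startsWith("cache:")) {
      storage.removeItem(key);
      return true;
    }
  }
  return false;
}

// Late cleanup: try to set, catch the over-quota exception, evict one
// cache entry, and try again until it fits or nothing is left to evict.
function setItemWithCleanup(storage, key, value, evict = evictOldestCacheEntry) {
  for (;;) {
    try {
      storage.setItem(key, value);
      return;
    } catch (err) {
      if (err.name !== "QuotaExceededError" || !evict(storage)) throw err;
    }
  }
}

// Early cleanup: if free space has dropped below `low` percent, expire
// old data until none is left or free space reaches `high` percent.
function cleanupIfLow(storage, evict = evictOldestCacheEntry, low = 5, high = 15) {
  if (storage.remainingSpacePercentage >= low) return;
  while (storage.remainingSpacePercentage < high && evict(storage)) {
    // keep evicting
  }
}
```

A page could call setItemWithCleanup wherever it would otherwise call
setItem, and run cleanupIfLow from an idle-time handler. The second
function only needs the percentage, which is the point of the proposal:
it never has to know how many bytes or characters the quota really is.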
Received on Friday, 17 October 2008 16:53:51 UTC