- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Fri, 17 Oct 2008 17:53:16 +0100
- To: HTML WG <public-html@w3.org>
- CC: Ian Hickson <ian@hixie.ch>
Ian Hickson wrote:
> On Fri, 21 Mar 2008, Sunava Dutta wrote:
>>
>> Storage.remainingSpace
>>
>> A straightforward and popular request, this API provides a script to
>> check the remaining persistent storage space available to it, in bytes.
>> It's a very useful feature to allow pages to manage their store better.
>>
>> * <Open Issue> We currently return bytes but perhaps returning the
>> number of characters is more useful? We'd love to hear thoughts here...
>
> The problem with this feature is that there are a number of ways to store
> data, and thus no way to know exactly how much data can be stored.
>
> For example, if the UA stores data in UTF-8 characters, the number of
> characters left to store will vary based on what characters are to be
> stored. Similarly, if the UA stores data in a compressed fashion, the
> number of bytes will vary based on how compressible the data is.
> [...]
> Thus this API really can't easily work in an interoperable fashion.
This seems like it could be a useful feature if it could be made to
work, so I'll try to propose the idea of a remainingSpacePercentage.
> [...] we don't want to preclude user agents from [...]
One additional thing we don't want to preclude is unlimited storage
space, e.g. a user might say a photo-manipulation web app should be
given as much space as it wants (until disk space runs out and the
browser dies and it corrupts all the configuration files it tries to
save while exiting, or whatever). That can't be handled nicely with
remainingSpace; it could be changed from int to float so it can be
Infinity, but that's a bit yucky.
> Furthermore, we don't want to preclude user agents from dynamically
> increasing the amount of available storage based on user actions, for
> example the UA could automatically increase the storage every time the
> user interacts with the page, or could prompt the user to increase the
> storage when it gets to 80%.
If, at any instant, storing a new value of some particular length will
cause an over-quota exception, then clearly there is a space limit at
that instant, so it's no different to a non-dynamically-sized storage
area. If it won't cause an exception (ignoring rare cases like running
out of physical disk space) regardless of its length, then the storage
can just be considered unlimited. So dynamic sizing doesn't seem like a
new problem, if the static and unlimited cases can already be handled.
I can think of two main use cases:
(1) Indicating to the user how much space is available, like in Gmail's
"You are currently using 153 MB (2%) of your 7204 MB", so they know
whether they need to delete some of their old data.
There are four pieces of information that might be relevant: the
bytes(/characters/etc) used, the bytes(/etc) remaining, the total
bytes(/etc) available, and the percentage used. The most useful for
humans is the percentage - I have no idea how many bytes a typical email
is, so I wouldn't be helped much by "You have 618KB remaining", but if I
see I'm only using 38% of the space after a few months then I know I
don't need to worry yet.
(2) Automatically cleaning up old/temporary data (e.g. caches) when
running out of space, to recover space for new data.
That cleanup could happen as late as possible (i.e. just as you're about
to store new data which doesn't fit), in which case the current setItem
out-of-space exception seems adequate - you can wrap setItem in a
function that tries to set, catches the exception, cleans up the cache
and then tries again.
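A minimal sketch of such a wrapper, assuming a Storage-like object and an application-supplied cleanup callback. The names setItemWithCleanup and cleanup are hypothetical, and the sketch treats any exception from setItem as an out-of-space condition, which a real page might want to check more carefully:

```javascript
// Hypothetical helper: `storage` is any object with a Storage-like setItem();
// `cleanup` is an application-supplied function that frees space and returns
// true if it managed to remove anything.
function setItemWithCleanup(storage, key, value, cleanup) {
  try {
    storage.setItem(key, value);
    return true;
  } catch (firstError) {
    // Assumption: any exception from setItem means the quota was exceeded.
    if (!cleanup()) {
      return false; // nothing left to expire, so give up
    }
  }
  try {
    // Second and final attempt, after the cache has been cleaned up.
    storage.setItem(key, value);
    return true;
  } catch (secondError) {
    return false;
  }
}
```

The boolean result lets the caller decide what to tell the user when the data still doesn't fit after cleanup.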
Or it could happen at some earlier time, e.g. when the user is idle and
won't mind a bit of a pause while you clean up old data. That behaviour
could be very application-dependent: it's determined by how big the
caches are, how much data will be saved, how much space needs to be made
available, how often the cleanup process will run, etc. Or it could be
quite simple: if free space drops below 5%, expire old data until
there's none left or free space reaches 15%. I don't know what people
would want in practice, so I'll hope the latter is adequate.
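The simple 5%/15% policy could be sketched as follows. Both callbacks are hypothetical: getFreePercent stands in for reading a percentage-style value such as the remainingSpacePercentage proposed in this message, and expireOldest is whatever the application uses to drop one old entry:

```javascript
// Hypothetical: `getFreePercent` returns the current free-space percentage;
// `expireOldest` removes one old entry and returns false once nothing is
// left to expire.
function cleanupIfLow(getFreePercent, expireOldest, low = 5, target = 15) {
  if (getFreePercent() >= low) {
    return 0; // enough free space, nothing to do
  }
  let expired = 0;
  // Expire old data until free space reaches the target or none is left.
  while (getFreePercent() < target && expireOldest()) {
    expired += 1;
  }
  return expired;
}
```

Using two thresholds rather than one avoids running the cleanup again immediately after the next small write.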
I'm sure there must be other cases, but I don't know what they are.
(What were the specific use cases that prompted IE to add remainingSpace?)
Then, some possible solutions:
~ ~ ~ ~
No API; the UA can just provide UI to view the available/used storage
space for the current domain.
Pros:
* Maximally simplifies API.
* Prevents authors abusing the API and causing non-interoperability
problems.
Cons:
* Most users won't have any idea how to access that UI. Sites that
want users to know how much space they're using shouldn't be forced to
give instructions like "If you are using Firefox, open the Tools menu
and click the whatever button etc. If you are using IE, open the ..."
because that's horrid.
* Doesn't help the cache-cleanup use case.
~ ~ ~ ~
As before, but with a <bb type=managestoragequota>.
Pros:
* Same API considerations as before.
* Lets pages make the storage UI discoverable to users (e.g. popping
up a dialog box saying "you only have space for 10 more emails, _click_
_here_ to allocate more storage space").
Cons:
* Requires the user to click a link before being able to see their
quota, which is not a good user experience.
* Doesn't help the cache-cleanup use case.
~ ~ ~ ~
remainingSpace API, exactly like what IE8b2 does:
Calculates space from the sum of key lengths plus value lengths,
measured in UTF-16 code units (i.e. non-BMP characters count as 2).
Pros:
* Allows pages to present some kind of storage space status to the user.
* Allows pages to accurately predict whether enough space will be left
after storing some more stuff in the future, so they can tidy up caches
and expire old data until happiness ensues.
Cons:
* Doesn't make it easy for pages to present particularly useful
storage space status - you can't tell how much space is available in
total (except by manually summing the lengths of all your stored data),
so you can't give a usage percentage.
* Doesn't correspond to physical storage space used (e.g. IE8b2's
nominal ~5M character limit lets me store a million two-character keys
and take up 60MB of file space because of the overhead in the storage
format), which is particularly bad for resource-constrained devices
where there's a hard limit on physical storage space. So I expect many
browsers would be unwilling to implement it exactly like this.
* Doesn't handle unlimited storage gracefully.
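To illustrate the accounting described above, here is a rough sketch of the manual sum a page would need in order to derive a usage percentage. The function names are hypothetical, and the sketch assumes the remaining-space figure is in the same UTF-16 code units as the sum:

```javascript
// Sum of key lengths plus value lengths, in UTF-16 code units.
// A JavaScript string's length already counts UTF-16 code units, so a
// non-BMP character such as U+1D11E contributes 2.
function usedCodeUnits(storage) {
  let total = 0;
  for (let i = 0; i < storage.length; i += 1) {
    const key = storage.key(i);
    total += key.length + storage.getItem(key).length;
  }
  return total;
}

// Usage percentage derived from the manual sum and a remainingSpace-style
// figure. Assumption: both numbers are measured in the same units.
function usedPercent(used, remaining) {
  const total = used + remaining;
  return total === 0 ? 0 : Math.round((used / total) * 100);
}
```

In a browser this might be called as usedPercent(usedCodeUnits(localStorage), remaining), where remaining is whatever the browser reports.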
~ ~ ~ ~
remainingSpace API, but with no strict definition so browsers can
measure whatever they want:
Pros:
* Allows browsers to report and limit the physical storage space used
(regardless of character encoding, compression, etc), instead of only
being allowed to limit key/value characters.
Cons:
* Will be very different between implementations, so pages are quite
likely to rely on non-interoperable details (e.g. assuming that if
remainingSpace >= 100 then they can safely store anything where
key.length+value.length <= 100 and don't need to check for out-of-space
exceptions; or assuming that if remainingSpace == 1e6 then they can
tell the user there's space for about a thousand 1KB images; or various
other plausible situations).
~ ~ ~ ~
remainingSpacePercentage API:
Returns the approximate percentage of space remaining.
If there is an unlimited quota, it must return 100.
Otherwise, it must return an integer value between 1 and 99 (inclusive).
(It intentionally avoids 0, to prevent people mistakenly thinking they
can tell when the storage area is completely full.)
It should return a value that decreases linearly in the amount of data
stored, but isn't required to (e.g. it could go up when you add new
data, because maybe it suddenly compresses much better than before).
(The name is too long so it should probably be renamed, and maybe it
should be switched from 'remaining' to 'used'.)
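One way a UA might compute the value under these rules, as a sketch only. The used and quota inputs are hypothetical and in whatever unit the browser measures internally, and an Infinity quota is used here to model unlimited storage:

```javascript
// Hypothetical UA-side calculation of the proposed value.
function remainingSpacePercentage(used, quota) {
  if (quota === Infinity) {
    return 100; // unlimited quota
  }
  if (quota <= 0) {
    return 1; // degenerate quota: report the lowest allowed value
  }
  const raw = Math.round(((quota - used) / quota) * 100);
  // Clamp to 1..99 so a limited quota never reports 0 or 100.
  return Math.min(99, Math.max(1, raw));
}
```

With this clamping, an empty limited store reports 99 and a completely full one reports 1, so 100 is reserved for the unlimited case.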
Pros:
* Allows pages to present storage space status to the user in whatever
way they feel is appropriate (e.g. text ("You're using 38% of your local
storage space"), graphical progress bar, flashing warnings when 98%
full, etc).
* Allows pages to trigger automatic cleanups when the usage exceeds
some threshold.
* Discourages the abuse cases where pages might depend on non-portable
implementation details, e.g. it doesn't tell them how many
bytes/characters/etc are available and it's too imprecise for them to
try to calculate the total.
* Handles unlimited storage in a way that lets sophisticated pages
detect it and present it nicely to users, and lets dumb pages ignore it
entirely and treat it just like limited storage and it'll still act
sensibly.
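As an illustration of the page side, a sophisticated page could special-case 100 while a simple one could skip that branch and still show something sensible. The function name, the wording of the messages and the 98% warning threshold are assumptions for this sketch:

```javascript
// Hypothetical page-side formatting of the proposed value.
function describeStorage(percentRemaining) {
  if (percentRemaining === 100) {
    return "Storage is unlimited";
  }
  const used = 100 - percentRemaining;
  const text = `You're using ${used}% of your local storage space`;
  // Add a warning once the storage area is nearly full.
  return used >= 98 ? `${text} (almost full)` : text;
}
```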
Cons:
* Doesn't allow pages to know how many bytes are available, so:
  * They can't tell a technically-knowledgeable user how many bytes are
    available (but the user can still use their browser UI if they really
    want to check).
  * They can't clean up caches in advance of out-of-space errors with
    much accuracy, since they don't know how much space is really available.
* Still provides ways for pages to be non-interoperable, e.g. they
could save a megabyte of data and see how the percentage changes to work
out the approximate total amount of space available and extrapolate from
that. But that's a bit obscure, and I can't think of any
obvious-but-wrong cases that people are likely to write accidentally.
~ ~ ~ ~
Have I missed many significant details and issues here?
--
Philip Taylor
pjt47@cam.ac.uk
Received on Friday, 17 October 2008 16:53:51 UTC