[whatwg] Web Storage: apparent contradiction in spec

Not convinced. :)

1. Analogies

The analogy was made comparing a user agent that purges local storage to an
OS throwing out files without explicit user action. This is misleading since
most files arrive on your computer's disk via explicit user action. You copy
files to your disk by downloading them from the internet, copying from a
network drive, from a floppy, your camera, etc. You put them on your disk
and you are responsible for removing them to reclaim space.

There are apps that create files in hidden places such as:

C:\Documents and Settings\linus\Local Settings\Application
Data\Google\Chrome\User Data

If those apps do not manage their space carefully, users get annoyed. If
such an app filled the user's disk they would have no idea what consumed the
space or how to reclaim it. They didn't put the files there. How are they
supposed to know to remove them? Most users have no idea that Local Settings
exists (it is hidden), much less how to correctly manage any files they
find.

A better analogy would be, "What if watching TV caused 0-5MB size files to
silently be created from time to time in a hidden folder on your computer,
and when your disk filled up both your TV and computer stopped working?"

Lengthy discussion on cleaning up hidden resources (persistent background
content) here:
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-July/021421.html

2. Attack

Without automatic space management the local storage consumed will grow
without bound. I'm concerned that even without an intentional DoS attack
users are going to be unhappy about their shrinking disks and not know what
to do about it. The problem is worse on phones.

Things get worse still if a griefer wants to make a point about the
importance of keeping web browsers logically stateless. Here's how such an
attack could be carried out:

2a. Acquire a bunch of unrelated domains from a bunch of registrars using
stolen credit cards. Skip this step if UAs don't group subdomains under the
same storage quota. For extra credit pick names that are similar to
legitimate sites that use local storage.

2b. Start up some web hosting accounts. Host your attack code here. If they
aren't free, use stolen credit cards.

2c. Buy ads from a network that subsyndicates from a network that
subsyndicates from a major ad network that allows 3rd party ad serving.
There are lots to choose from. No money? Stolen credit cards. Serve the ads
from your previously acquired hosting accounts.

2d. Giggle. The user will be faced with the choice of writing off the space,
deleting everything including their precious data, or carefully picking
through tens of thousands of entries to find the few domains that hold
precious content. The user gets really unhappy if the attack manages to fill
the disk.
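
To put rough numbers on the attack above, here is a back-of-the-envelope
sketch. It assumes every attacker-controlled origin fills a 5MB per-domain
allowance and that nothing is ever evicted. The origin counts are
illustrative only, not measurements, and the function name is hypothetical.

```python
# Rough estimate of disk consumed when many origins each fill their quota.
# Assumptions: 5 MB per origin, no eviction; the origin counts are made up.

QUOTA_MB_PER_ORIGIN = 5


def disk_used_gb(origin_count: int, quota_mb: int = QUOTA_MB_PER_ORIGIN) -> float:
    """Return total storage in GB (1 GB = 1024 MB) if every origin fills its quota."""
    return origin_count * quota_mb / 1024


if __name__ == "__main__":
    for origins in (100, 1_000, 10_000):
        print(f"{origins:>6} origins -> {disk_used_gb(origins):.1f} GB")
```

Usage grows linearly with the number of origins, so a per-domain cap alone
does not bound the total.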

3. Incognito / Private Browsing

Chrome's Incognito mode creates a temporary, in-memory profile. Local
storage operations will work, but nothing will be saved after the Incognito
window is closed. Safari takes a different approach and causes local storage
operations to fail when in Private Browsing mode. Some sites won't work in
Private Browsing. I don't recall what Firefox or IE do. Pick your poison.

Lengthy discussion here:
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-April/019238.html
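
The two private-browsing behaviours described above can be modelled with a
toy sketch. This is not browser code: every class and function name below
is hypothetical, and the stores only imitate "writes vanish when the session
ends" versus "writes fail outright". It shows why a site that does not guard
its writes can break under the second behaviour.

```python
# Toy model of two private-browsing storage behaviours.
# All names are hypothetical; none of this is a real browser API.
from typing import Optional


class StorageWriteError(Exception):
    """Raised by a store that refuses writes."""


class EphemeralStore:
    """Writes succeed but live only in memory until the session closes."""

    def __init__(self) -> None:
        self._data = {}

    def set_item(self, key: str, value: str) -> None:
        self._data[key] = value

    def get_item(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def close_session(self) -> None:
        # Closing the private window discards everything that was written.
        self._data.clear()


class RefusingStore:
    """Every write fails; reads never find anything."""

    def set_item(self, key: str, value: str) -> None:
        raise StorageWriteError("storage unavailable in private mode")

    def get_item(self, key: str) -> Optional[str]:
        return None


def save_preference(store, key: str, value: str) -> bool:
    """Try to persist a value; report failure instead of letting the error escape."""
    try:
        store.set_item(key, value)
        return True
    except StorageWriteError:
        return False
```

A site written like save_preference keeps working under either model; one
that calls set_item unguarded only works under the first.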

4. Cache Eviction Algorithms

At a minimum the HTML 5 spec should be silent on how user agents implement
local storage policies. I would prefer the spec to make it clear that local
storage is a cache, that domains can use up to 5MB of space without
interrupting the user, and that UAs are free to implement varying cache
eviction algorithms.
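
As an illustration of one such eviction algorithm, here is a minimal sketch.
It assumes least-recently-used eviction of whole origins, a 5MB per-origin
cap, and a global budget chosen by the UA. The LRU choice, the class name,
and the global budget are assumptions made for the example; nothing here is
mandated by the spec or taken from any browser.

```python
# Minimal LRU eviction sketch for per-origin storage accounting.
# Assumptions: whole-origin eviction, 5 MB per-origin cap, UA-chosen budget.
from collections import OrderedDict

PER_ORIGIN_QUOTA = 5 * 1024 * 1024  # 5 MB per origin


class LruOriginCache:
    """Tracks bytes per origin and evicts whole origins, coldest first."""

    def __init__(self, total_budget: int) -> None:
        self.total_budget = total_budget
        self._usage = OrderedDict()  # origin -> bytes used, oldest access first

    def touch(self, origin: str) -> None:
        """Mark an origin as recently used (for example, after a read)."""
        if origin in self._usage:
            self._usage.move_to_end(origin)

    def total_used(self) -> int:
        return sum(self._usage.values())

    def store(self, origin: str, size: int) -> list:
        """Record `size` more bytes for `origin`; return the origins evicted."""
        current = self._usage.get(origin, 0)
        if current + size > PER_ORIGIN_QUOTA:
            raise ValueError("per-origin quota exceeded")
        self._usage[origin] = current + size
        self._usage.move_to_end(origin)
        evicted = []
        # Drop the coldest origins first, never the one being written.
        while self.total_used() > self.total_budget and len(self._usage) > 1:
            victim, _ = self._usage.popitem(last=False)
            evicted.append(victim)
        return evicted
```

A UA could swap in a different ordering (size, age, user-marked precious
entries) without changing what pages see.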

Some browsers may provide an interface to allow users to specify precious
local storage, some may not. Eviction policies for installed extensions may
be different than those for web pages. Quotas for extensions may be different
than those for web pages. Non-browser UAs such as Dashboard, AIR, etc. may
have different policies.

If the spec requires UAs to maintain local storage as 'precious' it will be
the first such feature in HTML 5. Everything else in the spec is treated as
volatile.

Linus


On Tue, Aug 25, 2009 at 4:36 PM, Jeremy Orlow <jorlow at chromium.org> wrote:

> On Tue, Aug 25, 2009 at 4:18 PM, Brady Eidson <beidson at apple.com> wrote:
>
>>
>> On Aug 25, 2009, at 3:51 PM, Jeremy Orlow wrote:
>>
>> On Tue, Aug 25, 2009 at 3:19 PM, Aaron Boodman <aa at google.com> wrote:
>>
>>> On Tue, Aug 25, 2009 at 2:44 PM, Jeremy Orlow<jorlow at chromium.org>
>>> wrote:
>>>
>>> Extensions are an example of an application that is less cloud-based.
>>> It would be unfortunate and weird for extension developers to have to
>>> worry about their storage getting tossed because the UA is running out
>>> of disk space.
>>
>>
>> Extensions are pretty far out of scope of the spec (at least for now),
>> right?  (Within Chrome, we can of course special case this.)
>>
>>
>> The current spec is about "Web Applications" of all forms, including those
>> that are offline, and others that hope to break free from the *required*
>> chain to the cloud.
>>
>> Extensions based on web technology are just one form of this.
>>  Widgets/gadgets are another.  Stand alone web applications are yet another.
>>  Native applications that integrate HTML for their UI are another still.
>>
>> On Tue, Aug 25, 2009 at 3:09 PM, Jens Alfke <snej at google.com> wrote:
>>
>>> Interesting comments. Linus and Jeremy appear to be coming at this from a
>>> pure "cloud" perspective, where any important or persistent data is kept on
>>> a remote server and the browser, so local storage can be treated as merely a
>>> cache. That's definitely a valid position, but from my perspective, much of
>>> the impetus for having local storage is to be able to support *other* application
>>> models, where important data is stored locally. If browsers are free to
>>> dispose of HTML5 local storage without the user's informed consent, such
>>> applications become dangerously unreliable.
>>> For example, Linus wrote:
>>>
>>> User agents need to be free to garbage collect any local state. If they
>>> can't then attackers (or the merely lazy) will be able to fill up the user's
>>> disk. We can't expect web sites or users to do the chore of taking out the
>>> garbage.
>>>
>>>
>>> Replace "user agent" -> "operating system" and "local state" -> "user
>>> files", and you have an argument that, when the hard disk in my MacBook gets
>>> too full, the OS should be free to start randomly deleting my local files to
>>> make room. This would be a really bad idea.
>>>
>>
>> Well, it's certainly different from what we're used to.  I'm not convinced
>> it's wrong though.  The web has gotten by pretty well with such a model so
>> far.
>>
>>
>> But behind the scenes, developers have shoehorned their own data storage
>> solutions into place because there hasn't been a good solution in place.
>>
>> Why should an app that is largely about client side experience have to
>> store user preferences in cookies and hope they won't be purged, or load a
>> plug-in that has reliable local storage, or sync preferences over the cloud
>> to a server?
>>
>>
>>> Similar analogies:
>>> * If the SD card in my Wii fills up, should the system automatically
>>>   start deleting saved games?
>>> * If my iPhone's Flash disk gets full, should it start deleting photos?
>>>   What if I haven't synced those photos to my iTunes yet?
>>>
>>> In each of those cases, what the device actually does is warn you about
>>> the lack of free space, and lets you choose what to get rid of.
>>>
>>
>> It's worth noting that today, OSs do a pretty poor job of helping you with
>> this task.  (I don't see any reason why the spec will prohibit UAs from
>> creating a good UI for this, though.)
>>
>>
>> I completely agree OSs do a pretty poor job of helping with the task.
>>  Browsers might be an innovating space here.  I challenge you to come up
>> with a great UI for this that shows up in a UA.  I challenge the WHATWG to
>> not decide that deleting user data is okay because it's the easiest way
>> out.
>>
>>
>>
>>> Local storage is different from cloud storage. The HTML5 storage API can
>>> be used for both, so it shouldn't be limited to what's convenient for just
>>> one of them.
>>>
>>
>> I still don't understand what use local storage has outside of 'cloud
>> storage'.  Even in the extensions use case (which I think is out of scope
>> for this spec), there's no reason you can't sync user preferences and such
>> to the cloud.
>>
>>
>> One thing I think that HTML5 has made clear is that "web technologies"
>> are no longer exclusively about "web sites" that exist solely "in the
>> cloud."  Widgets/gadgets, html-based extensions, offline web applications,
>> and native applications that use HTML/JS/CSS to embed parts of their UI are
>> *all* covered by HTML5, and I don't think requiring the cloud for any of
>> them is necessary.
>>
>> Also - and I don't mean this to be flippant, I raise it as a serious point
>> - not all web application developers are Google or Apple with access to a
>> server infrastructure.  To many web developers, just "throwing data up on a
>> server somewhere" is outside the constraints of their resources or their
>> design.
>>
>> The cloud is within the scope of web technologies, but web technologies
>> should not rely on the cloud.
>>
>> By the way, in case it's not clear, my position is not that UAs should
>> take deleting user information lightly, my position is 1) this behavior
>> should be left up to the UA and 2) when possible, developers should keep
>> information in the cloud not local storage.
>>
>>
>> Don't take my hyperboles too seriously - I don't *seriously* think that
>> anyone is suggesting browsers should be light hearted about deleting user
>> data.
>> I think our positions are all pretty clear, and it's coming down to this
>> differing philosophy.
>>
>> But my position is:
>> 1) If this behavior is left up to the UA, then developers who are using
>> web technologies to write applications and who don't want to have to worry
>> about "the cloud" for their data are out of luck, because *one* browser that
>> is willing to delete data behind the scenes ruins their reliable picture of
>> the technology.
>> 2) Developers should not be *forced* into using the cloud when it is not
>> within the scope of what they're trying to develop.
>>
>
> I still think there are non-trivial downsides to treating local storage
> (and presumably database) data as "sacred", but I guess I'm convinced that
> it's the correct way to go.  I hate throwing UI at problems, but it really
> does make a good deal of sense in this case.  Heuristics will still be
> helpful in that we can suggest what to delete to the user, but I guess the
> user should always make the final decision.
>
> Linus, are you convinced?
>
> J
>

Received on Wednesday, 26 August 2009 16:01:49 UTC