Re: [whatwg] Persistent and temporary storage

I'd like to share a use case and problem we have at Wikipedia with localStorage.

The MediaWiki software (which Wikipedia runs on) uses a framework called ResourceLoader for bundling and delivering modules to the client. [1][2]

Last year it was changed to use localStorage in addition to optimised HTTP 304 handling, mainly because of two issues we found:

1. Batching is bad for incremental updates.

We combine requests for multiple modules in predictable batches. This means usually only 1 or 2 actual HTTP requests are made for the main payload. However, when one of those modules changes in a deployment, its batch is no longer the same and has to be invalidated in its entirety, so the user has to re-download every other module in that batch as well. Extracting the payload client-side into individual modules stored in localStorage allowed us to re-request only the module that changed from the server and evaluate the rest from localStorage (and update the entry afterward). This reduced bandwidth significantly and improved page load times overall.

I imagine HTTP/2 might make it appropriate to phase out batches and always request modules individually, letting the network layer do the combining and per-resource caching in a more natural way.

2. HTTP 304 hits are not free.

We found that loading JS/CSS from localStorage was faster than hitting an HTTP 304. The difference was large enough to justify this change.
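
To make point 1 more concrete, here is a minimal sketch of a per-module store. It is not the actual ResourceLoader code: the `Manifest` shape, the `modulestore:` key prefix, and the function names `planLoad` and `saveModule` are all assumptions made for illustration. The idea is only that each module is stored under its own key with a version, so a deployment that changes one module invalidates one entry rather than a whole batch.

```typescript
// Hypothetical shapes; the real ResourceLoader manifest and store format differ.
type Manifest = Record<string, string>; // module name -> current version hash

interface StoredModule {
  version: string;
  source: string;
}

// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const MODULE_PREFIX = "modulestore:"; // hypothetical key prefix

// Read one entry, treating unparsable data as a cache miss.
function readModule(store: KeyValueStore, name: string): StoredModule | null {
  const raw = store.getItem(MODULE_PREFIX + name);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as StoredModule;
  } catch (_err) {
    return null;
  }
}

// Split the wanted modules into those that can be evaluated from the store
// and those that must be re-requested because they are missing or stale.
function planLoad(manifest: Manifest, store: KeyValueStore) {
  const cached: Record<string, string> = {};
  const toFetch: string[] = [];
  for (const name of Object.keys(manifest)) {
    const entry = readModule(store, name);
    if (entry !== null && entry.version === manifest[name]) {
      cached[name] = entry.source;
    } else {
      toFetch.push(name);
    }
  }
  return { cached, toFetch };
}

// Store a freshly fetched module under its own key.
function saveModule(store: KeyValueStore, name: string, version: string, source: string): void {
  const entry: StoredModule = { version, source };
  store.setItem(MODULE_PREFIX + name, JSON.stringify(entry));
}
```

In a browser, `planLoad(manifest, window.localStorage)` would return the sources to evaluate directly plus the names to put into the next (smaller) batch request.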

So what went wrong?

Well, the nice thing about regular 304 caching is that as developers we don't have to worry about the size restriction of the store. Whether the browser limits it or not, and whether eviction is FIFO, LRU or the store is simply unlimited, isn't an immediate user-visible concern (it probably should be, but that's for another discussion). When we started using localStorage, users who had once visited pages with lots of functionality enabled found themselves with a full localStorage.

This caused other, more essential, functionality to stop working. For example, logic that previously used cookies to store small state values had been moved to localStorage (to reduce network overhead and because it made semantic sense). Values such as "Boolean: Hide fundraising banner" or "Last 10 autocomplete values" could no longer be saved, because localStorage was filled up with our faux HTTP cache for ResourceLoader. That is unfortunate, since the module store could easily fall back to requesting over HTTP (and usually hit a 304), whereas those state values would never save, causing user-visible problems and functionality not working as expected.
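
One workaround along these lines can be sketched as follows: when writing an essential state value fails, drop the replaceable cache entries and retry. This is an illustrative sketch, not what we deployed. The `modulestore:` prefix and the function names `evictCache` and `setEssential` are assumptions, and the code relies only on `setItem` throwing when the store is full (browsers typically throw a `QuotaExceededError` there).

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  readonly length: number;
  key(index: number): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const CACHE_PREFIX = "modulestore:"; // hypothetical prefix of replaceable entries

// Remove every cache entry and report how many were dropped.
function evictCache(store: KeyValueStore): number {
  const doomed: string[] = [];
  // Collect keys first: removing while iterating would shift the indexes.
  for (let i = 0; i < store.length; i++) {
    const k = store.key(i);
    if (k !== null && k.indexOf(CACHE_PREFIX) === 0) doomed.push(k);
  }
  for (const k of doomed) store.removeItem(k);
  return doomed.length;
}

// Write an essential value; on failure, sacrifice the cache and retry once.
function setEssential(store: KeyValueStore, key: string, value: string): boolean {
  try {
    store.setItem(key, value);
    return true;
  } catch (_err) {
    // Assumed to be a quota error; nothing to free means nothing to retry.
    if (evictCache(store) === 0) return false;
    try {
      store.setItem(key, value);
      return true;
    } catch (_err2) {
      return false;
    }
  }
}
```

The cost of an eviction is only that the next page view re-requests the modules over HTTP, which is the fallback the module store already has.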

We're working around it in different ways (some things went back to cookies) but are still stalled on a long-term solution for this problem. We're considering moving our module store from localStorage to IndexedDB, as that's not being used at the moment. It would provide the same separation we had between cookies and localStorage, in that localStorage would keep working even if IndexedDB was full.

Some thoughts:

* A way to know whether a URL is cached (e.g. whether a request for it would hit HTTP 304) without making the request.
* A way to prioritise which entries should be kept in localStorage and allow for low-prio entries to be evicted if short on space.
* A way to know how much localStorage is available in total.
* Perhaps a way to create a limited store within localStorage or IndexedDB that has limited/restricted capacity (with some unique identifier, capacity percentage-based, or a min/max byte size?).
* A separate store for caching HTTP resources (the Service Worker's Cache API?)
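
To illustrate the prioritisation and limited-store ideas above, here is a rough userland approximation. Nothing like this exists natively, and everything in it is hypothetical: the `BoundedStore` class, the reserved `__index` key, and measuring size as key plus value length in UTF-16 code units (which only approximates what a browser counts against its quota, and ignores the index's own overhead).

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface IndexEntry {
  key: string;
  size: number;     // key + value length in UTF-16 code units (approximate)
  priority: number; // higher means more important
}

const INDEX_KEY = "__index"; // hypothetical reserved key

class BoundedStore {
  private store: KeyValueStore;
  private prefix: string;
  private maxChars: number;
  private index: IndexEntry[];

  constructor(store: KeyValueStore, name: string, maxChars: number) {
    this.store = store;
    this.prefix = name + ":";
    this.maxChars = maxChars;
    this.index = [];
    const raw = store.getItem(this.prefix + INDEX_KEY);
    if (raw !== null) {
      try {
        const parsed: unknown = JSON.parse(raw);
        if (Array.isArray(parsed)) this.index = parsed as IndexEntry[];
      } catch (_err) {
        this.index = []; // corrupt index: start empty
      }
    }
  }

  get(key: string): string | null {
    return this.store.getItem(this.prefix + key);
  }

  // Returns false when the value cannot fit without dropping entries of
  // higher priority. Errors from the underlying store propagate.
  set(key: string, value: string, priority: number): boolean {
    const size = key.length + value.length;
    const others = this.index.filter((e) => e.key !== key);
    let used = others.reduce((sum, e) => sum + e.size, 0);
    // Eviction candidates: lowest priority first, never above the new entry.
    const victims = others
      .filter((e) => e.priority <= priority)
      .sort((a, b) => a.priority - b.priority);
    const evicted: string[] = [];
    for (const victim of victims) {
      if (used + size <= this.maxChars) break;
      used -= victim.size;
      evicted.push(victim.key);
    }
    if (used + size > this.maxChars) return false;

    for (const k of evicted) this.store.removeItem(this.prefix + k);
    this.store.setItem(this.prefix + key, value);
    this.index = others.filter((e) => evicted.indexOf(e.key) === -1);
    this.index.push({ key, size, priority });
    this.store.setItem(this.prefix + INDEX_KEY, JSON.stringify(this.index));
    return true;
  }
}
```

A module cache built this way, e.g. `new BoundedStore(window.localStorage, "cache", 2000000)`, could not crowd out unrelated keys, but it has to do its own bookkeeping, which is exactly what a native facility would spare us.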

— Timo Tijhof
Software Engineer
Wikimedia Foundation

PS: Sorry if this is the wrong avenue for this type of feedback. Thanks in advance.

[1] https://en.wikipedia.org/wiki/MediaWiki
[2] https://www.mediawiki.org/wiki/ResourceLoader/Features

On 13 Mar 2015, at 12:50, Anne van Kesteren <annevk@annevk.nl> wrote:

> A big gap with native is dependable storage for applications. I
> started sketching the problem space on this wiki page:
> 
>  https://wiki.whatwg.org/wiki/Storage
> 
> Feedback I got is that having some kind of allotted quota is useful
> for applications. That way they know how much they can put away.
> However, this clashes a bit with offering something that is
> competitive with native.
> 
> We can't really ask the user to divide up their storage. And yet when
> the user asks an application to store e.g. a whole bunch of music
> offline we don't really want the user agent to get in the way if the
> user already granted persistence.
> 
> 
> -- 
> https://annevankesteren.nl/

Received on Wednesday, 18 March 2015 00:38:59 UTC