[w3c/ServiceWorker] Mixed feelings about ServiceWorker for module loader (#1203)

Here's a brief account of our past attempts to use ServiceWorker in Wikimedia's module loader.

### Current
I'll limit details to just the storing and fetching of modules.  For more, see <https://www.mediawiki.org/wiki/ResourceLoader/Features>.

The client has access to a manifest consisting of a dependency tree and version hashes. The page then instructs the client to load a list of modules for the current page.

Before touching the network, the client first checks `localStorage` for blobs matching the requested keys (moduleName/moduleVersion). Matches are retrieved, parsed and scheduled for global-eval via `requestIdleCallback`. Any remaining modules (not cached, or cached under a different version) are requested from the network in a single batch request. The server responds by providing each module in a way that allows the client to orchestrate execution (honouring the dependency tree). The client also stashes a copy of each module in `localStorage` for future use.
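To make the lookup step concrete, here is a minimal sketch of splitting a module list into cache hits and misses. All names (`partitionModules`, `buildBatchUrl`, the `name@version` key format and the `?modules=a|b` URL scheme) are made up for illustration and are not ResourceLoader's actual API. The store is passed in so that anything with a `localStorage`-style `getItem` works.

```javascript
// Hypothetical sketch, not ResourceLoader's real code.
// `manifest` maps module name -> current version hash.
// `store` is any object with a localStorage-style getItem(key).
function partitionModules(requested, manifest, store) {
  const hits = [];
  const misses = [];
  for (const name of requested) {
    // The version is part of the key, so a stale copy simply never matches.
    const key = name + '@' + manifest[name];
    const blob = store.getItem(key); // null when nothing is stored under this key
    if (blob !== null) {
      hits.push({ name, source: blob });
    } else {
      misses.push(name);
    }
  }
  return { hits, misses };
}

// Build one batch URL for everything that was not cached (assumed URL scheme).
function buildBatchUrl(baseUrl, misses) {
  return baseUrl + '?modules=' + misses.map(encodeURIComponent).join('|');
}
```

In the page, `store` would be `window.localStorage`, the hits would be handed to `requestIdleCallback` for evaluation, and the misses would go out as a single request.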

### Why

_See also <https://meta.wikimedia.org/wiki/Research:Module_storage_performance>._

The main reason is to reduce wasted network bandwidth by creating "perfect cache fragmentation". We previously tried to maintain cache groups manually (the typical "common", "app", "vendor", "page-specific" groups), which would tell the client to put certain modules in a separate request to improve the chances of a client-side cache hit when navigating between pages. But the majority of our modules are page-specific, and some common (but small) modules change frequently. Whenever one changed, the client would re-download the whole batch, even though most of the cached entry was still valid. Unpacking the batch response client-side and caching the modules individually has solved this problem. We also tried disabling batching altogether (both prior to HTTP/2, and more recently), but our findings matched those published elsewhere in the industry: the overhead is non-trivial and it is not a net win.

### Problem

Space in `localStorage` is precious, and our use of it as a module store hampers its traditional use (small values for state management, e.g. remembering the state of a collapsible element) in a way that is increasingly hard to cope with, especially because it lacks any eviction mechanism (e.g. TTL/LRU). In fact, we've had to disable our client-side store in Firefox because our more active users, who visit multiple *.wikipedia.org domains, had their `localStorage` full in no time. At that point any other attempt to write data would fail, with no sensible way to clear it. ([T66721](https://phabricator.wikimedia.org/T66721))
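For illustration, this is roughly what every writer has to do once the quota can be exhausted. It is a sketch with a made-up helper name (`tryStore`), not our actual code. `setItem` throws when the quota is exceeded, and since nothing gets evicted automatically, the caller can only give up or delete entries itself.

```javascript
// Hypothetical helper: `store` is any object with a localStorage-style setItem().
function tryStore(store, key, value) {
  try {
    store.setItem(key, value);
    return true;
  } catch (e) {
    // Browsers throw here (typically a QuotaExceededError) when storage is full.
    // Nothing is evicted for us, so the write is simply lost.
    return false;
  }
}
```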

### ServiceWorker

The first attempt to leverage SW was to basically disable the logic in the main thread and just have it send one network request for all modules at once. The SW thread then does what the main thread used to do: split the batch URL into one-off URLs, check the Cache API for each would-be request URL, hold on to the cached responses while making a request for any cache misses, and, once that comes back, respond to the original request with a concatenation of the cache hits and the network response.
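A rough sketch of that SW-side logic, under stated assumptions: the `?modules=a|b|c` URL scheme and the function names are hypothetical, and writing the freshly fetched modules back into the cache is left out because it depends on how the batch response is unpacked. The cache and fetch function are passed in as parameters.

```javascript
// Hypothetical sketch; URL scheme and names are assumptions.
// Split "…/load?modules=a|b|c" into one would-be URL per module.
function splitBatchUrl(batchUrl) {
  const url = new URL(batchUrl);
  const names = (url.searchParams.get('modules') || '').split('|').filter(Boolean);
  return names.map((name) => {
    const single = new URL(url);
    single.searchParams.set('modules', name);
    return { name, url: single.href };
  });
}

// `cache` needs a Cache-style match(url); `fetchFn` a fetch-style function.
async function respondToBatch(batchUrl, cache, fetchFn) {
  const parts = splitBatchUrl(batchUrl);
  const lookups = await Promise.all(parts.map((part) => cache.match(part.url)));
  const hits = [];
  const missing = [];
  lookups.forEach((response, i) => {
    if (response) hits.push(response);
    else missing.push(parts[i].name);
  });
  // Hold on to the cached bodies while the misses are fetched in one request.
  const chunks = await Promise.all(hits.map((response) => response.text()));
  if (missing.length > 0) {
    const missUrl = new URL(batchUrl);
    missUrl.searchParams.set('modules', missing.join('|'));
    const networkResponse = await fetchFn(missUrl.href);
    chunks.push(await networkResponse.text());
  }
  // Nothing is returned until the network part has arrived.
  return chunks.join('\n');
}

// In a service worker this might be wired up roughly as:
// self.addEventListener('fetch', (event) => {
//   event.respondWith(caches.open('modules').then(async (cache) =>
//     new Response(await respondToBatch(event.request.url, cache, fetch),
//       { headers: { 'Content-Type': 'text/javascript' } })));
// });
```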

Two regressions:
* Delay: Previously, cache hits executed immediately and only cache misses waited for the network. With the above, nothing executes until the last module comes back from the network. Even if we add streaming logic, and even if browsers parse JavaScript programs in a streaming way, execution won't start until the entire response has been received.
* Overhead: Previously, each cache hit was executed separately, which let `requestIdleCallback` combine or spread out execution in a friendly way. (The network response still arrived at an undefined point with a potentially large payload.) With the above, everything is always combined and executes in one large uninterruptible burst.

### Cache API

After considering different ways to address the above, I think the next-best thing is to use the Cache API directly from the main thread and basically drop it in as a direct (async) replacement for `localStorage`. We'd get and set keys in it directly, and keep using `eval()`.
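A minimal sketch of what that drop-in might look like, assuming a synthetic URL per module/version as the cache key. The `/module-store/` path and the function names are made up for illustration, and `cache` would come from something like `await caches.open('modules')`.

```javascript
// Hypothetical sketch: the Cache API as an async key-value store for module source.
// `cache` is a Cache object (or anything with compatible match()/put()).
function cacheKey(name, version) {
  // Cache keys are requests/URLs, so name and version are encoded into a synthetic URL.
  return '/module-store/' + encodeURIComponent(name) + '/' + encodeURIComponent(version);
}

async function getModule(cache, name, version) {
  const response = await cache.match(cacheKey(name, version));
  return response ? response.text() : null; // null on a cache miss
}

async function setModule(cache, name, version, source) {
  const response = new Response(source, {
    headers: { 'Content-Type': 'text/javascript' }
  });
  await cache.put(cacheKey(name, version), response);
}
```

The main thread would then call `getModule()` where it reads `localStorage` today, and still evaluate the returned source itself, so cache hits keep executing independently of the network.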

Thoughts?



https://github.com/w3c/ServiceWorker/issues/1203

Received on Thursday, 5 October 2017 19:17:07 UTC