[Beacon] Off-line behavior, captive portals and error recovery from Kornel Lesinski on 2014-05-14 (public-web-perf@w3.org from May 2014)

From: Kornel Lesinski <kornel@geekhood.net>
Date: Wed, 14 May 2014 21:08:39 +0100
To: public-web-perf@w3.org
Message-Id: <65BEE195-09C3-499A-9CA6-68E8AC739E68@geekhood.net>

I'm looking at the API from the perspective of an off-line capable web app such as http://app.ft.com. The app needs to collect analytics off-line, and some of the collected events are quite important and should not be dropped, even if they cannot be delivered for a long period of time.

The specification only vaguely says UAs should make a "best effort" to deliver the beacons, but doesn't elaborate on how persistent the messages are or how much effort UAs should put into the delivery.

The spec doesn't say whether beacons should persist across browser restarts, or whether they should be kept for minutes, hours, days or weeks. I think the spec should be clearer about how strong the delivery guarantee is, to help all implementations be similarly (un)reliable.

I'm concerned that very different persistence/error recovery strategies across implementations will be a source of bias in analytics. If one UA gives up after 3 tries in 3 seconds, but another UA persists beacons across restarts and keeps retrying forever, then the two will produce very different pictures when beacons are sent while the device is having network difficulties.
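A toy model of that bias, as a rough sketch only. The outage durations are made-up numbers, and `deliveredCount` is a hypothetical helper that reduces each UA's policy to a single "give up after N seconds" value:

```javascript
// Hypothetical model: each beacon is created while the network stays
// unreachable for another `outage` seconds (0 means the network is fine).
function deliveredCount(outages, giveUpAfterSeconds) {
  // A beacon arrives only if the outage ends before the UA gives up.
  return outages.filter((outage) => outage < giveUpAfterSeconds).length;
}

// Made-up sample of outage lengths in seconds (illustration only).
const outages = [0, 0, 1, 2, 30, 600, 86400];

const impatientUA = deliveredCount(outages, 3);         // gives up after 3 seconds
const persistentUA = deliveredCount(outages, Infinity); // keeps retrying forever

console.log(`impatient UA delivered ${impatientUA} of ${outages.length}`);   // 4 of 7
console.log(`persistent UA delivered ${persistentUA} of ${outages.length}`); // 7 of 7
```

With the same events and the same network conditions, the two policies report different totals, so the analytics would differ by UA rather than by user behaviour.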

The spec doesn't say whether UAs should take precautions against captive portals. There is a risk that when a user is connected to a free Wi-Fi hotspot that redirects all requests to a captive portal, UAs may end up sending beacons to the portal and wrongly assume they've been successfully delivered.

I think the spec should explicitly advise UAs to detect and avoid submitting to captive portals (e.g. at the very least treat a 3xx HTTP response as an error). I expect this to be important for web apps used on tablets without built-in 3G: they may be used off-line most of the time and only sporadically connect to public Wi-Fi hotspots. It'd be very unfortunate if every such connection caused all analytics collected off-line to be lost.
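The minimal "treat 3xx as an error" rule could be sketched as below. `classifyBeaconResponse` and its return shape are hypothetical, not anything from the spec. The status would have to come from the UA's own request or from an author-side XHR probe, since sendBeacon() does not expose the response to the page:

```javascript
// Hypothetical helper: decide from the HTTP status whether a beacon can be
// considered delivered, or must stay in the queue for a later retry.
function classifyBeaconResponse(status) {
  if (status >= 200 && status < 300) {
    // The endpoint itself answered and accepted the beacon.
    return { delivered: true, reason: "accepted" };
  }
  if (status >= 300 && status < 400) {
    // A redirect may be a captive portal intercepting the request,
    // so the beacon is kept rather than assumed delivered.
    return { delivered: false, reason: "redirect, possibly a captive portal" };
  }
  // Anything else (4xx, 5xx, or 0 for a network failure) is also a failure.
  return { delivered: false, reason: "error" };
}

console.log(classifyBeaconResponse(204).delivered); // true
console.log(classifyBeaconResponse(302).delivered); // false
```

This only covers portals that redirect. A portal that answers with its own successful response would need a stronger check.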

And finally, the spec doesn't give implementors guidance on how large the send queue should be. I suggest advising UAs to make the send queue as large as the quota for localStorage (or to use the shared storage pool), so that authors can assume sendBeacon() is very reliable and don't have to implement their own backup storage.

To use this API in its current iteration I would have to treat it as another version of the XHR or Image.src hack: I would need to queue messages myself in localStorage and make AJAX requests to verify that the network is usable before handing messages to sendBeacon. This is obviously an unnecessary burden, and it may defeat network utilization optimizations that UAs could otherwise make.
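A rough sketch of what such a wrapper might look like. The storage key, function names and the `/analytics` URL are all hypothetical. `storage` and `send` are passed in, so the same code works with localStorage and navigator.sendBeacon in a browser, and `networkUsable` is assumed to be the result of the author's own network probe:

```javascript
const QUEUE_KEY = "beaconQueue"; // hypothetical storage key

// `storage` is anything with getItem/setItem, e.g. localStorage.
function loadQueue(storage) {
  return JSON.parse(storage.getItem(QUEUE_KEY) || "[]");
}

// Persist the event first, so it survives restarts and long off-line periods.
function enqueue(storage, event) {
  const queue = loadQueue(storage);
  queue.push(event);
  storage.setItem(QUEUE_KEY, JSON.stringify(queue));
}

// Hand queued events to `send` only when the network was verified as usable.
// `send(url, data)` returns true when the data was accepted for transfer,
// like navigator.sendBeacon; that is hand-off, not confirmed delivery.
function flush(storage, send, url, networkUsable) {
  if (!networkUsable) return 0; // keep everything for a later attempt
  const queue = loadQueue(storage);
  let sent = 0;
  // Stop at the first refusal so the remaining events stay queued in order.
  while (sent < queue.length && send(url, JSON.stringify(queue[sent]))) {
    sent++;
  }
  storage.setItem(QUEUE_KEY, JSON.stringify(queue.slice(sent)));
  return sent;
}

// In a browser this might be used as (not executed here):
//   enqueue(localStorage, { type: "pageview" });
//   flush(localStorage, (u, d) => navigator.sendBeacon(u, d), "/analytics", probeResult);
```

Even with this wrapper, an event is removed as soon as the UA accepts it, so anything the UA later fails to deliver is still lost.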

OTOH if the spec guaranteed persistence across browser restarts, resilience against prolonged off-line state and captive portals, and a very large send queue across all UAs, then the API would be much more useful and I could use it without any wrappers, as plain fire-and-forget.

--
regards, Kornel

Received on Wednesday, 14 May 2014 20:09:10 UTC