Re: Background sync & push messaging: declarative vs imperative from John Mellor on 2014-01-02 (public-webapps@w3.org from January to March 2014)

From: John Mellor <johnme@google.com>
Date: Thu, 2 Jan 2014 17:33:10 +0000
To: Maciej Stachowiak <mjs@apple.com>
Cc: public-webapps <public-webapps@w3.org>, "public-sysapps@w3.org" <public-sysapps@w3.org>, Peter Beverloo <beverloo@google.com>, Michael van Ouwerkerk <mvanouwerkerk@google.com>, Mounir Lamouri <mlamouri@google.com>
Message-ID: <CAG_kaUZyxur4pvfkQ-s_2ccYPER+=bCcW88_m6aLEvmT6HN3Fg@mail.gmail.com>
On Thu, Dec 19, 2013 at 9:32 PM, Maciej Stachowiak <mjs@apple.com> wrote:

>
> On Dec 19, 2013, at 9:02 AM, John Mellor <johnme@google.com> wrote:
>
> [cross-posted to public-webapps and public-sysapps]
>
> A couple of us from Chrome have taken a holistic look at how we could add
> standardized APIs for web apps to execute/sync in the background.
>
> This is an important capability, yet can be safely granted to
> low-privilege web apps, as long as battery consumption and metered data
> usage are (very) carefully limited.
>
>
> Running arbitrary JS in the background, without the relevant page open,
> and with full networking capabilities, does not seem safe to me. How did
> you conclude that this capability can safely be granted to low-privilege
> web apps?
>

Good question (and great examples!). Low-privilege is relative - I did not
intend that any website could use this without asking; instead some
combination of heuristics like the following might be required:

   - user added webapp to their homescreen (or bookmarked it?)
   - user frequently visits webapp domain
   - user accepted permission dialog/infobar

> Some of the specific threats I'd be concerned about include:
>
> (1) An attacker could use this capability to spread a botnet that can be
> used to mount DDOS attacks or to perform offline distributed computations
> that are not in the user's interest, without needing a code execution
> exploit or sandbox escape.
>

Some mitigating factors:

   - The same-origin policy would apply as usual, so you'd only be able to
   DDOS hosts that allow CORS (and the app's own host).
   - Both network requests and computations (Bitcoin mining?) would be
   heavily throttled by the browser in order to conserve battery; if we base
   the throttling on how often the webapp is used, and penalise expensive
   background syncs, it would be possible to enforce that total execution time
   in the background is limited to some small multiple (possibly even 1x or
   below) of the total execution time in the foreground.
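
As a rough illustration of that last point, here is a minimal sketch of
the bookkeeping a UA might do. The class name, method names and the 0.5
multiple are all hypothetical; nothing here comes from a spec or an
existing implementation.

```javascript
// Hypothetical UA-side bookkeeping: background time is capped at a fixed
// multiple of the time the user has spent with the app in the foreground.
class BackgroundBudget {
  constructor(multiple = 1.0) {
    this.multiple = multiple; // allowed background ms per foreground ms
    this.foregroundMs = 0;
    this.backgroundMs = 0;
  }

  recordForeground(ms) { this.foregroundMs += ms; }
  recordBackground(ms) { this.backgroundMs += ms; }

  // Background execution time the app may still use.
  remainingMs() {
    return Math.max(0, this.foregroundMs * this.multiple - this.backgroundMs);
  }

  // A sync is allowed only if its estimated cost fits in the remaining budget.
  canRun(estimatedMs) { return estimatedMs <= this.remainingMs(); }
}

const budget = new BackgroundBudget(0.5);
budget.recordForeground(60000); // one minute of foreground use
budget.recordBackground(20000); // twenty seconds already spent in background
console.log(budget.remainingMs()); // 10000
```

An expensive sync simply drains the budget faster, so an app that is
rarely opened ends up with very little background time.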

> (2) An attacker could use this capability to widely spread an innocuous
> looking background service which, based on remote command-and-control,
> attempts zero-day exploits against the user's browser at a time of the
> attacker's choosing. This can happen without visiting the malicious site,
> indeed, while a user doesn't have a browser open, and can possibly happen
> faster than even the fastest-updating browser could plausibly patch.
>

Yes, this is a subtle downside of the imperative approach. For frequently
used apps it probably doesn't make a huge difference; conversely if we
stopped syncing completely for apps that haven't been used in a long time,
this concern might be minimized.


> These both seem unsafe to me, compared to the security of the Web as it
> stands today. And I don't think a permissions dialog would be a sufficient
> mitigation, since it is hard to explain the danger to users.
>

On the other hand, native apps can already do both of the above unsafe
activities, and installing them is effectively just a permissions dialog
away. App store review processes may filter out undesirable behavior in the
static parts of apps, but struggle to keep up with dynamically-updating
content, e.g. in embedded WebViews.

Thanks, and best wishes for the new year,
John

> Your semi-declarative proposal seems potentially less dangerous, though I
> haven't thought through all of the risks in detail yet.
>
> Regards,
> Maciej
>
>
> This is not a spec proposal - instead I’m looking for feedback on the
> high-level choice between an imperative or declarative set of APIs.
>
>
> USE CASES
>
> 1. Sync when next online
> - Need to send emails & document edits written offline as soon as the
> device goes back online (even if app is not open), and get back sending
> status, or for documents the result of merging the user's diff with any
> changes made by other users.
> e.g. document editing / email client / instant messaging / play-by-mail
> games
>
> 2. Background sync
> - Need to periodically fetch new content in the background, so user will
> have fresh content if they open the app offline.
> - Content often depends on device location (hence requires polling).
> e.g. news / weather / maps / social stream
>
> 3. Large background transfers
> - Need to upload/download large files. App won't remain in the foreground
> for long enough, so this must happen in the background; and even in the
> background, the app is unlikely to be allowed to stay open long enough
> (e.g. iOS 7 limits the execution time of Background Fetch to 30 seconds <http://www.objc.io/issue-5/multitasking.html>),
> so there needs to be a way of handing over large transfers to the
> browser/OS to perform efficiently without the app having to keep running.
> - Ideally such transfers would auto-resume after device reboots.
> e.g. file syncing / movies / video camera
>
> 4. Push notifications
> - Need instant push notification of messages received by the server, or as
> soon as the device comes online if it was offline (even if app is not open).
> e.g. email client / instant messaging / play-by-mail games
>
> 5. Push to sync (tickle)
> - Sometimes a push needs to trigger a sync, for example to fetch email
> attachments.
> e.g. file & document syncing / email client / play-by-mail games
>
> 6. Delayed local notifications
> - Need to show notifications at some time in the future.
> - Various possible interactions with timezone changes and daylight savings
> time.
> e.g. egg timer / alarm clock / calendar
>
> 7. Delayed remote notifications
> - Consider a calendar app: it must reliably show notifications for
> remotely-added events, even if the app hasn’t been opened since the event
> was added. Hence there must be some mechanism for a push message from the
> server to cause a local notification to be set at some time in the future.
> e.g. calendar / cloud synced alarm clock
>
>
> Solving all of these requires a variety of different APIs. But it's
> important that they fit together well, and it turns out that the API design
> choices are closely interrelated.
>
> There are two broad approaches to providing such capabilities on the web:
>
>
> A) IMPERATIVE APPROACH
>
> One approach is to allow JavaScript to be run on demand in the background.
> The prime candidate for this is to extend ServiceWorkers <https://github.com/slightlyoff/ServiceWorker/blob/master/explainer.md> to be a generic mechanism for receiving events in the background (without a
> page having to be open).
>
> 1. Sync when next online
>
> To address use case #1, add an API that requests your ServiceWorker to get
> woken up once in the background, as soon as possible after the device next
> goes online. This could be an extension of the Task Scheduler API <http://www.w3.org/2012/sysapps/web-alarms/>.
> This API should probably only be available when running in the foreground.
>
> navigator.taskScheduler.addOneShot({
>     requireOnline: true
> }, myData);
>
> 2. Background sync
>
> Similarly, to address use case #2, let web apps request their
> ServiceWorker to be periodically run in the background, at an interval of
> the UA’s choice (using aggressive heuristics <http://lists.w3.org/Archives/Public/public-sysapps/2013Nov/0039.html> to choose the most battery/data-efficient moment to run it); though we
> could allow the developer to specify a minimum interval (i.e. if content
> updates daily, you could specify that there’s no point syncing more often).
>
> navigator.taskScheduler.addRepeating({
>     minInterval: "2h"  // UA will pick a >= interval
> }, myData);
>
> We’d probably strictly limit the execution time of the ServiceWorker to
> conserve battery (by comparison, in native apps iOS 7 limits the
> execution time of Background Fetch to 30 seconds <http://www.objc.io/issue-5/multitasking.html>).
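
To illustrate the minInterval idea from "2. Background sync" above, here
is a minimal sketch of how a UA might combine the developer's minimum
with its own preferred delay. The "2h" string format is taken from the
example; the unit letters, function names and parameters are assumptions
made for illustration only.

```javascript
// Hypothetical parser for interval strings such as "2h".
// Assumed units: s (seconds), m (minutes), h (hours), d (days).
function parseIntervalMs(text) {
  const match = /^(\d+)([smhd])$/.exec(text);
  if (!match) throw new Error("Unrecognised interval: " + text);
  const unitMs = { s: 1000, m: 60000, h: 3600000, d: 86400000 };
  return Number(match[1]) * unitMs[match[2]];
}

// The UA may wait longer than the developer's minimum, but never shorter.
function nextRunTime(lastRunMs, minInterval, uaPreferredDelayMs) {
  const delay = Math.max(parseIntervalMs(minInterval), uaPreferredDelayMs);
  return lastRunMs + delay;
}

// The UA would like to sync after 30 minutes, but the app asked for >= 2h.
console.log(nextRunTime(0, "2h", 30 * 60000)); // 7200000
```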
>
> 3. Large background transfers
>
> Although we’d be limiting background execution time of the ServiceWorker,
> we can compensate by making it possible to initiate long-running
> up/downloads, managed by the browser/OS.
>
> This could be done by letting you mark async XMLHttpRequests as
> "persistent", somewhat like the Beacon API <http://www.w3.org/TR/beacon/>,
> except that the browser would periodically retry requests if the device was
> offline, and make the results available to the page or ServiceWorker next
> time the web app is launched.
>
> Alternatively we could reuse ServiceWorker’s scriptable Cache objects <https://github.com/slightlyoff/ServiceWorker/blob/master/caching.md>.
> You’d get syntax something like this:
>
> caches.set("movie-cache", new Cache(
>     "http://my-app.com/The%20Lion%20King.mp4",
>     "http://my-app.com/The%20Jungle%20Book.mp4"
> ));
>
> This would cause the UA to download these large files in the background.
> Once all the downloads in the cache are finished, they would show up in
> caches.get("movie-cache").
>
> 4. Push notifications & 5. Push to sync (tickle)
>
> To address use cases #4 and #5, implement something like the Push API <http://www.w3.org/TR/push-api/>,
> allowing servers to remotely wake up a ServiceWorker. The ServiceWorker can
> then sync, show a notification, etc. (For battery life reasons, silent push
> messages probably need to be throttled, in which case we could consider
> adding a never-throttled variant for messages that show a user-visible
> notification).
>
> PUT /push/send/device-c0fa407591 HTTP/1.1
> Host: browser-push-server.com
>
> version=5
>
> 6. Delayed local notifications
>
> To address use case #6, extend the Notifications API <http://notifications.spec.whatwg.org/> so notifications can exist independently of the lifetime of a web page, and
> when activated will wake up the ServiceWorker or web page and fire an event
> on them.
>
> There also needs to be a way to fire a notification after a precise time
> delay; the natural way to do this would be to also allow exact scheduling
> of time sensitive tasks using the Task Scheduler API <http://www.w3.org/2012/sysapps/web-alarms/>,
> but for battery reasons we shouldn’t give web apps carte blanche here;
> instead to prevent abuse we might want to only allow exact scheduling if
> you also show a user-visible notification. So we could either silently
> throttle precisely scheduled tasks if it turns out they’re not showing
> notifications, or we could not implement exact scheduling in Task Scheduler
> at all, and instead extend Notifications to have an optional time delay.
>
> new Notification("Meeting about to begin", {
>     utcTime: 1386011460303,
>     body: "Room 101",
>     launchUrl:  "https://my-app.com/?from=notification"
> });
>
> 7. Delayed remote notifications
>
> By combining the APIs from the previous two sections, you can have a push
> message handler that schedules a delayed notification, satisfying use case
> #7.
>
>
> B) (SEMI-)DECLARATIVE APPROACH
>
> Alternatively, it would be possible to solve almost all these use cases
> without having to run JavaScript in the background, with a combination of
> declarative APIs. This seems to be a new idea, so it warrants more detailed
> explanation.
>
> A key premise of this approach is that it’s always ok to wake up the
> device/app in order to display a user-visible notification, since these
> provide value to the user (or if they don’t, the user can deal with it by
> revoking the permissions of the web app). But other than that, it’s best to
> limit battery/bandwidth/RAM consumption in the background.
>
> *These are just pseudo-APIs to demonstrate possible capabilities and not
> actual proposals.*
>
> 1. Sync when next online & 3. Large background transfers
>
> For use cases #1 and #3, uploads and downloads need to happen in the
> background, and in a declarative world these would need to be fully managed
> by the browser. As in the imperative approach to use case #3, we could
> reuse ServiceWorker’s scriptable Cache objects <https://github.com/slightlyoff/ServiceWorker/blob/master/caching.md> for background downloads, either from a ServiceWorker, or by exposing them
> to ordinary web pages:
>
> caches.set("movie-cache", new Cache(
>     "http://my-app.com/The%20Lion%20King.mp4",
>     "http://my-app.com/The%20Jungle%20Book.mp4"
> ));
>
> This would cause the UA to download these large files in the background.
> Once all the downloads in the cache are finished, they would show up in
> caches.get("movie-cache").
>
> Similarly background uploads are sometimes necessary. For use case #1
> (e.g. email), it is important that the background upload happen as soon as
> possible once the device goes back online; for use case #3 (e.g. video
> sharing), the background uploads need to support uploading large files,
> however timeliness is less important, and it might be best for the upload
> to only happen over WiFi. It might still be possible for the two to use the
> same syntax; for example we could allow adding <https://github.com/slightlyoff/ServiceWorker/issues/118> a
> Request <https://github.com/slightlyoff/ServiceWorker/blob/062ecbc967e11969adef85fd044a3fab0cdf7e1c/service_worker.ts#L210> object to a Cache instead of just URLs:
>
> caches.set("outbox", new Cache(new Request({
>     method: "POST",
>     url: "http://my-app.com/send-mail",
>     body: my_mime_multipart_message
> }), ...));
>
> The UA would perform these requests in the background, and store the
> response in the cache object. As for urgency, perhaps there could be some
> "urgent_request_by_user" flag that the web developer can set to indicate
> that the user explicitly requested this and intends it to be sent over
> cellular data.
>
> 2. Background sync
>
> The mechanism above is powerful, but sometimes instead of a one-shot
> up/download you need to sync data that regularly updates.
>
> Push can be great for that; but sometimes you have to poll, for example
> weather/map apps syncing forecasts/tiles based on your current location.
> And sometimes (though perhaps rarely) even when you can push it’s more
> efficient to poll smartly - for example if my social network feed has new
> posts several times a minute, I really don’t want it waking up my phone
> radio every few minutes while I’m asleep (especially if I forgot to plug it
> into a charger).
>
> We could extend the caches mechanism above with the ability for the UA to
> periodically check -- in the background -- for updates to the files in the
> cache (using standard HTTP caching logic). The UA could optionally include
> the user’s geolocation in a header when doing so.
>
> caches.set("sync-cache", new Cache(url1, url2, ...));
> caches.requestBackgroundSyncing("sync-cache", {
>     geolocation: true
> });
>
> The client would indicate with cookies or somesuch any user preferences
> about what should be synced. The server could update the cookies to
> indicate how much it has synced, so during the next sync it knows what to
> do.
>
> However this is still quite limiting, as it requires the client to know in
> advance the URLs of the files the server will want it to download. It would
> probably lead to hacks where clients request that a bunch of meaningless
> foo1, foo2, foo3 urls get synced, and the actual content of those urls gets
> rotated server-side. A more flexible approach would add indirection, and
> let the server provide a barebones "manifest" file, that would just be a
> newline-separated list of URLs to sync.
>
> caches.set("sync-cache", new CacheManifest("/sync-manifest"));
> caches.requestBackgroundSyncing("sync-cache", {
>     geolocation: true
> });
>
> The UA would periodically check for updates to the manifest file, fetch
> any new URLs, update existing ones, and delete cache entries for any URLs
> removed from the manifest. Updates would presumably be atomic (sync all
> files in the manifest before updating the cache exposed to the page). I’ve
> glossed over various details, such as what constitutes an update to the
> manifest file, but it should be possible to define something reasonable,
> learning from the lessons of AppCache.
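
To make the manifest behaviour described above concrete, here is a
minimal sketch of the comparison a UA might run on each check. It
assumes only what the text says (a newline-separated list of URLs); the
function names and the shape of the returned plan are hypothetical.

```javascript
// Parse a newline-separated manifest into a list of URLs, skipping blanks.
function parseManifest(text) {
  return text
    .split(/\r?\n/)
    .map(line => line.trim())
    .filter(line => line !== "");
}

// Compare the URLs currently cached with those the manifest now lists.
// URLs in `toRevalidate` would be refreshed with standard HTTP caching logic.
function planSync(cachedUrls, manifestText) {
  const wanted = new Set(parseManifest(manifestText));
  const cached = new Set(cachedUrls);
  return {
    toFetch: [...wanted].filter(url => !cached.has(url)),
    toRevalidate: [...wanted].filter(url => cached.has(url)),
    toDelete: [...cached].filter(url => !wanted.has(url)),
  };
}

const plan = planSync(["/a.json", "/b.json"], "/b.json\n/c.json\n");
console.log(plan);
// toFetch: /c.json, toRevalidate: /b.json, toDelete: /a.json
```

For the update to be atomic, the UA would apply the plan to a staging
copy and swap it in only once every fetch has succeeded.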
>
> As with the imperative approach, the UA would use various heuristics <http://lists.w3.org/Archives/Public/public-sysapps/2013Nov/0039.html> to determine when is a good time to sync each app, such as batching, screen
> on, wifi available, charging, or even what times of day each web app is
> typically used.
>
> 4. Push notifications
>
> For use case #4, there needs to be a standardized way for an app’s server
> to instantly push user-visible notifications to the device (with automatic
> retry if the device is offline). Interacting with the notification would
> launch the app, at the given URL.
>
> POST /push/send/device-c0fa407591 HTTP/1.1
> Host: browser-push-server.com
> Content-Type: application/json
>
> {
>     "title": "Message from Ben",
>     "body":  "This should arrive instantly :)",
>     "sound": "https://my-app.com/you-got-mail.mp3",
>     "launchUrl":   "https://my-app.com/?from=notification"
> }
>
> 5. Push to sync (tickle)
>
> In the push notification example above, the notification didn’t contain
> any data other than the notification text, and the URL to launch when the
> notification was clicked. It would be reasonable to add a "data" member to
> the JSON, which would somehow be passed to the web page when the
> notification is clicked.
>
> However most push messaging servers have strict size limits on the payload
> (e.g. 4096 bytes). There are many cases where an app needs to send more
> than this, for example an email client that needs to download the
> attachments that accompany an email, so that if the user clicks the
> notification whilst offline, they will be able to view the attachments.
>
> To handle this case, we can allow a silent push message that kicks off an
> immediate background sync of one or more of the named caches from "2.
> Background sync" above:
>
> POST /push/send/device-c0fa407591 HTTP/1.1
> Host: browser-push-server.com
> Content-Type: application/json
>
> {
>     "updateCaches": ["sync-cache", "attachments-cache"]
> }
>
> 6. Delayed local notifications
>
> For use case #6, there needs to be a way to locally schedule notifications
> with a time delay (since the device might be offline the whole time, you
> can’t rely on push notifications for this). We could extend the Notifications
> API <http://notifications.spec.whatwg.org/> as follows:
>
> new Notification("Meeting about to begin", {
>     utcTime: 1386011460303,
>     body: "Room 101",
>     launchUrl:  "https://my-app.com/?from=notification"
> });
>
> The notification would be delayed until the given moment. A URL is
> provided to launch the app in response to the user clicking on the
> notification, if the app isn’t already running; usually, this URL would be
> available offline due to AppCache or a ServiceWorker <https://github.com/slightlyoff/ServiceWorker/blob/master/explainer.md>.
> An API like Notification.get() <http://notifications.spec.whatwg.org/#dom-notification-get> would let you read and cancel pending notifications.
>
> 7. Delayed remote notifications
>
> A subtle variant of notifications is use case #7. If you add a same day
> event to your cloud calendar with a reminder 10 minutes before the event,
> then the device goes online for a while but the calendar app does not get
> launched by the user, and the device goes back offline for the hours
> leading up to the event, the calendar app needs to be able to fire that
> notification 10 minutes before the event, despite (in this declarative
> model) having been executed neither at the time the push notification
> arrived, nor at any time since then.
>
> Since it doesn’t get executed, such an app can’t locally schedule a
> delayed notification, so instead the push notification API introduced for
> use case #4 could be extended so you can specify a time delay before the
> notification fires (as with delayed local notifications).
>
> POST /push/send/device-c0fa407591 HTTP/1.1
> Host: browser-push-server.com
> Content-Type: application/json
>
> {
>     "utcTime": 1386011460303,
>     "title":   "Meeting about to begin",
>     "body":    "Room 101",
>     "launchUrl":     "https://my-app.com/?from=notification"
> }
>
> We’d need to also support push messages that cancel earlier delayed push
> notifications (e.g. if the event later gets removed from your cloud
> calendar). And the same API that lets you inspect local delayed
> notifications could also read/cancel pushed delayed notifications.
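
A minimal sketch of how a UA might hold delayed pushed notifications and
honour later cancellations. The "id" and "cancel" fields are assumptions
added for illustration; the example message above only carries utcTime,
title, body and launchUrl, and the class and method names are likewise
hypothetical.

```javascript
// Hypothetical UA-side store of pushed notifications awaiting their fire time.
class PendingNotifications {
  constructor() {
    this.pending = new Map(); // id -> pushed message
  }

  // Store a delayed notification, or drop an earlier one if the message
  // is a cancellation (assumed shape: { id, cancel: true }).
  handlePush(message) {
    if (message.cancel) {
      this.pending.delete(message.id);
    } else {
      this.pending.set(message.id, message);
    }
  }

  // Remove and return every notification whose utcTime has passed.
  takeDue(nowMs) {
    const due = [];
    for (const [id, message] of this.pending) {
      if (message.utcTime <= nowMs) {
        due.push(message);
        this.pending.delete(id);
      }
    }
    return due;
  }
}

const store = new PendingNotifications();
store.handlePush({ id: "evt-1", utcTime: 1386011460303, title: "Meeting about to begin" });
store.handlePush({ id: "evt-1", cancel: true }); // event removed from the calendar
console.log(store.takeDue(1386011460303).length); // 0
```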
>
>
> CONCLUSION
>
> The imperative approach (A) seems a cleaner set of APIs, from an extensible
> web <http://extensiblewebmanifesto.org/> point of view.
>
> However whenever a ServiceWorker (or equivalent) runs in the background,
> it requires a full JS interpreter, and all the associated browser machinery
> to support APIs like Geolocation, which together consume a significant
> amount of RAM. This can be a problem on mobile; for example on Windows
> Phone, background agents are restricted to 11 MB of RAM <http://msdn.microsoft.com/en-us/library/windowsphone/develop/hh202942(v=vs.105).aspx#BKMK_ConstraintsforallScheduledTaskTypes> on devices with less than 1 GB, otherwise 20 MB, and are terminated if they
> exceed this limit. This is presumably in order to ensure that the
> foreground app (and other important processes) remain responsive (and don’t
> get evicted). The limits on Android and iOS are vaguer and
> device-dependent, but increased background RAM usage could potentially be a
> deal-killer for the imperative approach.
>
> (The UA can tightly control battery and data usage in both the imperative
> and declarative approaches described above, which is why I’m not focusing
> on those here).
>
> We’re currently thinking of prototyping the imperative approach. But it
> seems that the 2 main capabilities introduced incrementally in section B
> (declarative push message actions, and the caches mechanisms) could provide
> a viable plan B if the imperative approach turns out to use too much RAM.
>
> *Please don’t bikeshed the syntaxes yet*; at this stage the main open
> questions are:
>
>
>    1. Is this a reasonable vision for the set of imperative capabilities?
>    2. Would such declarative capabilities be sufficient to address all
>    important use cases?
>    3. How easy would web developers find developing against such
>    declarative APIs, compared to the imperative approach? The server would
>    play a slightly greater role in driving the sync logic; but that may not be
>    so terrible.
>    4. How much RAM would such a declarative approach ultimately save? Is
>    it worth it?
>
>
> --John
>
>
>
Received on Thursday, 2 January 2014 17:33:58 UTC