Re: follow up on service workers for publishing platform from Daniel Weck on 2015-12-01 (public-digipub-ig@w3.org from December 2015)

From: Daniel Weck <daniel.weck@gmail.com>
Date: Tue, 1 Dec 2015 15:55:54 +0000
To: "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>
Cc: "DPUB mailing list (public-digipub-ig@w3.org)" <public-digipub-ig@w3.org>
Message-ID: <CA+FkZ9G=PWuKXW9hEta-h=XttSDcFLcLCOOUxgx7h_GpHgpzVg@mail.gmail.com>

On Tue, Dec 1, 2015 at 1:55 PM, Siegman, Tzviya - Hoboken
<tsiegman@wiley.com> wrote:

> 1.       The current model [2] requires the content to explicitly refer to
> the JS (service worker). (i.e. the scripts modify the content). Is this a
> problem for PWP?

Three trains of thought:

1)
The main motivation for isolating the context within which content
gets rendered (typically, iframe / webview sandboxes) is that the
scope of authored document styling, scripting, etc. cannot interfere
with the host presentation logic (e.g. the reading system "chrome",
such as TOC overlay, previous/next navigation for pages, chapters,
etc.).

Also, so-called "fixed layout" or "pre-paginated" digital publications
typically require the capability to lay out two contiguous pages as a
visual "synthetic spread", in other words: to present more than one
HTML document at any given time. Side note: some reading systems
preload sibling documents in order to improve perceived page-turn
performance, but that's another topic.

Furthermore, in a decentralized / distributed infrastructure, content
may be served from a cloud storage service / HTTP server A, while the
reading system is hosted on HTTP server B (using cross-domain
permissions, i.e. HTTP CORS headers for XMLHttpRequest / Fetch API).
Typically, the reading system would fetch the content, not vice versa.
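
To make the cross-origin setup a little more concrete, here is a
minimal sketch of the CORS headers that content server A might attach
to its responses. The origin "https://reader.example.org" and the
function name corsHeadersFor are assumptions made for illustration,
not part of any existing reading system.

```typescript
// Hypothetical allow-list of reading-system origins permitted to fetch content.
const ALLOWED_ORIGINS = ["https://reader.example.org"];

// Compute the CORS response headers that content server A would attach,
// given the Origin header of a request coming from the reading system on server B.
function corsHeadersFor(
  requestOrigin: string | undefined,
  allowed: string[] = ALLOWED_ORIGINS
): Record<string, string> {
  // "Vary: Origin" tells caches that the response depends on the requesting origin.
  const headers: Record<string, string> = { Vary: "Origin" };
  if (requestOrigin !== undefined && allowed.indexOf(requestOrigin) !== -1) {
    // Echo back only origins that are explicitly allowed.
    headers["Access-Control-Allow-Origin"] = requestOrigin;
  }
  return headers;
}
```

With such headers in place, a fetch(url, { mode: "cors" }) issued by
the reading system against server A would be allowed by the browser;
requests from any other origin would get no Access-Control-Allow-Origin
header and be blocked.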

2)
Reading systems usually inject additional behaviour into content
documents (i.e. scripting and styling), in order to implement new
functionality or enhance existing features. This includes: highlighted
selections, annotations, dictionary lookup, popup footnotes,
pagination, synchronized text/audio (TTS self-voicing, or pre-recorded
EPUB3 Media Overlays), MathJax support, DRM protection, etc.
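
As a rough illustration of this kind of runtime injection, the sketch
below appends a stylesheet and a script to the head of a loaded content
document. The DocumentLike / ElementLike interfaces are a deliberately
small structural subset of the DOM defined here for the example, and
the function name and asset URLs are hypothetical.

```typescript
// Structural subset of the DOM used by this sketch, so that it does not
// depend on DOM typings being available.
interface ElementLike {
  setAttribute(name: string, value: string): void;
}
interface DocumentLike {
  createElement(tagName: string): ElementLike;
  head: { appendChild(child: ElementLike): unknown };
}

// Append a reading-system stylesheet and script to a content document.
// Both URLs are supplied by the caller (hypothetical reading-system assets).
function injectReadingSystemAssets(
  doc: DocumentLike,
  cssHref: string,
  scriptSrc: string
): void {
  const link = doc.createElement("link");
  link.setAttribute("rel", "stylesheet");
  link.setAttribute("href", cssHref);
  doc.head.appendChild(link);

  const script = doc.createElement("script");
  script.setAttribute("src", scriptSrc);
  doc.head.appendChild(script);
}
```

In a browser, this would typically be called from the iframe's "load"
event with the iframe's contentDocument, which is only accessible when
the content document is same-origin with the reading system.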

Some content (pre)processing can be done on the server side (or in an
application's backend, using private offline local storage), but
reading systems that execute in vanilla web browsers not only have to
manipulate HTML / CSS DOM at runtime, they also have to handle
encrypted content streams (fonts, audio/video), and perform other
interesting things.

So, from an application architecture perspective, it makes sense to
separate "content" from "presenter of content" using *some* degree of
isolation (strictly speaking, the use of iframe / webview containers
for content documents does not imply a totally hermetic sandbox).

3)
Service Worker is great for offline caching, but also for intercepting
URL requests, generating adequate content payloads (if necessary:
modified documents and attached resources), and producing responses
that the webview can consume "as-is". Further dynamic manipulations
are typically performed once the DOM is loaded, but the ability to
inject features *before* the content stream even reaches the webview
can be desirable. Without Service Worker, existing reading systems
have to use a number of "tricks" in order to achieve this, such as
full pre-processing of the tree of resources + Blob-URI replacements
(an expensive, brittle mechanism that also breaks media streaming
capabilities).
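
A minimal sketch of the interception idea follows: HTML / XHTML
documents under a content prefix get a script injected before they
reach the webview, and everything else (fonts, audio, video) passes
through untouched. The "/pub/" prefix, the injected script URL and the
helper names are assumptions for illustration only. Globals are reached
through "globalThis as any" so the sketch does not depend on Service
Worker typings.

```typescript
// Hypothetical URL prefix under which publication resources are served.
const CONTENT_PREFIX = "/pub/";
// Hypothetical script the reading system wants in every content document.
const INJECTED = '<script src="/reading-system/inject.js"></script>';

// Only HTML / XHTML documents under the content prefix are rewritten.
function shouldRewrite(pathname: string, contentType: string | null): boolean {
  if (pathname.indexOf(CONTENT_PREFIX) !== 0) return false;
  const type = (contentType || "").toLowerCase();
  return type.indexOf("text/html") !== -1 ||
    type.indexOf("application/xhtml+xml") !== -1;
}

// Insert the snippet just before the closing head tag
// (the document is returned unchanged if there is none).
function injectIntoHead(html: string, snippet: string): string {
  const index = html.search(/<\/head\s*>/i);
  return index === -1 ? html : html.slice(0, index) + snippet + html.slice(index);
}

// Service Worker wiring; the guard makes this a no-op outside a Service Worker.
const scope = globalThis as any;
if (typeof scope.ServiceWorkerGlobalScope !== "undefined" &&
    scope instanceof scope.ServiceWorkerGlobalScope) {
  scope.addEventListener("fetch", (event: any) => {
    const pathname: string = new scope.URL(event.request.url).pathname;
    event.respondWith(
      scope.fetch(event.request).then((response: any) => {
        if (!shouldRewrite(pathname, response.headers.get("Content-Type"))) {
          return response; // pass through as-is, streaming is preserved
        }
        return response.text().then((html: string) => {
          const headers = new scope.Headers(response.headers);
          // The body is now decoded text of a different length.
          headers.delete("Content-Length");
          headers.delete("Content-Encoding");
          return new scope.Response(injectIntoHead(html, INJECTED), {
            status: response.status,
            statusText: response.statusText,
            headers: headers,
          });
        });
      })
    );
  });
}
```

The rewriting logic is kept in two pure functions so it can be
exercised outside a browser; only the last block touches the Service
Worker API (the "fetch" event and respondWith).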

Daniel

Received on Tuesday, 1 December 2015 15:56:43 UTC