Google/Mozilla Presentation API update

Hello,

Recently we (Mark and Anton) had a chance to meet with representatives from
Mozilla (Jonas, Marco, and Wesley) about the current state of the
Presentation API. We discussed the following ideas, which we believe will
help improve the specification that came out of the Community Group and
make it suitable for the use cases we have in mind.


Below we refer to the “controlling” page as the one that calls
requestSession and the “presenting” page as the one that is rendering the
presented URL.

1.  Reconnection to existing presentations on the controlling user agent.

After considering the options, we favor a reconnection mechanism similar to
shared workers [1], whereby presenting to a URL that is already being
presented would reconnect to that presentation.

It would improve the security and ease of use of the API if requestSession
took, in conjunction with the URL, a secure (hard to guess) identifier that
is generated by the browser and returned with each new PresentationSession.
This way, the initiating page can (1) control who can reconnect by sharing
that identifier and (2) allow the same URL to effectively start a new
presentation by requesting it without the identifier.  E.g., the API would
look like:

partial interface NavigatorPresentation : EventTarget {
  PresentationSession requestSession(DOMString url,
                                     optional DOMString presentationId);
};

partial interface PresentationSession : EventTarget {
  readonly attribute DOMString presentationId;
};

In some circumstances the page may wish only to reconnect to an existing
presentation and not start a new one when it is loaded.  For example, if the
browser crashes or a tab is accidentally closed, the page will want to see
if the user was presenting before that happened.  In this case the
initiating page would call requestSession with a previously stored
identifier, and if the presentation is no longer active, it would
immediately get a disconnected session.  It can then choose to initiate a
new session by requesting again without the identifier.
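
To make the reconnect-then-fall-back flow concrete, here is a rough
TypeScript sketch.  It is not part of the proposal:
FakeNavigatorPresentation, terminate(), reconnectOrStart() and the state
field are hypothetical stand-ins, and the fake uses a counter where a real
user agent would generate an unguessable identifier.

```typescript
// Hypothetical in-memory stand-in for the proposed API; not a browser API.
type SessionState = "connected" | "disconnected";

interface PresentationSession {
  readonly presentationId: string;
  readonly state: SessionState; // assumed field, for illustration only
}

class FakeNavigatorPresentation {
  private active = new Map<string, string>(); // presentationId -> url
  private nextId = 1;

  requestSession(url: string, presentationId?: string): PresentationSession {
    if (presentationId === undefined) {
      // No identifier: always start a new presentation.
      // A real user agent would generate a hard-to-guess value here.
      const id = `fake-id-${this.nextId++}`;
      this.active.set(id, url);
      return { presentationId: id, state: "connected" };
    }
    // Identifier supplied: reconnect only if that presentation is still
    // active for this URL; otherwise hand back a disconnected session.
    const live = this.active.get(presentationId) === url;
    return { presentationId, state: live ? "connected" : "disconnected" };
  }

  // Helper that simulates the presentation going away.
  terminate(presentationId: string): void {
    this.active.delete(presentationId);
  }
}

// Try the stored identifier first, then fall back to a fresh presentation.
function reconnectOrStart(
  nav: FakeNavigatorPresentation,
  url: string,
  storedId?: string
): PresentationSession {
  if (storedId !== undefined) {
    const session = nav.requestSession(url, storedId);
    if (session.state === "connected") {
      return session;
    }
  }
  return nav.requestSession(url);
}
```

A page would persist the returned presentationId (for example in local
storage) and pass it back as storedId on its next load.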

This also addresses the problem of disconnection: the site can erase the
presentation identifier from its cookies or local storage when the user
logs out, and other pages with access to the same storage would no longer
be able to access the presentation.

An alternative proposal allows the application to pass in its own
presentation identifier and determine whether the presentation request
should only reconnect, or potentially create a new presentation:

partial interface NavigatorPresentation : EventTarget {
  PresentationSession requestSession(DOMString url,
                                     optional DOMString presentationId,
                                     optional boolean onlyReconnect);
};
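
One possible reading of this alternative, sketched in TypeScript.  The
semantics here are an assumption rather than something we have agreed on:
an existing (url, presentationId) pair is reconnected, a missing one is
created unless onlyReconnect is true.  AltNavigatorPresentation and the
connected/created fields are hypothetical names used only for illustration.

```typescript
// Hypothetical model of the alternative signature; not a browser API.
interface AltSession {
  presentationId: string;
  connected: boolean; // false when onlyReconnect found nothing to join
  created: boolean;   // true when this call started a new presentation
}

class AltNavigatorPresentation {
  private active = new Set<string>(); // keys of live presentations
  private counter = 1;

  requestSession(
    url: string,
    presentationId?: string,
    onlyReconnect: boolean = false
  ): AltSession {
    // The application may supply its own identifier; otherwise make one up.
    const id =
      presentationId !== undefined ? presentationId : `auto-${this.counter++}`;
    const key = `${url}\n${id}`;

    if (this.active.has(key)) {
      // Already presenting this URL under this identifier: reconnect.
      return { presentationId: id, connected: true, created: false };
    }
    if (onlyReconnect) {
      // Caller asked not to start anything new.
      return { presentationId: id, connected: false, created: false };
    }
    this.active.add(key);
    return { presentationId: id, connected: true, created: true };
  }
}
```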

[1]
http://www.w3.org/TR/workers/#shared-workers-and-the-sharedworker-interface

2. Cross page navigation on the presenting user agent.

If the page hosting the presentation navigates, what happens to the
presentation session? Does the navigated-to page automatically get a
PresentEvent, or is some access control mechanism necessary?  One option is
to have the PresentationSession implement Transferable [2] so it can be
transferred to a Shared Worker on the presentation side, similar to how a
MessagePort can be transferred between pages and workers.

[2]
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#transferable-objects

As a secondary point, an API that exposes the PresentationSession as a
property of NavigatorPresentation would be easier for Web developers to
use, as they would not have to race against the browser firing the
onpresent event before their event handler is registered.
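
The race can be illustrated with a small TypeScript model.  Both classes
and the deliver() method are hypothetical, a string stands in for the
session object, and the event-only variant assumes the user agent does not
queue the event for late listeners.

```typescript
// Event-only delivery: the session is lost if no handler is set in time.
class EventOnlyPresentation {
  onpresent: ((session: string) => void) | null = null;

  // Simulates the user agent handing over the session.
  deliver(session: string): void {
    if (this.onpresent) {
      this.onpresent(session);
    }
  }
}

// Property-based delivery: the session can be read at any later point.
class PropertyPresentation {
  session: string | null = null;

  deliver(session: string): void {
    this.session = session;
  }
}
```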

3. Message passing API.

The current spec only allows DOMString to be passed via the messaging API.
We would like to allow efficient transfer of binary payloads via this API;
this would allow the controlling page to, e.g., stream a bundle of resources
needed for the presentation to the presenting page, or to share a locally
created binary data stream with the presenting page (e.g., from MSE [4]).

To this end, aligning the messaging API with the one exposed by
RTCDataChannel makes sense [3], specifically the send(), close(), and
onmessage parts of the API.

[3] http://dev.w3.org/2011/webrtc/editor/webrtc.html#rtcdatachannel

[4] http://www.w3.org/TR/media-source/

We don’t want to tie the Presentation API spec to the RTCDataChannel
implementation, but would like there to be interface compatibility, as
WebRTC would be a likely mechanism for implementing peer-to-peer
communication between user agents.
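
As a sketch of what interface compatibility might look like, the
TypeScript below models only the send(), close() and onmessage parts, with
string and ArrayBuffer payloads.  SessionChannel, LoopbackChannel and
createChannelPair are hypothetical names, and the synchronous in-memory
loopback merely stands in for whatever transport a user agent would use.

```typescript
type Payload = string | ArrayBuffer;

interface MessageLike {
  data: Payload;
}

// The subset of an RTCDataChannel-style surface discussed above.
interface SessionChannel {
  onmessage: ((event: MessageLike) => void) | null;
  send(data: Payload): void;
  close(): void;
}

class LoopbackChannel implements SessionChannel {
  onmessage: ((event: MessageLike) => void) | null = null;
  peer: LoopbackChannel | null = null;
  private open = true;

  send(data: Payload): void {
    if (!this.open || this.peer === null || !this.peer.open) {
      throw new Error("channel is closed");
    }
    // Deliver synchronously to the other side, if it is listening.
    if (this.peer.onmessage) {
      this.peer.onmessage({ data });
    }
  }

  close(): void {
    // Closing one end closes both ends of the pair.
    this.open = false;
    if (this.peer !== null) {
      this.peer.open = false;
    }
  }
}

// Returns two connected ends: one for the controlling page, one for the
// presenting page.
function createChannelPair(): [LoopbackChannel, LoopbackChannel] {
  const a = new LoopbackChannel();
  const b = new LoopbackChannel();
  a.peer = b;
  b.peer = a;
  return [a, b];
}
```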

4. User agent context for rendering the presentation.

If we intend the same presentation content to be rendered either in the
same user agent or a remote user agent, we need to carefully define the
rendering context so that the application doesn’t get different behavior
according to whether it is rendered remotely or locally.  In particular the
presentation rendering context must have:

- No access to cookies, local storage or IndexedDB instances

- No access to HTTP cache

- No access to pre-existing SharedWorkers

- Extensions are debatable - some may be required for e.g. VPN or firewalls
to work correctly

For performance we would still like to be able to pre-load resources needed
by the presenting page and to share them with (or stream them to) it.
Hence the need to transfer binary blobs efficiently between the two sides.

5. Remoting <video> presentation remains an important use case for both
mobile and desktop.  We would like the behavior of the <video> tag to be
defined in the remote presentation scenario, and offer hooks for web
developers to customize the behavior of the tag when it is being remoted.
This might be done as an addendum to the main specification.

6. Some standard for establishing communication between two UAs would be
very helpful for interoperability in the 2-UA case.  At a minimum, this
would define a standard serialization and framing for the messages sent
between the two sides.  This may not be a specification that is pursued
directly within the group but may happen alongside.
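
Purely to illustrate what "serialization and framing" could mean here, the
TypeScript below uses a made-up layout: one type byte, a four-byte
big-endian length, then the payload.  No such format has been proposed;
Frame, encodeFrame and decodeFrames are hypothetical names.

```typescript
// Hypothetical frame layout: [type: 1 byte][length: 4 bytes BE][payload].
interface Frame {
  type: number; // 0-255; its meaning is left to the application
  payload: Uint8Array;
}

const HEADER_BYTES = 5;

function encodeFrame(frame: Frame): Uint8Array {
  const out = new Uint8Array(HEADER_BYTES + frame.payload.length);
  const view = new DataView(out.buffer);
  view.setUint8(0, frame.type);
  view.setUint32(1, frame.payload.length, false); // big-endian length
  out.set(frame.payload, HEADER_BYTES);
  return out;
}

// Decodes every complete frame in `bytes` and returns any trailing partial
// frame as `rest`, so a caller can prepend it to the next chunk received.
function decodeFrames(bytes: Uint8Array): { frames: Frame[]; rest: Uint8Array } {
  const frames: Frame[] = [];
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 0;
  while (bytes.length - offset >= HEADER_BYTES) {
    const length = view.getUint32(offset + 1, false);
    if (bytes.length - offset - HEADER_BYTES < length) {
      break; // payload not fully received yet
    }
    const start = offset + HEADER_BYTES;
    frames.push({
      type: view.getUint8(offset),
      payload: bytes.slice(start, start + length),
    });
    offset = start + length;
  }
  return { frames, rest: bytes.slice(offset) };
}
```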

7. We would also like a standard way to use the API to take advantage of the
millions of existing devices that can render a subset of Web content via
DIAL or other mechanisms.  Again, this may not happen in the CG/WG but
may be done separately, for example as part of the DIAL specification. [5]

[5] http://www.dial-multiscreen.org/

We hope to contribute concrete specification updates to address 1-4 as well
as example code to better illustrate these use cases, depending on
developer time :)

Thanks all, and we welcome feedback.

Mark, Jonas, Anton, et al.

Received on Thursday, 14 August 2014 00:06:44 UTC