Re: Request for a TAG review of the Presentation API from mark a. foltz on 2015-07-28 (www-tag@w3.org from July 2015)

From: mark a. foltz <mfoltz@google.com>
Date: Mon, 27 Jul 2015 22:46:25 -0700
To: Francois Daoust <fd@w3.org>
Cc: Anne van Kesteren <annevk@annevk.nl>, TAG <www-tag@w3.org>, "public-secondscreen@w3.org" <public-secondscreen@w3.org>
Message-ID: <CALgg+HEQebB=vR74FBWYf83+Zd29SweWKRFQv+jhGPAzGF2Npw@mail.gmail.com>
On Wed, Jul 1, 2015 at 6:53 AM, Francois Daoust <fd@w3.org> wrote:

> On 2015-07-01 13:17, Anne van Kesteren wrote:
>
>> On Wed, Jul 1, 2015 at 11:55 AM, Francois Daoust <fd@w3.org> wrote:
>>
>>> 1. Security requirements for the messaging channel
>>> -----
>>> The Presentation API is agnostic of the protocol used for the messaging
>>> channel as long as it is capable of carrying DOMString payloads in a
>>> reliable and in-order fashion. A user agent could perhaps communicate
>>> with the second device using the WebSockets protocol or a WebRTC data
>>> channel.
>>>
>>
>> How can you leave this undefined? That would mean you don't have
>> interoperability across user agents and users would need to get all
>> their products from one vendor.
>>
>
> I should probably have spelled out that issue more explicitly.
> Interoperability is certainly a problem that all group participants have in
> mind. The group wants to remain agnostic of the underlying protocols as
> much as possible, but experience gathered once first prototype
> implementations are out will show to what extent that is wishful thinking.
>
> Even if the specification ends up mandating support for specific discovery
> and communication protocols, it would still make sense to allow user agents
> to support additional ones. How can we formulate security requirements for
> such cases?


I haven't found an existing spec that would provide a straightforward
definition for this use case.  Perhaps others with more experience in the
W3C have ideas.

The closest definition of "secure context" I could find stems from the Web
Security Context TR [1], which defines it specifically as all resources
retrieved over a "strongly TLS protected" connection, which in turn is
defined in terms of the existing server-based PKI for TLS.

Another alternative is the "trustworthy origin" as defined in [2].  But,
that is constructed in terms of UA/server interaction around an origin, and
we don't assume that a presentation display has an origin.  We could
fabricate a unique per-display origin, but that doesn't really help the
authentication situation.

>>> However, when the controlling page is loaded in a secure context, the spec
>>> should set some guarantees of message confidentiality and authenticity
>>> ("only secure WebSockets"). Do you have suggestions on ways to specify
>>> security requirements in a generic manner?
>>>
>>
>> This seems hard since typically devices don't have a DNS name for
>> which you could issue a certificate.
>>
>
> Indeed. Isn't it possible in the WebRTC/RTCWeb world to establish an
> encrypted data channel between two such peers without authentication?


Yes, via DTLS [3].  DTLS alone prevents passive attacks (eavesdropping) but
not active (man-in-the-middle) attacks.  To authenticate the two parties,
the RTCWeb group has proposed a security architecture [4] that relies on
calling out to a third party identity provider to verify key fingerprints
generated by the DTLS handshake.
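
Stripped of the identity-provider exchange, the last step of that scheme amounts to comparing the fingerprint seen in the handshake against one vouched for by a trusted party. A minimal TypeScript sketch of that comparison follows. The names (Fingerprint, peerIsAuthenticated) are hypothetical and are not taken from [4] or any other spec, and the sketch assumes the attested fingerprints have already been obtained and verified by some out-of-band means.

```typescript
// Hypothetical types and helpers for illustration only.
interface Fingerprint {
  algorithm: string; // hash name, e.g. "sha-256"
  value: string;     // colon-separated hex digits, e.g. "AB:CD:EF"
}

// Canonical form: lower-case algorithm plus lower-case hex without colons.
function normalize(fp: Fingerprint): string {
  return fp.algorithm.toLowerCase() + " " + fp.value.replace(/:/g, "").toLowerCase();
}

// True only if the fingerprint observed in the handshake matches one of the
// fingerprints attested by the trusted third party.
function peerIsAuthenticated(observed: Fingerprint, attested: Fingerprint[]): boolean {
  const seen = normalize(observed);
  return attested.some((fp) => normalize(fp) === seen);
}
```

An empty attested list always yields false, which corresponds to the DTLS-only case: encrypted, but with no protection against an active attacker.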

I think if we wanted to leverage this work, we would have to write the spec
in a way that applied it across the various combinations of raw
TCP/WebSockets/RTCDataChannel and TLS/DTLS that might be used to establish
the communication channel.  That said, I'm not sure it is the job of the
Presentation API specification to mandate network-level protocols in that
much detail.

My inclination is to abstract some properties away from [1] and [2] and say
something normative about the properties of the communication channel,
using [3] as a concrete example of a secure implementation.
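
To make that concrete, here is a rough TypeScript sketch of what such abstracted properties might look like. Everything in it is an assumption for the sake of discussion: ChannelProperties, unmetRequirements and the property names are hypothetical and not defined by the Presentation API or by [1], [2] or [3].

```typescript
// Hypothetical names throughout; nothing here is defined by any spec.
interface ChannelProperties {
  reliable: boolean;      // messages are not silently dropped
  inOrder: boolean;       // messages arrive in the order they were sent
  confidential: boolean;  // protected against passive eavesdropping
  authenticated: boolean; // peer verified, resisting active (MITM) attacks
}

// Returns the names of the requirements the channel fails to meet.
// Reliability and ordering are always required; confidentiality and
// authenticity are required only when the controlling page was loaded
// in a secure context.
function unmetRequirements(
  controllerIsSecureContext: boolean,
  channel: ChannelProperties
): string[] {
  const unmet: string[] = [];
  if (!channel.reliable) unmet.push("reliable");
  if (!channel.inOrder) unmet.push("inOrder");
  if (controllerIsSecureContext) {
    if (!channel.confidential) unmet.push("confidential");
    if (!channel.authenticated) unmet.push("authenticated");
  }
  return unmet;
}

// A DTLS-protected channel whose key fingerprints were never verified:
// encrypted, but not authenticated.
const dtlsOnly: ChannelProperties = {
  reliable: true,
  inOrder: true,
  confidential: true,
  authenticated: false,
};
```

With these definitions, unmetRequirements(true, dtlsOnly) reports only "authenticated", while unmetRequirements(false, dtlsOnly) reports nothing.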


>>> 2. Private mode browsing for the presenting context
>>> -----
>>> While the controlling device will be a "private" device, the presenting
>>> device will often be a "shared" device, perhaps a TV set or HDMI dongle
>>> in a household, or a remote screen in a meeting room. To protect the
>>> controlling user's privacy, the group would like to require the
>>> presenting user agent to load the presentation URL in private mode.
>>>
>>
>> How would this work for games? Games typically have large assets we
>> would not want to load anew each time you play. It would be pretty
>> disastrous if each time you want to do some gaming you have to wait a
>> couple of hours for all the assets to load on your TV.
>>
>
> This would not work for games and that has indeed been raised in the past
> (for reference, see similar discussion from last year, which includes
> points about UX for games but also in other situations on the mailing-list
> of the Second Screen CG that gave birth to the WG at [1]). However, current
> implementers, Google and Mozilla in particular, will load presentations in
> private mode browsing.
>

I don't have any objections to exposing a non-empty cache to the presenting
browsing context, so that large assets can be cached across presentations.


>
> Francois.
>
> [1]
> https://lists.w3.org/Archives/Public/public-webscreens/2014Aug/0012.html
>
>
[1] http://www.w3.org/TR/wsc-ui/#def-strong-tls
[2] http://www.w3.org/TR/powerful-features/#is-origin-trustworthy
[3] https://tools.ietf.org/html/rfc4347
[4] https://tools.ietf.org/html/draft-ietf-rtcweb-security-arch-11
Received on Tuesday, 28 July 2015 05:47:15 UTC