[Bug 27269] Normatively require distinctive identifiers to be different by top-level and EME-using origin from bugzilla@jessica.w3.org on 2015-01-15 (public-html-bugzilla@w3.org from January 2015)

From: <bugzilla@jessica.w3.org>
Date: Thu, 15 Jan 2015 09:41:59 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-27269-2486-zQed4Wiu2c@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27269

--- Comment #8 from Henri Sivonen <hsivonen@hsivonen.fi> ---
(In reply to David Dorwin from comment #6)
> (In reply to Henri Sivonen from comment #5)
> > (In reply to David Dorwin from comment #4)
> > > While using a combination of origins may address the concern, there are
> > > potential problems:
> > > 1. Other storage mechanisms (cookies, etc.) are not unique per combination.
> > 
> > Well, cookies are so broken that they aren't even clamped to an origin!
> > Still, the Web Platform has been able to introduce other things that have
> > origin-based security.
> 
> Yes, origin-based, but not (origin x origin)-based.

True. However, arguing against starting now because of the lack of precedent
is an argument for never starting. This is basically the problem of arguing
against encrypting DNS queries because SNI is in the clear and then arguing
against encrypting SNI because DNS queries are in the clear.

Also, failing to partition the identity as requested here increases the
trackability enabled by DRM considerably. Even if requiring https turned out to
be a success, which would rule out active MITMs injecting iframes that trigger
EME to see a device identifier, the https requirement wouldn't prevent ad,
analytics or video hosting networks from eliciting a cross-site tracking
identifier in an iframe.

The partitioning by top-level origin is essential for avoiding a situation
where EME-based DRM becomes a cross-site tracking vector.
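
To illustrate what such partitioning could look like, here is a minimal sketch,
not taken from the EME spec or from any CDM. It assumes the user agent holds
some per-device secret and derives the identifier exposed to a page by keying
an HMAC over both origins. The function name, the secret and the
player.example.net origin are hypothetical.

```python
import hashlib
import hmac


def partitioned_identifier(device_secret: bytes,
                           top_level_origin: str,
                           eme_origin: str) -> str:
    """Derive an identifier specific to one (top-level, EME-using) origin pair.

    Hypothetical sketch: device_secret stands in for whatever per-device
    value the user agent or CDM keeps and never exposes directly.
    """
    parts = (top_level_origin.encode("utf-8"), eme_origin.encode("utf-8"))
    # Length-prefix each origin so that different pairs can never
    # serialize to the same byte string.
    message = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    return hmac.new(device_secret, message, hashlib.sha256).hexdigest()


secret = b"hypothetical-per-device-secret"
player = "https://player.example.net"  # hypothetical third-party EME origin

id_on_example = partitioned_identifier(secret, "https://example.com", player)
id_on_foo = partitioned_identifier(secret, "https://foo.com", player)
```

Under these assumptions, the same iframe origin embedded on example.com and on
foo.com sees two unrelated values, so it cannot use the identifier to link the
two visits, while repeat visits to the same pair of origins still see a stable
value.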

> > Do you have examples?
> 
> Some hypotheticals:
> 
> 1) Suppose there is some service that provides protected content services
> for other websites. Maybe the user somehow has a relationship with that
> service. With the proposal in this bug, that service would see different
> distinctive identifiers when hosted on example.com, foo.com, and foobar.com.
> If the service limits the number of "devices" a user can use in some period
> of time, the user would unknowingly use up three "devices". (Note that this
> can also happen if identifiers are cleared or as a result of using private
> browsing modes *if* the user agent allows distinctive identifiers in such a
> mode.)

For video hosting as a service, where the hosting service isn't a user-facing
brand, this shouldn't be a problem. How would you even communicate usefully to
users that their device limit on the site whose ToS they are reading (haha,
user reading the ToS) is counted together with other sites that use the same
faceless hosting service as an implementation detail?

From the perspective of the users being able to understand what limits they are
subject to, having the limits be counted based on a technical and business
detail of the hosting service instead of having them counted based on the
user-facing site identity is just bizarre.

As for hosting services that have a user-facing brand, that seems to pretty
much boil down to YouTube and Vimeo, but they also mainly host content that's
not of the DRM-requiring kind. It's rather weird to require DRM but then allow
random third parties to embed that content. I'm sure it's possible to show an
example where someone would want to impose DRM, allow embedding and insist that
the DRM be coupled with a device limit counted across all the embedders, but I
think we should prioritize user privacy over such a business case, which isn't
the business case that motivates EME in the first place. (The motivating case,
i.e. movie streaming services that work on their own domains and don't allow
embedding, isn't affected by this privacy measure. [Unless they practice
unnecessary host name proliferation per below.])

> This could potentially be a problem even for a standalone site. For example,
> www.example.com and browse.example.com might both host an iframe for
> player.example.com.

If that hurts, don't do that then.

> 2) Data, such as a persistent license, stored from offline.example.com would
> not be accessible and/or appear invalid from www.example.com even if they
> both iframe player.example.com. The user could lose that license and
> potentially the ability to get a new one if the website architecture changed
> or the user can't figure out that they need to go back to
> offline.example.com.

So don't have a separate offline hostname; make the main site offlineable.

(In reply to Jerry Smith from comment #7)
> I've been concerned about David's hypothetical case 1 as well.  Services
> that host across a number of websites would need to tolerate large numbers
> of end user devices for a given user account, since the identifier returned
> would be different for each.  These services though have a business interest
> in limiting the number of devices allowed.  The proposed privacy mitigation
> discussed in this bug effectively undercuts the ability to do this, and it
> seems fundamental to the proposal.

Why should this business interest be considered by the W3C more important than
the privacy of users?

Received on Thursday, 15 January 2015 09:42:01 UTC