Re: action-334, issue-112, a summary on sub-domains for exceptions from David Singer on 2013-01-03 (public-tracking@w3.org from January 2013)

From: David Singer <singer@apple.com>
Date: Thu, 03 Jan 2013 15:51:26 -0800
To: public-tracking@w3.org
Message-id: <0D2CE5D1-8EDF-4D01-8676-0A89163A9BC5@apple.com>
As I perceive it, this thread went dormant in November without much of a conclusion.  The issue is the use of wild-cards in the exception API.

There are two places that host-names occur in the APIs:

* the 'implicit' parameter: the site making the call, which will become the first of the two host-names in the remembered record [top-level, target]
* the explicit 'target' parameter for site-exceptions.

Wild-carding the second is easily handled; we already allow the request to be for the entire web ("*"), and even if it is not, we allow the user-agent to make it so.  So allowing the explicit parameter to include wild-cards (e.g. *.adservice.net) is clearly harmless, as any such pattern is more restricted than the plain "*" to which it could be converted.
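
To make the "more restricted than a plain *" point concrete, here is a minimal Python sketch of how a user-agent might test an embedded host against a stored target.  The function name target_matches and the three pattern forms it accepts are my assumptions for illustration; nothing here is taken from the specification text.

```python
def target_matches(pattern: str, host: str) -> bool:
    """Return True if `host` is covered by the stored target `pattern`.

    Assumed (hypothetical) pattern forms:
      "*"              every host (a web-wide exception)
      "*.example.net"  any sub-domain of example.net, but not example.net itself
      "example.net"    exactly that host
    """
    pattern = pattern.lower()
    host = host.lower().rstrip(".")
    if pattern == "*":
        return True
    if pattern.startswith("*."):
        suffix = pattern[1:]  # keep the leading dot, e.g. ".example.net"
        # The dot in the suffix stops "badexample.net" from matching.
        return host.endswith(suffix) and len(host) > len(suffix)
    return host == pattern
```

Every host accepted by "*.adservice.net" is also accepted by "*", so widening the former to the latter never removes a match.  Whether the bare domain should also match a "*." pattern is a design choice; this sketch assumes it does not.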


We're left with the problem of the implicit parameter.  What issues come up?

A.  Some parties have reasonably large numbers of hostnames/sites. Sometimes they are related in name (e.g. http://developer.apple.com), sometimes not (e.g. http://yimg.com).  Movie studios, for example, often create a new site for each movie they release (e.g. http://www.skyfall-movie.com/site/). This list of sites is sometimes dynamic (it changes over time).

B.  We don't want to allow a site to register an exception for a "public suffix", and thereby grant an exception to unrelated parties.  For example, if someone asked for a site exception for anything embedded on *.com, then huge numbers of unrelated parties would be getting an exception.

C.  We don't want to have to check the public suffix database (http://publicsuffix.org, which is huge and unwieldy) at all if possible, and at most at the time of the API call, not when headers are sent.

D.  We don't really want to do a fetch on the "same-party" array at the time of the call, and we cannot possibly fetch it each time we generate an HTTP header.

E.  We have to watch where the wild-card asterisk goes; for example, with ICANN generating TLDs like water, we don't want to have yahoo.* registered, or we'll run into the same "unrelated parties" problem as before. It's not clear who would benefit from registering such a pattern ("cui bono?"), but a lack of motive doesn't mean we should allow such an obvious mistake.
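
As a rough illustration of issue E, the following Python sketch accepts a host-name pattern only if it has no asterisk at all, or exactly one asterisk standing alone as the left-most label.  The function name wildcard_position_ok and the exact rule are assumptions of mine, not agreed text.

```python
def wildcard_position_ok(pattern: str) -> bool:
    """Check only WHERE the asterisk sits, not what it is attached to."""
    labels = pattern.lower().split(".")
    if "*" not in pattern:
        # Plain host name: just reject empty labels such as "a..b".
        return all(labels)
    return (
        pattern.count("*") == 1   # a single wild-card only
        and labels[0] == "*"      # it must be the whole left-most label
        and len(labels) >= 2      # something must follow it
        and all(labels[1:])       # and that remainder has no empty labels
    )
```

This rejects yahoo.* and www.*.com but still accepts *.com, so it addresses E only; issue B would still need some form of public suffix check on top.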


Here are some possibilities:

1.  Allow the APIs to indicate a top-level domain of the form *.<rest>, where *.<rest> must match the domain making the call (the document origin of the script), and <rest> must not be a public suffix.  That allows scripts.google.com to supply a script that asks for an exception for *.google.com.

2.  Allow the APIs to ask for the exception for "myself" (the document origin of the script) "and all my same-parties too" (which requires a fetch of the same-party array at API time).

3.  Say that the document origin of the script should be a site with a short hostname, and allow the exception to apply to sites with that as a suffix (e.g. make the call from google.com, and then the exception applies to *.google.com).  That avoids the public suffix issue, but not the unrelated site-names issue.

4.  Combine 3 with 2, and say that if blah.com is declared as a same-party, then the exception applies to *.blah.com.
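
For possibility 1, the check a user-agent would run at the time of the API call might look roughly like the Python sketch below.  The function name may_request_wildcard is hypothetical, and PUBLIC_SUFFIXES is a tiny illustrative stand-in for the real list at publicsuffix.org, not a usable substitute for it.

```python
# Illustrative stand-in only; a real user-agent would consult the full
# public suffix list rather than this hand-picked set.
PUBLIC_SUFFIXES = {"com", "net", "org", "co.uk"}


def may_request_wildcard(origin_host: str, requested: str) -> bool:
    """Possibility 1: allow *.<rest> only if the calling origin is a
    sub-domain of <rest> and <rest> is not a public suffix."""
    origin_host = origin_host.lower().rstrip(".")
    requested = requested.lower()
    if not requested.startswith("*."):
        return False                  # asterisk must be the left-most label
    rest = requested[2:]
    if not rest or "*" in rest:
        return False                  # no second wild-card, no empty remainder
    if rest in PUBLIC_SUFFIXES:
        return False                  # e.g. *.com would cover unrelated parties
    # *.<rest> must match the document origin of the calling script.
    return origin_host.endswith("." + rest)
```

With this sketch, a script served from scripts.google.com may ask for *.google.com but not for *.com or *.apple.com.  The public suffix lookup happens once, at the API call, and never when headers are generated, which is the most that issue C would tolerate.  Whether a call made from google.com itself should also be allowed to ask for *.google.com (as in possibility 3) is left open here; this sketch assumes not.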


I cannot see a way to avoid requiring sites that dynamically create unrelated site-names (e.g. the skyfall site above) to call the API again for each new site.  There's no way we can check same-party status dynamically from the user-agent.

 

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Thursday, 3 January 2013 23:51:56 UTC