Re: Options for same-origin definition for Navigation Timing from Jonas Sicking on 2010-11-08 (public-web-perf@w3.org from November 2010)

From: Jonas Sicking <jonas@sicking.cc>
Date: Mon, 8 Nov 2010 00:19:10 -0800
To: Nic Jansma <njansma@microsoft.com>
Cc: "public-web-perf@w3.org" <public-web-perf@w3.org>
Message-ID: <AANLkTi=1jp2BU076tZVZziW7c3R=euN3TLc1fv1mfev5@mail.gmail.com>

That was definitely a lot of detail :)

So I think we should keep it simple, and say that anything that isn't
the same scheme+host+port triple is a different origin, no matter
whether document.domain or CORS is involved or how they are configured.
Here are the reasons why:

As for CORS, it's been designed to have a different meaning. If a
server sends access-control-allow-origin: * for a given page, that
means that other sites can read the information *on that uri*.
Actually, as long as the header above is the only header sent, it
only means that other sites can read that uri if they don't send
cookies[1] in the request. So no user private information is ever
leaked. But even if we involve other CORS related headers, it still
only involves that one uri.
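
To make the point about credentials concrete, here is a rough sketch of
the per-response check. It is deliberately simplified (no preflight, no
method or header checks), and the function name response_readable and
the lower-cased header dict are assumptions made for this illustration,
not anything taken from the CORS draft.

```python
def response_readable(headers, requesting_origin, with_credentials):
    """Simplified CORS check for one response from one uri.

    `headers` maps lower-cased response header names to their values.
    """
    allow_origin = headers.get("access-control-allow-origin")
    if allow_origin is None:
        # No opt-in at all: the response stays unreadable cross-origin.
        return False
    if not with_credentials:
        # Anonymous request: a wildcard or an exact origin match is enough.
        return allow_origin in ("*", requesting_origin)
    # Credentialed request (cookies, auth headers, client certs): the
    # wildcard is not accepted, and the server must opt in explicitly.
    allow_credentials = headers.get("access-control-allow-credentials")
    return allow_origin == requesting_origin and allow_credentials == "true"
```

Note that the answer only ever describes the single response whose
headers were passed in; nothing here says anything about sub-resources,
localStorage or cookies belonging to the same site.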

That one uri might never contain any privacy sensitive information
itself. All of that might come from other sub-resources being loaded,
or even just from information stored in localStorage or in cookies.

We could do something even more complicated and ensure that all sub
resources were loaded with appropriate CORS headers, but at this point
things get seriously complicated, and we still haven't covered the
localStorage scenario.


Regarding document.domain, I'm partially opposed to involving it on
principle. document.domain is an extremely poor API and is giving us
grief in many ways as it is. In a recent discussion regarding setting
up a server for running automated tests for HTML5 at W3C, the decision
was that we really want to move the testing server off of a DNS name
ending in w3.org completely. This was due to document.domain issues.
Further, at least in Chrome and in Firefox, document.domain is causing
performance issues every time pages use it. And in Firefox we might
end up having to drop optimizations even on pages that aren't using
it.

So in short, I don't want to give people more incentive to use this feature.


Which leaves us with the "private domain" solution. At least in
Firefox we consider foo.bar.com to be completely separate from
hello.bar.com. The only time when that isn't true is around
document.domain, and for cookies. But for all other intents and
purposes they are as separate as sell.com is from buy.com. Without
some sort of opt-in mechanism we wouldn't want to leak data across
sites here.


So to sum it up, none of the mechanisms discussed so far would be ok
for us to use to break down the same-origin barrier. I could possibly
see creating a new API of some sort (javascript API or http-header),
but I'd strongly prefer to keep it simple for now and not do anything
like that.

/ Jonas

[1] It's not just cookies that aren't sent with CORS unless the
access-control-allow-credentials header is involved. Stuff like auth
headers and private certs is also not used.


On Thu, Nov 4, 2010 at 3:11 PM, Nic Jansma <njansma@microsoft.com> wrote:
> Hi everyone,
>
>
>
> We have several options for how we can define same-origin for the purposes
> of the NavigationTiming interface.  I have listed out 4 of these options,
> along with their advantages/disadvantages and examples below.
>
>
>
> Summary:
>
> ·  #2 (document.domain) and #4 (CORS) have implementation complexities
>    (in the user agent), developer/webserver configuration requirements
>    (HTTP headers or JavaScript), and security challenges, and thus we are
>    leaning away from using them for our definition of same-origin.
>
> ·  #1 (FQDN) and #3 (private domain) are easy to implement and do not
>    have any developer/server configuration requirements.  #1 is more
>    restrictive than #3.
>
>
>
> With this in mind, we would recommend #1 or #3 for NavigationTiming.
>
>
>
> Here are the advantages and disadvantages of the 4 options we see:
>
>
>
> 1. As specified in the HTML5 Editor's Draft section 5.3 for origin:
>    http://dev.w3.org/html5/spec/origin-0.html#origin
>
>    · Two pages have the same origin if the tuple (scheme, host, port) is
>      the same.
>        i. host, in this definition, is the full host name, AKA fully
>           qualified domain name (FQDN)
>
>    · Advantages:
>        i. Simple implementation
>       ii. The same-origin comparison only needs to be done at the time of
>           navigation and upon redirects
>
>    · Disadvantages:
>        i. Most restrictive: sibling domains and sub-domains cannot share
>           timing data via the NavigationTiming interface (a.com,
>           secure.a.com and www.a.com are all different)
>             1. A web developer can work around this as explained in
>                Workaround1
>
>    · How a Web Developer would use this: The developer would not have to
>      do anything to get this behavior, and can work around it for domains
>      they control via cookies, as described in Workaround1
>
>    · Examples:
>        i. a.com -> a.com: same origin
>       ii. a.com -> b.com: different origin
>      iii. www.a.com -> a.com: different origin
>       iv. secure.a.com -> www.a.com: different origin
>        v. secure.a.com -> login.a.com -> (301) secure.a.com: different
>           origin
>       vi. secure.a.com -> secure.a.com -> (301) secure.a.com: same origin
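
For illustration, option 1 can be sketched in a few lines of Python.
This is only a sketch under some assumptions: the helper names origin
and same_origin_path are made up here, only the http and https default
ports are known, and a navigation is represented simply as the list of
URLs from the previous page through any redirects to the destination.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch knows about.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple for an absolute URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin_path(urls):
    """True only if every URL in the navigation path has the same origin.

    `urls` lists the previous page, any redirect hops, and the destination.
    """
    return len({origin(u) for u in urls}) == 1
```

With this, ["http://secure.a.com/", "http://login.a.com/",
"http://secure.a.com/"] is a different-origin navigation because of the
login.a.com hop, matching example v above.
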
>
>
>
> 2. As specified in the HTML5 Editor's Draft section 5.3 for origin:
>    http://dev.w3.org/html5/spec/origin-0.html#origin, and allowing for
>    section 5.3.1 relaxing of the same-origin restriction via setting
>    document.domain
>
>    · Two pages have the same origin if the tuple (scheme, domain, port)
>      are the same.
>        i. domain, in this definition, is the FQDN, but can be relaxed by
>           setting document.domain as in section 5.3.1.
>
>    · Advantages:
>        i. Site owners could relax the security restrictions for their
>           subdomains as they desire via document.domain
>
>    · Disadvantages:
>        i. Complex implementation.  The current page needs to keep track of
>           the previous page's origin and document.domain.  The interface's
>           attributes need to be updated when changing the current page's
>           document.domain to compare to the previous page's origin and
>           document.domain.
>       ii. Cannot be used for server redirections (301, 302), as the HTTP
>           headers cannot set the document.domain.  This would break the
>           case of secure.a.com -> login.a.com -> (301) www.a.com, even if
>           secure.a.com and www.a.com both set their document.domain to
>           a.com
>      iii. Using the document.domain for the purposes of NavigationTiming
>           affects the effective script origin for a page, which has
>           security consequences not related to NavigationTiming.
>
>    · How a Web Developer would use this: Setting document.domain in the
>      previous/current pages via JavaScript (relatively easy)
>
>    · Examples:
>        i. a.com -> a.com: same origin
>       ii. a.com -> b.com: different origin (and a.com cannot set its
>           document.domain to b.com)
>      iii. secure.a.com -> www.a.com: different origin
>       iv. secure.a.com -> www.a.com (where secure.a.com sets its
>           document.domain to a.com): different origin
>        v. secure.a.com -> a.com (where secure.a.com sets its
>           document.domain to a.com): same origin
>       vi. secure.a.com -> www.a.com (where www.a.com sets its
>           document.domain to a.com): different origin
>      vii. secure.a.com -> www.a.com (where secure.a.com and www.a.com set
>           their document.domain to a.com): same origin
>     viii. secure.a.com -> login.a.com -> (301) www.a.com (where
>           secure.a.com and www.a.com set their document.domain to a.com):
>           different origin
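
The examples for option 2 can be modelled roughly as below. This is a
toy model of the examples as listed, not of what any browser actually
does with document.domain. The names effective_domain and
same_origin_option2 are invented for the sketch, scheme and port are
assumed equal throughout, and a redirect hop is represented with no
document.domain because a 301/302 response cannot run script to set it.

```python
def effective_domain(host, document_domain=None):
    """Domain used for comparison: document.domain if set, else the FQDN."""
    if document_domain is None:
        return host
    # A page may only relax to its own host or to a parent domain of it.
    if host != document_domain and not host.endswith("." + document_domain):
        raise ValueError(f"{host} cannot set document.domain to {document_domain}")
    return document_domain


def same_origin_option2(path):
    """`path` is a list of (host, document_domain_or_None) pairs.

    It runs from the previous page, through any redirect hops, to the
    destination. The navigation counts as same-origin only if every
    entry ends up with the same effective domain.
    """
    return len({effective_domain(host, dd) for host, dd in path}) == 1
```

Even this toy version shows why redirects are a problem: the
intermediate hop always contributes its full host name, so example viii
can never come out as same origin.
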
>
>
>
> 3. Using private domain as the origin
>
>    · Two pages have the same origin if the tuple (scheme, private domain,
>      port) is the same.
>        i. private domain, in this definition, would be defined as Eric
>           Lawrence explains in "Understanding Domain Names in Internet
>           Explorer"
>             1. Which is similar to Mozilla Foundation's Public Suffix list
>                http://publicsuffix.org/, + 1 label
>
>    · Advantages:
>        i. Automatically works for all subdomains under the private domain
>           without the page having to allow it via JavaScript or HTTP
>           headers
>
>    · Disadvantages:
>        i. Loosest definition of same-origin.
>       ii. Current public suffix list implementations are either heuristics
>           or based on maintained lists, which cannot be quickly updated,
>           and are not perfect.
>      iii. Would allow different-owner sites of subdomains such as
>           a.blogspot.com to get data about b.blogspot.com, with no way for
>           b.blogspot.com to restrict access.
>
>    · How a Web Developer would use this: The developer would not have to
>      do anything to get this behavior
>
>    · Examples:
>        i. a.com -> a.com: same origin
>       ii. a.com -> b.com: different origin
>      iii. a.fed.us -> b.fed.us: different origin (fed.us is a public
>           suffix)
>       iv. z.a.fed.us -> x.a.fed.us: same origin (fed.us is a public
>           suffix)
>        v. secure.a.com -> a.com: same origin
>       vi. x.y.z.a.b.c.a.com -> a.com: same origin
>      vii. secure.a.com -> www.a.com: same origin
>     viii. secure.a.com -> login.a.com -> (301) www.a.com: same origin
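
The "public suffix + 1 label" rule in option 3 could look something like
the sketch below. The three-entry PUBLIC_SUFFIXES set is a hand-picked
stand-in for illustration only; a real implementation would have to load
the full list, and this sketch ignores its wildcard and exception rules.
The function names are likewise made up here.

```python
# Hand-picked excerpt for illustration; NOT the real Public Suffix List.
PUBLIC_SUFFIXES = {"com", "us", "fed.us"}


def private_domain(host, suffixes=PUBLIC_SUFFIXES):
    """Return the longest matching public suffix plus one label, or None."""
    labels = host.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix matched


def same_private_domain(host_a, host_b, suffixes=PUBLIC_SUFFIXES):
    """True if both hosts fall under the same private domain."""
    domain_a = private_domain(host_a, suffixes)
    return domain_a is not None and domain_a == private_domain(host_b, suffixes)
```

The a.blogspot.com disadvantage falls straight out of this: with the
excerpt above, a.blogspot.com and b.blogspot.com both map to
blogspot.com and compare as the same, and they only become separate if
blogspot.com is added to the suffix set.
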
>
>
>
> 4. As specified in the HTML5 Editor's Draft section 5.3 for origin:
>    http://dev.w3.org/html5/spec/origin-0.html#origin, relaxed by CORS
>    http://www.w3.org/TR/cors/
>
>    · Two pages have the same origin if the tuple (scheme, host, port) are
>      the same, which can be relaxed via CORS (eg.
>      Access-Control-Allow-Origin)
>        i. host, in this definition, is the full host name AKA fully
>           qualified domain name (FQDN)
>
>    · Advantages:
>        i. Site owners could relax the security restrictions for their
>           subdomains as they desire via CORS
>       ii. Might be able to use it for server redirections (301, 302), as
>           the HTTP headers can set the allowable domains.
>
>    · Disadvantages:
>        i. CORS is in Working Draft (http://www.w3.org/TR/cors/)
>       ii. Not all browsers fully support CORS for all scenarios
>           (http://en.wikipedia.org/wiki/Cross-Origin_Resource_Sharing)
>      iii. CORS was intended for resources from a page, not for
>           page-to-page navigations
>             1. Any first page that wanted to allow NavigationTiming info
>                for a destination page would have to allow it via CORS'
>                Access-Control-Allow-Origin: http://destination.com HTTP
>                header when the first page is generated.  The first page
>                would have to know all of its potential exit-pages and
>                specify them or use *.
>             2. Using CORS in this way expands the security exposure for
>                root HTML pages more than CORS originally intended
>       iv. A more complex implementation, especially in the cases of
>           redirections (as HTTP headers would have to match for all paths
>           to the page).
>
>    · How a Web Developer would use this: Configuring the web server or
>      content generator to add HTTP headers (relatively complex and not
>      always available)
>
>    · Examples:
>        i. a.com -> a.com: same origin
>       ii. a.com -> b.com (where a.com does not specify
>           Access-Control-Allow-Origin): different origin
>      iii. a.com -> b.com (where a.com specifies
>           Access-Control-Allow-Origin: http://b.com): same origin
>       iv. secure.a.com -> www.a.com: different origin
>        v. secure.a.com -> www.a.com (where secure.a.com specifies
>           Access-Control-Allow-Origin: http://www.a.com): same origin
>       vi. secure.a.com -> www.a.com (where www.a.com specifies
>           Access-Control-Allow-Origin: http://secure.a.com): different
>           origin (destination page cannot specify the header, only the
>           previous page)
>      vii. secure.a.com -> login.a.com -> (301) www.a.com (where
>           secure.a.com specifies Access-Control-Allow-Origin:
>           http://www.a.com and login.a.com does not): different origin
>     viii. secure.a.com -> login.a.com -> (301) www.a.com (where
>           secure.a.com and login.a.com specify
>           Access-Control-Allow-Origin: http://www.a.com): same origin
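
As a rough model of the option 4 examples: every page on the way to the
destination (the previous page and each redirect hop) either shares the
destination's origin or has to name it in Access-Control-Allow-Origin.
This is one reading of the examples as written, not how CORS itself is
specified, and same_origin_option4 and its argument shapes are
assumptions made for the sketch.

```python
def same_origin_option4(path, destination_origin):
    """Decide whether the destination may see timing data for a navigation.

    `path` is a list of (origin, allow_origin_header_or_None) pairs for
    the previous page and every redirect hop, in order. The destination's
    own headers are deliberately not consulted, since only the pages
    being left can grant access.
    """
    for hop_origin, allow_origin in path:
        if hop_origin == destination_origin:
            continue  # plain same-origin hop, nothing to grant
        if allow_origin not in ("*", destination_origin):
            return False  # this hop did not opt in for the destination
    return True
```

Example vii corresponds to
same_origin_option4([("http://secure.a.com", "http://www.a.com"),
("http://login.a.com", None)], "http://www.a.com"), which fails on the
login.a.com hop.
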
>
>
>
> Workaround1: If we go with #1/#3, sites may still be able to get good (but
> not perfect) timing data since they control all domains in the navigation
> path.  For example, they could use cookies to track the navigation start
> time.
>
> Sorry for the verbosity but I think this level of detail is needed to get a
> good picture of all the options.
>
> We would love to hear your feedback!
>
>
>
> - Nic
Received on Monday, 8 November 2010 08:20:08 UTC