[Bug 19028] New: Support a rel attribute that restricts cookie transmission

https://www.w3.org/Bugs/Public/show_bug.cgi?id=19028

           Summary: Support a rel attribute that restricts cookie
                    transmission
           Product: HTML WG
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P3
         Component: HTML5 spec
        AssignedTo: dave.null@w3.org
        ReportedBy: contributor@whatwg.org
         QAContact: public-html-bugzilla@w3.org
                CC: ian@hixie.ch, mike@w3.org,
                    public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org, julian.reschke@gmx.de,
                    annevk@annevk.nl, ayg@aryeh.name,
                    w3c@getify.myspamkiller.com


This was cloned from bug 11235 as part of operation LATER convergence.
Originally filed: 2010-11-05 13:15:00 +0000
Original reporter: Alexander Romanovich <alex@sirensclef.com>

================================================================================
 #0   Alexander Romanovich                            2010-11-05 13:15:16 +0000 
--------------------------------------------------------------------------------
One of the objectives of a CDN involves using a cookie-less domain to cut down
on the amount of data that needs to be transferred to the server to make a
request. I'm in a situation where I cannot use a cookie-less subdomain because
of a need to use ".foo.com" for cookies so that states persist across a
non-fixed set of subdomains on the host. For this reason I have a number of
subresources that transmit cookie data needlessly on any given page.

It would seem useful to allow a "rel" attribute on script/link/img tags that
tells the browser to not send cookies with that subrequest. This would make it
possible and very easy for developers to cut down on the amount of cookie data
being sent, especially in situations when they lack the authority or have a
specific obstacle to creating a cookie-less CDN to serve the resources over.
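
To illustrate why a cookie scoped to ".foo.com" rides along on every
subresource request, here is a minimal sketch of cookie domain matching. It is
a simplified reading of the usual domain-match rule (it ignores IP addresses
and public-suffix checks), and every host name other than ".foo.com" is a
hypothetical example.

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Return True if a cookie scoped to cookie_domain would be sent to request_host.

    Simplified: no IP-address or public-suffix handling.
    """
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().strip(".")
    # Either the host is the cookie domain itself, or it is a subdomain of it.
    return host == domain or host.endswith("." + domain)


# Hypothetical hosts; only ".foo.com" comes from the report above.
hosts = ["foo.com", "www.foo.com", "static.foo.com", "notfoo.com", "foo-static.net"]
matches = {h: domain_match(h, ".foo.com") for h in hosts}
```

Under these assumptions any subdomain of foo.com receives the cookie, so moving
static files to a subdomain does not help; only a separate registered domain
(the hypothetical foo-static.net here) escapes it.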
================================================================================
 #1   Kyle Simpson                                    2010-11-05 14:00:36 +0000 
--------------------------------------------------------------------------------
+1. I think this is a fantastic idea for improved web performance optimization.
================================================================================
 #2   Julian Reschke                                  2010-11-05 14:04:04 +0000 
--------------------------------------------------------------------------------
Potential overlap with "noreferrer"?
================================================================================
 #3   Alexander Romanovich                            2010-11-05 14:18:51 +0000 
--------------------------------------------------------------------------------
I thought of adding this behavior to an existing rel value, but then the
nomenclature would be a little misleading (since cookies and referrers are not
related).

That said, the general idea of bundling this behavior with other ones triggered
by the same attribute would be fine, so long as they're all things that a
developer would want together, and not individually.
================================================================================
 #4   Anne                                            2010-11-08 09:10:59 +0000 
--------------------------------------------------------------------------------
Maybe we should have something like "anonymous" (similar to AnonXMLHttpRequest)
which kills credentials, Referer, and Origin.
================================================================================
 #5   Julian Reschke                                  2010-11-08 09:19:58 +0000 
--------------------------------------------------------------------------------
Indeed.

It would be great if it could *replace* noreferrer, but that ship probably has
sailed.
================================================================================
 #6   Alexander Romanovich                            2010-11-08 14:20:47 +0000 
--------------------------------------------------------------------------------
A rel="anonymous" would probably fit the bill perfectly (restricting cookies,
HTTP auth, SSL certs, referrer, and origin). (Though according to this source,
the origin header should only be sent with script requests of the 3 types of
requests I originally mentioned: https://wiki.mozilla.org/Security/Origin)

I'm in the CMS business, and I'm thinking here of all the content we generate
(particularly image thumbnails for individual news stories, etc. which would
not be appropriate to make into sprites). Our product typically drives pretty
large web sites, and the ability to use this flag globally in page output would
probably have a dramatic effect across the board. Removing credentials and
extra headers from these requests is an improvement, and would become an asset
for security as well.
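
As a rough sketch of what an "anonymous" request might strip, assuming a
request is modeled as a plain dictionary of headers: the function name and the
header set below are illustrative assumptions, not part of any specification,
and SSL client certificates are not modeled because they live below the header
layer.

```python
from typing import Dict

# Headers a hypothetical "anonymous" request would omit.
CREDENTIAL_HEADERS = {"cookie", "authorization", "referer", "origin"}


def anonymize(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers without credential or provenance headers."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in CREDENTIAL_HEADERS
    }


# Hypothetical thumbnail request as a browser might send it today.
original = {
    "Accept": "image/png",
    "Cookie": "session=abc; __utma=1.2.3",
    "Referer": "http://www.foo.com/news/",
}
stripped = anonymize(original)
```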
================================================================================
 #7   Kyle Simpson                                    2010-11-10 13:53:25 +0000 
--------------------------------------------------------------------------------
I've definitely been in favor of this proposal, especially the suppressing of
cookies.

I ran it by Billy Hoffman (http://zoompf.com) and he brought up a good point
that we need to consider.

There are apparently some servers/applications that are intentionally
configured to log out a user session if a request is received that has no
cookies. Honestly, I'm not actually sure how that would work, because I'm not
sure how the server knows which session to kill if there was no cookie to
identify to the server who the request came from. But, nevertheless, apparently
this is a reality out there.

So, the obvious point is that anyone who relies on such functionality in their
application (for whatever reason, intentional or not) couldn't use
rel="anonymous" to suppress cookies without logging out users.

On the surface, my reaction was to say that such strange setups would just be
unable to use this rel feature.

But Billy pointed out that such things can be used in a DoS attack. For
instance, evil.com can have an <img> tag on it that points to an image on
bank.com, and uses rel="anonymous" to force the user to be logged out. Now, in
my opinion, this type of DoS is rather benign, but I guess it's real
nonetheless.

So, this is what I propose:

We restrict the behavior of rel=anonymous to only work (at least in terms of
cookies) if the resource is on the same domain (exactly) as the page domain. It
would be silently ignored for requests to resources on other domains.

This should be fine for CDN usage, because CDNs in general are not sending out
cookies. Or, rather, the issue we're trying to solve is much more about all the
global cookies that are set on a local domain (like analytics tracking cookies,
etc.) that are unnecessarily bogging down static resource requests. So, the vast
majority of those requests will be to the same page-domain, which would benefit
from the rel=anonymous behavior being discussed.

Thoughts?

--Kyle
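
The restriction proposed above could be sketched roughly as follows. This is
only an illustration of the rule as stated (exact host equality, silently
ignored otherwise); the function name is hypothetical, and it compares host
names only, not scheme or port.

```python
from urllib.parse import urlsplit


def suppress_cookies(page_url: str, resource_url: str, rel: str) -> bool:
    """Return True if cookies should be withheld from the subresource request."""
    # The behavior only applies when the author opted in with rel="anonymous".
    if "anonymous" not in rel.lower().split():
        return False
    page_host = urlsplit(page_url).hostname
    resource_host = urlsplit(resource_url).hostname
    # Cross-domain use is silently ignored, so evil.com cannot force
    # a cookie-less request to bank.com.
    return page_host is not None and page_host == resource_host


same_domain = suppress_cookies(
    "http://www.foo.com/news/", "http://www.foo.com/thumb.png", "anonymous"
)
cross_domain = suppress_cookies(
    "http://evil.com/", "http://bank.com/logo.png", "anonymous"
)
```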
================================================================================
 #8   Anne                                            2010-11-10 14:04:07 +0000 
--------------------------------------------------------------------------------
If that is a real problem that would be a problem with XMLHttpRequest as well.
Could you raise that on public-webapps@w3.org?
================================================================================
 #9   Kyle Simpson                                    2010-11-10 14:38:57 +0000 
--------------------------------------------------------------------------------
I think the mitigation of XHR is that normal XHR only works same-domain, and
even CORS requires the server to handle the pre-flight authorization before a
real cross-domain request can come in, whereas <script>, <link>, <img> etc can
all freely make cross-domain requests.

Nevertheless, I'll check out that list. Do I need to join that WG before I can
post?
================================================================================
 #10  Anne                                            2010-11-10 14:47:24 +0000 
--------------------------------------------------------------------------------
No need to join. And you are wrong as to how cross-origin XMLHttpRequest
operates. It makes the request directly for simple GET requests. And this
already works in Firefox/Safari/Chrome. And also works in Internet Explorer
when using XDomainRequest. So I sort of think that vulnerability is already
inherent to the platform.
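
A simplified sketch of the distinction being drawn here, assuming the CORS
notion of "simple" methods and headers: requests that stay within those limits
go straight to the server, and only the others trigger a preflight. The
function name is hypothetical and the checks are deliberately incomplete.

```python
from typing import Dict

SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method: str, author_headers: Dict[str, str]) -> bool:
    """Return True if a cross-origin request would be preceded by a preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in author_headers.items():
        lowered = name.lower()
        if lowered in SIMPLE_HEADERS:
            continue
        if lowered == "content-type":
            # Only the media type matters; parameters such as charset are ignored.
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type in SIMPLE_CONTENT_TYPES:
                continue
        return True
    return False


plain_get = needs_preflight("GET", {})
json_post = needs_preflight("POST", {"Content-Type": "application/json"})
```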
================================================================================
 #11  Ian 'Hixie' Hickson                             2010-12-29 08:37:27 +0000 
--------------------------------------------------------------------------------
I agree that the problem described is a real one: that images, scripts, and
style sheets are often served from separate domains to avoid sending cookies
and that doing so is hard in some cases such as that described in comment 0.
However, rel="" can't fix this since neither images nor scripts have a rel=""
attribute. It would have to be something like a "nocookie" attribute or some
such.

The usual solution is to just use an entirely separate domain (e.g. yimg.com).

Is this really common enough to warrant new syntax features in HTML?
================================================================================
 #12  Julian Reschke                                  2010-12-29 08:42:05 +0000 
--------------------------------------------------------------------------------
(In reply to comment #11)
> I agree that the problem described is a real one: that images, scripts, and
> style sheets are often served from separate domains to avoid sending cookies
> and that doing so is hard in some cases such as that described in comment 0.
> However, rel="" can't fix this  since neither images nor scripts have a rel=""
> attribute. It would have to be something like a "nocookie" attribute or some
> such.
> ...

*If* we decided to add new attributes, we of course *could* add @rel to img and
script.
================================================================================
 #13  Alexander Romanovich                            2010-12-29 17:29:32 +0000 
--------------------------------------------------------------------------------
I can't comment on how common usage of an attribute like this would be across
the web in general, but I can briefly describe the effect I would anticipate
from such a feature in the content management world, which is what prompted me
to file this request.

I work on a large scale CMS in education. Looking at a typical client, there
are images (most frequently thumbnails accompanying news stories, event
listings, galleries, etc.) that appear often in list format on a large number
of pages throughout any given site. It is not a situation where yimg would be
desirable, as there are several reasons why our custom image
manipulation/deployment tools are used by our clients on their local servers as
opposed to remote hosting, and other reasons why these particular clients may
not wish to use third party remote hosting to store their image media in
general. Using Google Analytics alone, which is a distinct commonality, you're
looking at a large chunk of cookie data being transmitted to the server per
image request. (The same is true for script and link tags which the CMS embeds
in templates.) These are high traffic sites, and the CMS often powers more than
one site running on the same machine (I have one client that has chosen to
drive six sites with the CMS on one server).

In other words: frequent usage of script, link, and especially image tags
throughout a web site, multiplied by at least the large chunk of Google
Analytics cookie data per unique request, multiplied by the volume of users of
a high traffic site, multiplied by the number of sites on that server,
translates to a lot of time spent receiving data from the end user, which as
you know is much slower than sending data to them. Since it would be trivial to
modify our codebase in order to apply a new attribute to content we generate,
it would be an equally trivial process to cut bandwidth usage and server
response time for users of our CMS, across the board. I would imagine adoption
by WordPress, etc. would result in a similar story, as well as sites which
simply display a large quantity of resources. The benefit here is that it
provides a painless option to reduce bandwidth usage and resource load latency
when the alternatives (which have been noted here) are either impossible or
incompatible, or simply inconvenient.
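
The multiplication described above can be worked through with placeholder
numbers. Only the six sites per server comes from this comment; the cookie
size, subresource count, and page views are hypothetical, and the result is an
upper bound because it ignores browser caching.

```python
def upstream_cookie_bytes(
    cookie_bytes_per_request: int,
    subresources_per_page: int,
    page_views_per_day: int,
    sites_per_server: int,
) -> int:
    """Bytes of cookie data uploaded per day across all sites on one server."""
    return (
        cookie_bytes_per_request
        * subresources_per_page
        * page_views_per_day
        * sites_per_server
    )


# Hypothetical: 300 bytes of cookies, 40 subresources per page,
# 100,000 page views per day per site, six sites on the server.
daily_bytes = upstream_cookie_bytes(300, 40, 100_000, 6)
daily_gigabytes = daily_bytes / 1_000_000_000
```

With these assumed inputs the total comes to 7.2 GB of cookie data per day that
the server has to receive without ever using it.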
================================================================================
 #14  Ian 'Hixie' Hickson                             2011-01-23 20:30:55 +0000 
--------------------------------------------------------------------------------
It's not clear to me that HTML is the right place for a solution for this. For
instance, whatever solution we used you'd still want to be able to control
whether images or fonts in a style sheet had cookies or not, you'd still want
to be able to control what happened to resources imported by scripts (including
those without underlying DOM nodes, like Worker and EventSource objects), and
you'd want it to cover a whole raft of features even just within HTML, like
<video src>, <source>, <object>, <iframe>, etc.

My recommendation at this stage would be to approach browser vendors and
encourage them to experiment with different ideas for addressing this, so that
we gain implementation experience.
================================================================================
 #15  Alexander Romanovich                            2011-01-23 21:24:53 +0000 
--------------------------------------------------------------------------------
I guess you're right that an HTML attribute would be too limiting with regard to
controlling this functionality in all aspects of web browsing. But where would
such an approach, with the larger scope you're describing, be implemented
exactly?

I assume you're not talking about browser settings, since this is something a
web developer would want to control on a case-by-case basis (since some
requests require cookie transmission to maintain logins, for example). I
suppose you could send a header along with the main resource that instructs the
browser how to behave with respect to different kinds of future subresource
requests on the page, but that would get tricky if it was the sole source of
instruction about so many different types of requests. Otherwise, I could
imagine HTML being the right place for controlling this so long as there's an
equivalent solution(s) that applied to the additional cases you mentioned (i.e.
a request in the generic sense knows how to be anonymous, but the HTML
attribute is just one of several ways to switch that flag on).

I'd be happy to approach some browser vendors, once I'm clearer on where the
possibilities lie for implementing this, if not just an HTML attribute.
================================================================================
 #16  Kyle Simpson                                    2011-01-23 23:49:32 +0000 
--------------------------------------------------------------------------------
I agree that more thought would need to go into all the possible resource
requests which would benefit from this type of functionality.

But it still seems like having `rel` as an attribute for all those different
containers, with a value that said "suppress cookies", would be sufficient. If
any of those containers don't yet support the `rel` attribute, it wouldn't seem
too onerous to extend rel to those containers.

Even if we're discussing sub-requests (like a script loads more scripts), those
requests are always done via dynamically creating one of the containers in
question, in which case setting the `rel` property should suffice, right?

I think the only other concern would be if XHR requests should support a way to
suppress in the same way, and I think that it should. I'm not sure `rel` for
XHR would make much sense, but perhaps something like "sendCookies" or
whatever.
================================================================================
 #17  Ian 'Hixie' Hickson                             2011-02-16 08:58:15 +0000 
--------------------------------------------------------------------------------
I don't really know what a good solution would be, unfortunately. Maybe some
sort of API or declarative solution where you can whitelist URL prefixes that
don't get cookies and so on?
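
One way such a prefix whitelist might behave, as a minimal sketch only: no
such API exists, and the function name and the URL prefixes are hypothetical.

```python
from typing import Iterable


def sends_cookies(url: str, cookieless_prefixes: Iterable[str]) -> bool:
    """Return False if url falls under a prefix declared as cookie-less."""
    return not any(url.startswith(prefix) for prefix in cookieless_prefixes)


# Hypothetical prefixes a page might declare as not needing cookies.
prefixes = ["http://www.foo.com/thumbs/", "http://www.foo.com/static/"]

thumbnail = sends_cookies("http://www.foo.com/thumbs/story-1.png", prefixes)
login_page = sends_cookies("http://www.foo.com/account/login", prefixes)
```

Because the decision depends only on the URL, a list like this would also cover
requests made from style sheets, scripts, and workers, which is the gap a
per-element attribute leaves open.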

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the tracker issue; or you may create a tracker issue
yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: none yet
Rationale: For administrative purposes I'm going to mark this one "LATER" until
we have more implementation experience.
================================================================================


Received on Tuesday, 25 September 2012 21:56:03 UTC