Re: CHANGE PROPOSAL: Remove ping and hyperlink auditing (ISSUE-1 and ISSUE-2) from Roy T. Fielding on 2009-12-08 (public-html@w3.org from December 2009)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Mon, 7 Dec 2009 18:40:41 -0800
To: Kornel Lesinski <kornel@geekhood.net>
Cc: "public-html@w3.org" <public-html@w3.org>
Message-Id: <D962C0F5-4788-46CF-848F-D3AFDC4BD468@gbiv.com>

On Dec 7, 2009, at 3:32 PM, Kornel Lesinski wrote:

> On Sun, 06 Dec 2009 01:47:22 -0000, Roy T. Fielding <fielding@gbiv.com> wrote:
> 
>> Regarding ISSUE-1 (PINGPOST) and ISSUE-2 (PINGUI), this is a
>> formal change proposal to remove the ping attribute and all
>> mention of hyperlink auditing from the HTML5 specification.
>> The feature is half-baked, insufficiently implemented, and
>> therefore not yet suitable for standardization.
> 
> It was thoroughly discussed, had an experimental implementation, and has been redesigned in response to feedback. That's about the same as most HTML 5 features.

You must mean new features, right?  Because the vast majority
of HTML is quite old.  I would not mind the removal of every
new HTML5 feature that has had less positive deployment
experience than ping.  Feel free to point those out.

However, the reason I spent my time on a change proposal for
the removal of ping is that it isn't just a UI feature --
its design calls for a waste of network bandwidth and its
deployment would be actively harmful to Web sites.

>> Hyperlink auditing is important because advertising and
>> referral-based user tracking are two of the primary means of
>> generating revenue via Web sites.  However, by its very nature,
>> such tracking must be comprehensive, accurate, and unavoidable
>> by a typical user or it simply won't be relied upon by site
>> owners and advertisers.  The ping feature is incapable of
>> providing such accuracy.
> 
> That's a good point. It would be nice to get feedback from advertisers about this.
> 
> However, ad tracking isn't the only use-case for ping. Search engines don't need perfectly reliable tracking of clicks on results, but need tracking to be unobtrusive. Similarly, e-commerce sites may want to track which page elements are used most. Reddit.com has a nice feature of marking links as visited on the server side, so they're synchronized between computers, and ping could be used for that too.

All of those needs are satisfied by existing, already deployed
technologies, far more accurately than they would ever be
with ping.

> E-mail newsletters usually track all links. It would be great if they could track clicks without obscuring URLs (letting anti-spam filters see all links).

It would be great if they didn't track links at all.
Presumably they have some reason for doing so, and I
am quite certain that reason does not include "oh,
but we'll be satisfied with only a few tracks even
though we could have all of them."

>> The ping feature was added to HTML based on speculation that
>> an optional mechanism would be usable instead of the typical
>> redirect, javascript, or gateway-based tracking mechanisms.
>> However, it cannot be used reliably until all browsers have
>> implemented ping, are deployed, and do not configure it "off"
>> by default.  Sites would therefore either ignore the ping
>> feature until all of the browsers turn it on or use it only
>> for secondary counts, thus duplicating the traffic that already
>> handles this functionality.
> 
> Yes, there will be a lengthy adoption period, but should we just give up now because of this?

Yes.  Zero deployment and no benefits obtained from deployment
is not an adoption period -- it is a waste of everyone's time
and bandwidth.  It has been two years since the issue was raised.

> Trackers similar to Google Analytics' TrackEvent API could use ping where available (set it via DOM instead of creating an image). It's also possible to do it the other way round, and emulate ping with JavaScript (AJAX, cookies or local storage + images) where there's no native implementation.

Which is why there is no need for ping in HTML5.  It is already
satisfied by deployed solutions.
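
As a rough illustration of one such deployed approach, the redirect
style of tracking mentioned earlier in this thread can be sketched in
a few lines. This is only a sketch under stated assumptions: the
tracker URL, the parameter names "to" and "c", and both function
names are hypothetical and do not come from any real tracking service.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical tracker endpoint, used only for illustration.
TRACKER = "https://tracker.example/r"

def tracking_href(destination: str, campaign: str) -> str:
    """Build the href that is placed in the page instead of the raw URL."""
    return TRACKER + "?" + urlencode({"to": destination, "c": campaign})

def resolve_redirect(tracking_url: str, log: list) -> str:
    """Tracker side: record the click, then return the redirect target.

    The click is logged on the same request that sends the user onward,
    so the count and the navigation cannot diverge.
    """
    query = parse_qs(urlparse(tracking_url).query)
    destination = query["to"][0]
    log.append((query["c"][0], destination))
    return destination
```

Because there is a single request path, a recorded click always
corresponds to a user who was actually sent to the destination.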

> The important part is that browsers may make ping requests as efficient and as unobtrusive for users as possible (e.g. multiple requests buffered and then pipelined, and perhaps even sent with the lowest QoS).
> 
> Perhaps ping should be exposed as JavaScript API as well? It might be easier for current trackers to use and JavaScript page counters could use it as well.

That would certainly make it easier to generate fraud pings.

>> Ping would never be capable of proving undercounts [the sole
>> apparent reason for this new feature] because there is no
>> guarantee that the two DNS requests will deliver equally
>> reachable servers for the ping and href, nor that the href
>> request will succeed before the ping succeeds, nor that the
>> href URL corresponds to the ping-per-referral URL.  It is for
>> all of those reasons that people use redirects, referer,
>> javascript, and cookies today to do tracking and those will
>> never be solved by ping.
> 
> With redirects you don't have a guarantee that the final destination will be reachable either. It's not uncommon for redirects to go through several parties, and each of them could break the chain.

If the chain breaks on redirects then the user doesn't reach
the final destination.  It isn't ideal, but it is easy to
detect and usually gets fixed in a hurry.

For hyperlink auditing, if the href goes through but the ping
fails, then the user might be happy or might not -- the server
is not adequately informed of the referral, and thus may supply
content which has not been appropriately refined for the user
(e.g., missing things like special pricing, group discounts,
or specific ad-campaign-related content).  If the ping goes
through but the href fails, then the server is faced with an
overcount problem that is not discernible from fraud.
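
The two failure cases can be made concrete with a small sketch. It
assumes, generously, that every click carries an identifier visible
in both the ping log and the href log; nothing in the ping design
guarantees that, and the function and key names are hypothetical.

```python
def reconcile(ping_ids: set, href_ids: set) -> dict:
    """Compare click identifiers seen by the ping and href servers."""
    return {
        # Both requests arrived: the referral is known and counted.
        "matched": ping_ids & href_ids,
        # href arrived without a ping: the referral is not recorded.
        "undercount": href_ids - ping_ids,
        # ping arrived without an href: looks the same as a forged ping.
        "overcount": ping_ids - href_ids,
    }
```

Even with this idealized shared identifier, the "overcount" bucket
carries no information that separates a failed href from fraud.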

> Ping gets users to the landing page quicker and with higher probability.

That is not a relevant concern for people who choose to track.
Go ahead and ask them whether they would be willing to trade
50% of their tracking-derived revenue for 300ms of initial-page
response time.  How about 25%?  10%?  Find out what their
threshold is and then get back to the WG with a real number.

>> Also, as described in ISSUE-1, ping's use of POST causes an
>> unsafe method to be used in response to a safe activation request,
>> in violation of the method constraints that have been part of
>> Web architecture since 1992.
> 
> Is the current widespread practice of using GET specifically for triggering side-effects better? If so, then ping can be made to use that too.

Of course it is better.  HEAD works as well.

>> In short, if the UI is being presented as a normal link, then the
>> HTTP methods resulting from the user's selection must all be safe
>> (GET/HEAD/OPTIONS/etc.).  While some user agents may already fail
>> to protect the user in that regard, that is not an excuse to add
>> another broken feature to the standard.
> 
> I think it is. There's no point worrying about a weak link in a chain that is already broken in a few places (form.submit() or a submit button with CSS/type=image are far more dangerous).

So, fix them.

> Today servers/applications without CSRF protection cannot be considered safe.

Bah!  CSRF is only a problem because clients suck at protection.
Making them suck more does nobody any good.

> Switching to a "safe" method may make very little difference in practice. Web applications that don't need the body of a POST at all are very likely to be exploitable via GET requests as well (e.g. PHP using the register_globals misfeature).

They might be, but at least those vulnerabilities are already
defined as being a violation of the relevant standard.  Those
server-side implementations will suck just as much with ping as
they would without it in HTML, whereas the ones that actually
do check the method (as required by HTTP) will be newly broken
by ping.
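
A minimal sketch of a server that does check the method shows what is
at stake. The handler below is hypothetical; the set of safe methods
is the GET/HEAD/OPTIONS list given in the change proposal.

```python
# Safe methods as listed in the change proposal (GET/HEAD/OPTIONS).
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})

def handle(method: str, state: dict) -> int:
    """Return an HTTP status code; only an unsafe method may change state."""
    name = method.upper()
    if name in SAFE_METHODS:
        return 200  # read-only request, state is left untouched
    if name == "POST":
        # The server trusts that a POST reflects a deliberate user action.
        state["actions"] = state.get("actions", 0) + 1
        return 200
    return 405  # any other method is not allowed in this sketch
```

A server written this way relies on a POST never being sent merely
because a user followed an ordinary link, which is exactly what a
POST-based ping would do.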

> The spec could say that non-graphical UAs that don't support JavaScript are not allowed to support ping either, and AFAIK there will be no new vulnerability.

And no value-add either, since a ping can already be done in
javascript.

> ping is easier to block than other CSRF exploits. Given that it's a new feature with a very specific purpose, servers could even be shipped with ping requests blocked by default.

I fail to see how that would help anyone.

>> That would also solve ISSUE-2 (PINGUI): the past two years have
>> demonstrated that a preferences UI should at least be figured
>> out before it is demanded of all implementations.
> 
> Disabling of ping can be coupled with disabling of the Referer header and 3rd party cookies. They are similar and already figured out.
> 
> Ping may be disabled when browsing in "private" mode (the definition of this mode varies between browsers, so not all may choose to do it) or by browser extensions like TORButton and NoScript (both block it already!).
> 
> I don't think ping needs more than that. It's certainly unreasonable to expect that users will manually check every tracking URL before clicking on any link. Auditing of tracking URLs has to be done automatically, and ping allows that.
> 
> There are already nastier tracking methods in widespread use that lack UI. To compete with them ping probably even _shouldn't_ have UI. Anyway, it gives browsers full control and tracking URLs, so browsers have a lot of freedom with UI implementation, if any is ever needed.

I think the point was that it hasn't been implemented, so
how can we expect to reach worldwide agreement on it as a
standard?  Standards are supposed to be agreements on how
everyone should implement, not speculation on how someone
might implement.

If ping has a future, then define it as an extension,
implement it as an extension, deploy it as an extension,
and finally adopt it as a standard for everyone else.

....Roy
Received on Tuesday, 8 December 2009 02:41:12 UTC