Zero-edits Counter Proposal for Issues 1 and 2 (Ping) from Tab Atkins Jr. on 2010-02-15 (public-html@w3.org from February 2010)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Mon, 15 Feb 2010 09:48:28 -0600
To: public-html@w3.org
Message-ID: <dd0fbad1002150748s30af0888j909af4cd6ed0c4d0@mail.gmail.com>

Summary
=======
This is a counter-proposal for Issues 1 and 2.  The current change
proposal recommends removing the @ping attribute and associated
sections from the spec.  I argue that @ping is a potentially useful
feature with many positive attributes, and should be maintained.  In
this Change Proposal I will enumerate some of these positive
attributes, and directly address some of the complaints found in the
original Change Proposal
(http://lists.w3.org/Archives/Public/public-html/2009Dec/0183.html).
This change proposal is culled partially from the htmlwg wiki page at
http://www.w3.org/html/wg/wiki/AddedAttributePing.


Rationale
=========

Why This Attribute Should Be Added
----------------------------------
1. Ping provides added functionality. It allows the user agent to
inform users which URIs are going to be pinged, and gives
privacy-conscious users a way to turn it off. (Current auditing
methods make it difficult or impossible for users to see what is
being tracked or to turn the tracking off.)

2. The user reaches the destination without waiting for tracking URLs
to be pinged and regardless of whether pinging the tracking URLs is
actually successful. (Current auditing methods typically direct the
user to a tracking url first, which then redirects the user to the
actual page.  This produces a visible delay in navigation and
introduces an additional point of failure for the url.)

3. The POST method is used for @ping, consistent with its semantics of
being non-idempotent.  UAs that should not trigger non-idempotent
changes, such as web spiders, can choose to avoid doing so. (Current
auditing methods typically use a normal GET method in the form of a
navigation to a tracking url.  Web spiders cannot avoid activating the
tracker without avoiding the link entirely.)

4. Allows for multiple URLs to be pinged. (Current auditing methods
ping a single URL.  They can conceivably string together multiple
redirects to ping several urls, but that degrades the user experience
even further and makes the entire thing substantially more fragile.)

5. Allows the UA to indicate links to destinations that have already
been visited even if the tracking URLs are different. (Current
auditing methods often change the url to embed unique tracking
information, which defeats the ability of a UA to mark visited links
appropriately for the user.)
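
To make points 1 through 4 concrete, here is a minimal sketch of the
requests a UA might issue when a link carrying @ping is activated.
It is an illustration only, not text from the spec: the names
plan_click and PlannedRequest, the send_pings switch, and the example
URLs are all hypothetical, and the sketch assumes that @ping holds a
whitespace-separated list of URLs and that each ping is a POST whose
body is "PING".

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urljoin


@dataclass(frozen=True)
class PlannedRequest:
    method: str   # "GET" for the navigation, "POST" for each ping
    url: str
    body: bytes


def plan_click(href: str, ping_attr: str, base_url: str,
               send_pings: bool = True) -> List[PlannedRequest]:
    """Return the requests a hypothetical UA would issue for one click."""
    # The navigation goes straight to the destination; no tracking URL
    # sits between the user and the page.
    planned = [PlannedRequest("GET", urljoin(base_url, href), b"")]
    if not send_pings:
        # A web spider, or a user who has turned pings off, stops here
        # and still reaches the destination.
        return planned
    # Assumption: @ping is a whitespace-separated list of URLs, each of
    # which receives its own POST with the body "PING".
    for token in ping_attr.split():
        planned.append(
            PlannedRequest("POST", urljoin(base_url, token), b"PING"))
    return planned
```

Calling plan_click("/article", "http://stats.example/hit /log",
"http://site.example/") yields one GET for the destination followed by
one POST per ping URL; passing send_pings=False yields the GET alone.
Because the list of ping URLs is available before any request is sent,
a UA could also show it to the user.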

Counter-Arguments to Arguments For Why This Attribute Should Not Be Added
-------------------------------------------------------------------------
> The feature is half-baked, insufficiently implemented, and
> therefore not yet suitable for standardization.
While I argue that "half-baked" is incorrect, "insufficiently
implemented" describes the vast majority of the spec *currently*, and
is not a reason to remove anything.  That reasoning only becomes
relevant when one has reason to believe that the feature will *never*
be sufficiently implemented (for example, if a major browser went on
record as refusing to implement it).

> The ping feature was added to HTML based on speculation that
> an optional mechanism would be usable instead of the typical
> redirect, javascript, or gateway-based tracking mechanisms.
> However, it cannot be used reliably until all browsers have
> implemented ping, are deployed, and do not configure it "off"
> by default.  Sites would therefore either ignore the ping
> feature until all of the browsers turn it on or use it only
> for secondary counts, thus duplicating the traffic that already
> handles this functionality.
Again, this is true of most features in HTML5.  Only a lucky few are
useful before wide implementation.  As implementations arrive, authors
will begin to use the feature.

> Ping would never be capable of proving undercounts [the sole
> apparent reason for this new feature] because there is no
> guarantee that the two DNS requests will deliver equally
> reachable servers for the ping and href, nor that the href
> request will succeed before the ping succeeds, nor that the
> href URL corresponds to the ping-per-referral URL.  It is for
> all of those reasons that people use redirects, referer,
> javascript, and cookies today to do tracking and those will
> never be solved by ping.
I'm not entirely certain what this objection is getting at, but I
believe it's stating that since @ping might be unreliable if some
parts of the request chain fail, people will refuse to use it.  Many
of these potential failure scenarios apply to *all* auditing methods,
however, and ping solves them more gracefully (for example, if the
ping server fails while using a redirection-based auditing method, the
link itself becomes useless; using @ping would just mean that no pings
were recorded while the server was down).
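
The difference in failure behaviour can be sketched as a toy model.
This is a simplification under stated assumptions, not a description
of any real tracker: the function names and the Outcome type are
hypothetical, and each server is reduced to a single up/down flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    reached_destination: bool
    click_recorded: bool


def click_via_redirect(tracker_up: bool, destination_up: bool) -> Outcome:
    """Redirect-based auditing: the tracker sits in the navigation path."""
    if not tracker_up:
        # The user never receives the redirect, so the link is useless
        # and nothing is recorded.
        return Outcome(reached_destination=False, click_recorded=False)
    return Outcome(reached_destination=destination_up, click_recorded=True)


def click_via_ping(tracker_up: bool, destination_up: bool) -> Outcome:
    """@ping-based auditing: navigation and ping are independent requests."""
    # A dead tracker only loses the ping; the navigation is unaffected.
    return Outcome(reached_destination=destination_up,
                   click_recorded=tracker_up)
```

With the tracker down and the destination up, click_via_redirect
reports that the user is stranded, while click_via_ping reports that
the user arrives and only the ping is lost.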

As well, some of the methods listed as being in use today and implied
as continuing to be the primary method in the future are unreliable
even today.  Referer tracking and cookie-based tracking can be shut
down relatively easily by a savvy user, though for the most part they
are harmless and so are left on.

> Also, as described in ISSUE-1, ping's use of POST causes an
> unsafe method to be used in response to a safe activation request,
> in violation of the method constraints that have been part of
> Web architecture since 1992.
POST is the correct method to use to reflect @ping's semantics.

> The actions generated by a user agent should be consistent
> with the actions selected by the user.  That is why TimBL had an axiom
> about GET being safe -- clicking on a link (or a spider wandering
> around) must be translated into a safe network action because to do
> otherwise would require every user to know the purpose of every
> resource before the GET.  It follows, therefore, that the UI for a
> user action that is safe (a link) must be rendered differently from
> all other actions that might be unsafe.
While this may theoretically be a good rule to follow, CSS has long
since made this issue moot.  A <form method=post><input type=submit
value=foo></form> can be rendered identically to <a href="">foo</a>.
It was in some ways moot from the start, as there is no visible
difference between a form using GET and one using POST, until you
attempt to refresh the page the form lands you on.

That said, it is perfectly possible and perhaps appropriate for a UA
to indicate the @ping targets to the user.  Note that current auditing
methods make this effectively impossible.

> The discussion on ping assumes that the ping target is expecting
> to receive a POST request with the content "PING" (i.e., that the
> target has not been deliberately supplied to fool an unsuspecting
> user into triggering an unsafe action when they select the link).
> That is an invalid assumption -- the target of the ping could be
> any URL, including those that do fun things like delete wiki pages
> or print documents or send mail ... we've been through this all
> before, and not all unsafe resources even read the body before
> taking an action on behalf of the user.  That's why HTTP and HTML
> both have requirements on use of safe methods.
Badly configured servers are always a threat.  @ping does not increase
the attack surface here.  This attack can be trivially performed with
only existing tools and be essentially invisible to the user.  One
could simply use any normal network tool to send a POST request at the
server.  One could also embed a hidden <iframe> in a page with a <form
method=post> in it, and use javascript to submit the form.  This would
allow one to distribute the attack across all visitors to the page.

I believe that most pages are not badly configured enough to fail when
POSTed with nothing more than "PING".
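
As an illustration of what "not badly configured" might look like, the
following sketch shows a state-changing endpoint that inspects the
request body before acting.  Everything in it is hypothetical (the
handler name, the "page" and "csrf_token" field names, and the status
codes chosen); it only shows that a handler expecting real form data
has nothing to act on when the body is just "PING".

```python
from urllib.parse import parse_qs


def handle_delete_post(body: bytes, expected_token: str):
    """Return (status, deleted_page) for a hypothetical delete endpoint."""
    # Parse the body as form data; a bare "PING" contains no key=value
    # pairs, so it produces an empty mapping.
    fields = parse_qs(body.decode("utf-8", errors="replace"))
    page = fields.get("page", [None])[0]
    token = fields.get("csrf_token", [None])[0]
    # Refuse to act unless the expected fields are present and valid.
    if page is None or token != expected_token:
        return 400, None
    return 200, page
```

A ping body is rejected without side effects, while a genuine form
submission carrying both fields is accepted.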

(Content below comes from the wiki page, not the original Change Proposal.)

> Most of the current script based tracking solutions add details
> about the UA such as window size and current page title. If @ping
> doesn't support this there is a risk of this attribute not being used.
There is a risk here, but it seems like something we can address when
we see actual usage.  At a later point, if necessary, we can either
discuss adding such details or dropping the feature.

> A script based solution can add the tracking behaviour to all links
> on a page. How is @ping supposed to work for links added in a WYSIWYG
> editor where the content editor is unaware of the concept of "tracking"?
WYSIWYG editors will have to change to support automatically adding
@ping to pages.  Many HTML5 features are not supported by current
WYSIWYG authoring tools, though, so this is not a sufficient reason to
drop the feature.

> Relying on it for tracking while some UAs do not support it will lead
> to inaccurate tracking
Authors can either wait for a sufficiently large degree of support, or
supplement this with javascript-based tracking in legacy UAs.

> It can be disabled, which will lead to inaccurate tracking
Many existing auditing methods, such as Referer tracking or cookies,
can also be disabled by the user.  In practice, however, most users
leave them on, and so they are commonly relied on for tracking
purposes.  I see no particular reason why @ping should be any
different; it's even less of a privacy risk.

> By adding a @ping attribute pointing to your web site to every link
> (navigation or otherwise), I can cause unnecessary load on your server
> in a way that is invisible to the casual end user
This method of user-initiated DDoSing is already possible through
easier and more effective means, such as simply inserting <img>
elements into a page.  @ping does not increase the attack surface.

> POST is an unsafe HTTP method. UAs should not invoke unsafe methods
> without the user's consent. If at all, please use a safe method (GET/HEAD...)
> instead.
As stated previously, while this might be a valid principle
*theoretically*, in practice this has *never* been true.  A form using
GET and one using POST are perfectly identical to the end-user unless
they attempt to refresh the requested page, and CSS allows an <a> and
a <form method=post> to be styled identically.  The UA can, of course,
expose the @ping information to the user, perhaps in a manner similar
to how the target of the link is currently exposed.

As well, POST is clearly the correct method for @ping's semantics.


Details
=======
No edits to the draft.


Impact
======
Authors gain the ability to use @ping, which is a useful feature with
many advantages over current auditing methods.


~TJ
Received on Monday, 15 February 2010 15:49:17 UTC