a/@ping discussion (ISSUE-1 and ISSUE-2), was: An HTML language specification vs. a browser specification from Julian Reschke on 2008-11-22 (public-html@w3.org from November 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sat, 22 Nov 2008 12:39:03 +0100
To: Ian Hickson <ian@hixie.ch>
CC: "Roy T. Fielding" <fielding@gbiv.com>, HTML WG <public-html@w3.org>
Message-ID: <4927EF57.8070105@gmx.de>
Ian Hickson wrote:
> ...
>> I don't care how long ping has been under consideration by WHATWG 
>> mailing lists, nor do I care how many fanboys have thought in the past 
>> that it is worth implementing. It represents a change to HTML (a harmful 
>> one at that). Place it on the block and let it fight for itself in terms 
>> of implementation. It should be a separate proposal until it has been 
>> successfully implemented by two independent implementations. Likewise 
>> for all of the other new additions.
> 
> This is certainly an interesting way of writing specifications, but it's 
> not how the W3C (or the IETF) has worked so far. Why would we start with 

Sorry? At least for the IETF it's tricky to make broad statements like 
this. But rest assured that at least in some Working Groups that 
*revise* a specification, new things do *not* get added; they get 
separate proposals, and if they succeed, potentially get included into 
the main spec.

> HTML5? Would it help if you consider HTML5 spec as it stands today to _be_ 
> the separate proposal?

Separate to what?

> ...
>> [It] would never be implemented consistently in practice
> 
> Could you elaborate on this? Why would it not? It seems simple to 
> implement, and the spec is pretty detailed.

"When the ping attribute is present, user agents should clearly indicate 
to the user that following the hyperlink will also cause secondary 
requests to be sent in the background, possibly including listing the 
actual target URLs." -- 
<http://www.w3.org/html/wg/html5/#hyperlink-auditing>

I would expect that for implementations to be consistent, there would 
need to be a proposal *somewhere* for how to implement this particular 
aspect.

>> [It] is trivial to defeat
> 
> That's by design. The whole point is to protect user privacy for users 
> who desire pings to be disabled.

Speaking of which: I can easily imagine that in certain countries, laws 
will require this feature to be turned off by default.

>> [It] is trivial to use for a DoS attack or mass fraud on the referral 
>> provider
> 
> Surely it's easier to use <img> for a DOS attack than ping="".

img/@src causes GET requests, while a/@ping causes POST requests.

> I don't understand how it would be easier to use for fraud than, say, 
> redirects, which in practice are what is used today.

Redirects cause GET requests.
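To make the difference concrete, here is a toy sketch (not browser code) of 
the HTTP method each mechanism mentioned above triggers. The mechanism labels 
and the function name are made up for illustration.

```python
# Toy model of the point made in this thread: which HTTP method each
# mechanism causes a user agent to send. The labels are hypothetical
# names for this sketch, not identifiers from any specification.
METHOD_BY_MECHANISM = {
    "img/@src": "GET",   # image fetch
    "redirect": "GET",   # following a redirect from a tracking URL
    "a/@ping": "POST",   # hyperlink auditing as drafted in HTML5
}


def methods_for(mechanisms):
    """Return the HTTP methods triggered by the given mechanisms, in order."""
    return [METHOD_BY_MECHANISM[m] for m in mechanisms]


print(methods_for(["img/@src", "redirect", "a/@ping"]))
```

Only the last entry produces a request that is not a GET.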

>> [It] is completely redundant to the current features provided by HTTP 
>> (cookies and referer) and HTML (any embedded request).
> 
> I don't understand how you would implement click tracking with any of 
> those. Could you provide code examples showing how one would translate the 
> following to an existing mechanism other than redirects? Found on, say, 
> example.net:
> 
>    <a href="http://example.com/" ping="http://example.org/">...</a>

This introduces an additional party (example.org) to the operation. Why 
is this needed?

>> From http://lists.w3.org/Archives/Public/public-html/2008Feb/0145.html:
>>
>> I see no actual implementations
> 
> Mozilla has an implementation, but it was disabled due to last minute 
> changes to the spec.

"The spec has changed under us so we're now at a state of no UI, no good
proposal for a UI, and no compliant implementation. Let's just disable 
it and see if we can get a good implementation for next firefox 
instead." -- <https://bugzilla.mozilla.org/show_bug.cgi?id=415168>

So the change of the spec was only one of three reasons being stated.

>> [I see] an overwhelming number of comments that indicate it isn't 
>> desirable in HTML.
> 
> Volume of comments one way or the other is not a technical argument.

But volume of comments can be an indicator of whether something has 
consensus or not.

>> From http://lists.w3.org/Archives/Public/public-html/2007Nov/0101.html:
>>
>> Not really.  The actions generated by a user agent should be consistent 
>> with the actions selected by the user.  That is why TimBL had an axiom 
>> about GET being safe -- clicking on a link (or a spider wandering 
>> around) must be translated into a safe network action because to do 
>> otherwise would require every user to know the purpose of every resource 
>> before the GET.  It follows, therefore, that the UI for a user action 
>> that is safe (a link) must be rendered differently from all other 
>> actions that might be unsafe.
>>
>> In short, if the UI is being presented as a normal link, then the HTTP 
>> methods resulting from the user's selection must all be safe 
>> (GET/HEAD/OPTIONS).  You can argue this one for the next few years if 
>> you like, but I'd be shocked if the TAG let anything else progress past 
>> the WD stage.
>>
>> I don't care how many user agents already get it wrong today. They are 
>> responsible for their own implementations.  We are responsible for the 
>> standards by which those implementations will be judged broken and 
>> liable for that broken behavior.
>>
>> The discussion on ping assumes that the ping target is expecting to 
>> receive empty-body POST requests (i.e., that the target has not been 
>> deliberately supplied to fool an unsuspecting user into triggering a 
>> non-safe action when they select the link).  But that is an invalid 
>> assumption -- the target of the ping could be any URI, including those 
>> that do fun things like delete wiki pages or print documents or send 
>> mail ... we've been through this all before and not all of them require 
>> bodies.  That's why HTTP and HTML both have requirements on use of safe 
>> methods.
> 
> Browsers should show that ping="" will cause a side-effect, that's pretty 
> much the whole point of the attribute. This is in line with what RFC 2616 
> says to do for unsafe methods -- tell the user.

Following a hyperlink needs to *stay* a safe operation.

Link auditing itself (whether desirable or not) is a safe operation from 
the p.o.v. of the user, and this is what's relevant here.
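As a rough sketch of the rule quoted above (a normal link may only result in 
safe methods), assuming the safe set named in the quoted message 
(GET/HEAD/OPTIONS); the function name and the request lists are hypothetical:

```python
# Safe methods as listed in the quoted message, not an exhaustive registry.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def link_stays_safe(triggered_methods):
    """True if every request caused by following the link uses a safe method."""
    return all(m.upper() in SAFE_METHODS for m in triggered_methods)


plain_link = ["GET"]              # navigation only
link_with_ping = ["GET", "POST"]  # navigation plus a POST ping, as discussed here

print(link_stays_safe(plain_link))
print(link_stays_safe(link_with_ping))
```

Under this check the plain link passes and the link with a POST ping does 
not, which is the objection in a nutshell.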

> A ping is non-idempotent, too, so we can't use GET.

I think you misunderstand the concept of idempotence, which could be 
caused by RFC 2616 being misleading.

See discussion in <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/27>.
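For reference, a toy sketch of the two properties as RFC 2616 section 9.1 
words them: "safe" means the request has no significance beyond retrieval, 
and "idempotent" means N > 0 identical requests have the same side-effects as 
a single one. The class and method names are invented for this illustration, 
and it does not try to settle the question raised in the ticket.

```python
class Resource:
    """Hypothetical resource used only to contrast the two properties."""

    def __init__(self):
        self.value = 0

    def get(self):
        # Safe: retrieval only, no change to the resource state.
        return self.value

    def put(self, new_value):
        # Not safe (it changes state), but idempotent: repeating the same
        # request leaves the resource in the same state as sending it once.
        self.value = new_value

    def post_increment(self):
        # Neither safe nor idempotent: every repetition changes the state again.
        self.value += 1


once, twice = Resource(), Resource()
once.put(5)
twice.put(5)
twice.put(5)
print(once.get() == twice.get())  # same state after one or two identical PUTs

once.post_increment()
twice.post_increment()
twice.post_increment()
print(once.get() == twice.get())  # state differs after one vs. two POSTs
```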

> Also, note that sites vulnerable to ping="" cross-site would already be 
> vulnerable to numerous CSRF attacks, so that's not an argument against 
> ping="" using POST.

Actually, it is. Just because there's already another way to cause harm 
doesn't mean it's ok to add another one.

And yes, sites need to handle arbitrary requests anyway; if they don't, 
they are buggy. But that's not an excuse for adding a way to create POST 
requests by simply navigating a web site.

>> I am well aware of how link tracking works and the entire history of the 
>> user tracking industry in Web protocols (due to a recent patent case), 
>> and you haven't even reached the most minimal requirements that a real 
>> site would need for tracking referrals, and would never be capable of 
>> proving undercounts [the sole apparent reason for this new feature] 
>> because there is no guarantee that the two DNS requests will deliver 
>> equally reachable servers for the ping and href, nor that the href 
>> request will succeed before the ping succeeds, nor that the href URI 
>> corresponds to the ping-per-referral URI.
> 
> There have been several groups that have said that ping="" is exactly what 
> they need, including (but not limited to) two groups at Google, which is a 
> pretty major player in click tracking. I'm sure it doesn't address 
> everyone's needs, and if there are changes that can be made to improve the 
> feature so that it covers even more groups, we certainly should consider 
> them, but saying that the feature doesn't match any real site's needs is 
> simply not true.

That may all be true. I can't verify it. That being said, there seems to 
be disagreement about it, and it would be good if that discussion 
occurred in the open, so that we wouldn't have to take your word for it.

> ...
> This concludes all the substantial feedback from you regarding ping="" 
> that I could find. Did I miss something?
> ...

To summarize:

- it's not clear that it will be used
- the way it's implemented over HTTP is problematic
- there's no proposal for a UI that would comply with the requirements 
in the spec

So this is a very good example of a part of HTML5 that clearly is not 
stable, does not have consensus, and could also *easily* be specified 
separately.

> ...

Best regards, Julian
Received on Saturday, 22 November 2008 11:39:46 UTC