Re: Feedback on the ping="" attribute (ISSUE-1) from Adam van den Hoven on 2007-11-07 (public-html@w3.org from November 2007)

From: Adam van den Hoven <adam.vandenhoven@gmail.com>
Date: Wed, 7 Nov 2007 08:56:48 -0800
To: HTML WG List <public-html@w3.org>
Cc: Mark Baker <distobj@acm.org>, Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>
Message-Id: <417F0EE5-262B-490B-B4FA-ECE9E0A7EF43@gmail.com>
On 6-Nov-07, at 2:11 PM, Ian Hickson wrote:
> On Tue, 6 Nov 2007, Mark Baker wrote:
>> On 11/6/07, Ian Hickson <ian@hixie.ch> wrote:
>>> On Tue, 6 Nov 2007, Mark Baker wrote:
>>>>>
>>>>> In that case I don't understand what we are discussing. Could you
>>>>> define the terms in more detail?
>>>>
>>>> Are there any specific terms you had in mind?  I think we all
>>>> understand what "safe" means.
>>>
>>> I don't think I do, since the way you are using it doesn't match
>>> what I understand of the term.
>>
>> Ok.  The closest thing to a definition that Roy's cited, AFAIK, can
>> be found here;
>>
>> http://lists.w3.org/Archives/Public/www-tag/2002Apr/0207.html
>>
>> But again, the word can be used in many contexts, including both the
>> contexts that are of importance here: message and implementation.  So
>> I'm not sure that will help.
>
> The above e-mail seems to imply that a message in HTTP is "safe" if it
> causes no loss of property for the user (with a loose definition of
> property here).
>
> By that definition, any method would be safe for the purposes of
> processing ping="", because the semantics of the message are simply
> that the user agent is notifying the server of an action.
>
> Note, though, that it seems that the danger is not in doing "safe"
> things with "unsafe" methods, but with doing "unsafe" things with
> "safe" methods. That is, doing "unsafe" work (work which can cause
> loss of property) is bad when you're using GET or HEAD -- but doing
> "safe" work as part of a message with "POST" is harmless. As far as
> I can tell.
>
> On the other hand, if we agree that "idempotent" means "has no
> important side-effects" (for anyone), then clearly ping="" is not
> idempotent, and so we have to have a non-idempotent method.
>
> Does that make sense?

It does to me, but I'm starting to see why this is causing so much  
discussion.

Perhaps it's worth looking at it from a different point of view.

This attribute is meant to satisfy the need for analytics while
providing users with (potentially) the ability to control whether or
not their interactions are tracked. A user who "chooses" to have their
interactions tracked (by not disabling the feature) could then be said
to have a need to have their interactions tracked. Consider the URI
"www.example.com/linktracking/link20923/", which would be the resource
that represents a specific link. If I were to GET that resource, I
would expect to get back information about the particular link (the
URLs on which it is found, the URLs to which it links, the number of
times it was clicked, etc.). I might then assign the URI
"www.example.com/linktracking/link20923/click" to the ping attribute.
When the user clicks on an audited link, the user is creating a new
instance of a click event resource for the
"www.example.com/linktracking/link20923/" resource. Because we are
creating a new instance, a POST is the correct method to use.
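
To make the GET/POST distinction above concrete, here is a minimal in-memory sketch. It is not a real server: the function names, the stored fields, and the shape of the returned click URI are all assumptions made for illustration; only the two URI patterns come from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LinkResource:
    """State behind a hypothetical /linktracking/<link_id>/ resource."""
    found_on: str
    links_to: str
    clicks: List[dict] = field(default_factory=list)


# Hypothetical store holding the one link used in the example.
LINKS: Dict[str, LinkResource] = {
    "link20923": LinkResource(
        found_on="http://www.example.com/",
        links_to="http://www.example.com/offer",
    ),
}


def get_link(link_id: str) -> dict:
    """Model of GET /linktracking/<link_id>/: reads state, changes nothing."""
    link = LINKS[link_id]
    return {
        "found_on": link.found_on,
        "links_to": link.links_to,
        "click_count": len(link.clicks),
    }


def post_click(link_id: str, clicked_from: str) -> str:
    """Model of POST /linktracking/<link_id>/click: creates a click event.

    Returns the URI of the newly created (hypothetical) click resource.
    """
    link = LINKS[link_id]
    link.clicks.append({"from": clicked_from})
    return f"/linktracking/{link_id}/click/{len(link.clicks)}"
```

Repeating get_link leaves the click count untouched, while every post_click adds a new event, which is the property that makes POST the natural fit here.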

I used something of a RESTful way of speaking intentionally, but the
reasoning applies regardless. When a request is made of the ping URI,
it is reasonable to say that it is creating a "new link audit event"
regardless of how it's implemented. Creating a new resource is always a
POST. Otherwise, it would perhaps be beneficial to define a new HTTP
method to better identify these sorts of requests and their meaning.
From what I read
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html), that is
allowed. I would suggest that AUDIT or PING would be natural (if not
the best) choices.
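
As a rough illustration that a client can at least construct a request with a non-standard method token, here is a small sketch using Python's standard urllib. The helper name and the http:// scheme on the URI are assumptions; nothing is sent over the network, and whether any server or intermediary would accept AUDIT or PING is a separate question.

```python
from urllib.request import Request

# The ping URI from the example above; the scheme is added as an assumption.
PING_URI = "http://www.example.com/linktracking/link20923/click"


def build_ping_request(method: str) -> Request:
    """Build (but do not send) a ping request using the given HTTP method."""
    # An empty body stands in for whatever payload a ping would carry.
    return Request(PING_URI, data=b"", method=method)


audit_request = build_ping_request("AUDIT")
post_request = build_ping_request("POST")
```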

I think that the feature, however, does not do enough to be
sufficiently useful that it will supplant the existing JavaScript
tracking. Specifically, knowing which link on which page was clicked
may not be sufficient. It is a reasonable use case to want to use a
single PING URI for multiple destination URLs. For example, I may have
a marketing campaign that I care about, and I want to track users'
interaction with the "Call To Action" link for that campaign. However,
the actual URL used for that link may change if the user is logged in.
It is reasonable to want to know not only how many people clicked the
call to action but also where they went. That would be possible if the
request included the destination URL, say in a header similar to
Referer that can be disabled.
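
A sketch of what the receiving side could do with such a header, assuming one existed. The header name "X-Ping-Destination" is purely hypothetical, as is the "(withheld)" bucket for users who have disabled it; each ping is modelled as a plain dict of request headers.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical header carrying the link's destination URL.
DEST_HEADER = "X-Ping-Destination"


def count_destinations(pings: Iterable[Dict[str, str]]) -> Counter:
    """Tally pings sent to one shared ping URI by their destination URL."""
    counts: Counter = Counter()
    for headers in pings:
        # A user who disabled the header still counts as a click,
        # but the destination is unknown.
        counts[headers.get(DEST_HEADER, "(withheld)")] += 1
    return counts
```

With this, a single campaign ping URI yields both the total number of clicks (the sum of all counts) and a breakdown of where users went.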

I'm also not clear on why only link tracking is singled out this way.
I can think of two additional types of audit events that are "hidden"
by JavaScript: page views and form abandonment. It seems to me that if
it's beneficial to expose link traversal tracking to users, then it
must also be beneficial to expose page view and form abandonment
tracking.

Adam
Received on Wednesday, 7 November 2007 17:00:39 UTC