
Re: New version of Quality Tips for Webmaster

From: Alex Rousskov <rousskov@measurement-factory.com>
Date: Thu, 18 Apr 2002 09:38:18 -0600 (MDT)
To: Karl Dubost <karl@w3.org>
cc: www-qa@w3.org
Message-ID: <Pine.BSF.4.10.10204180807390.640-100000@measurement-factory.com>
Hi Karl,

	The "GET versus POST" tip on the TODO list got my attention. I
followed the link and discovered two W3C references that incorrectly
interpret HTTP specs. I hope my comments prevent somebody from
repeating the same mistakes when this tip is implemented.

I also believe that the tip itself is wrong, but I do not think I can
change the consensus here.


1. http://www.w3.org/DesignIssues/Axioms#state

   The document has messy terminology and misinterprets HTTP. This
   comes as a surprise to me, because Tim Berners-Lee's name is on
   HTTP RFC 2616. Perhaps the old date (1996) explains the mess?
   If so, perhaps an appropriate comment could be added since
   people continue to link to this document. On the other hand,
   the RCS $Id$ tag says 2002/04/17. Go figure.

   Specific comments are below.


   "For example, the implication is that the GET operation in HTTP is an
   operation which is expected to repeatably return the same result."

	In HTTP, GET is described as a Safe method, but being HTTP-safe
	does not mean returning the same result. It means lack of
	side-effects significant to the user.

   "As a result of that, anyone may know that under certain circumstances
   that they may instead of repeating an HTTP operation, use the result
   of a previous operation."

        HTTP caching has little to do with safe methods or with
        returning the same result. Even if each GET request generates a
        unique response and has 100 side-effects, that response may be
        cacheable if the server says so.
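The point can be sketched in a few lines. This is a toy Python sketch of my own, not the RFC 2616 cachability algorithm; the function name and the simplifications are mine, only the Cache-Control header names are real HTTP/1.1:

```python
# Rough sketch: whether a response may be stored is driven by what the
# server says (e.g. its Cache-Control header), not by whether the
# request that produced it had side-effects.

def may_store(headers):
    """Crude cachability check based only on server-sent headers."""
    cc = headers.get("Cache-Control", "")
    directives = [d.strip() for d in cc.split(",") if d.strip()]
    if "no-store" in directives:
        return False
    # for this sketch, "public" or an explicit freshness lifetime is enough
    return "public" in directives or any(d.startswith("max-age=") for d in directives)

# A GET with 100 side-effects can still produce a storable response:
print(may_store({"Cache-Control": "public, max-age=3600"}))  # True
print(may_store({"Cache-Control": "no-store"}))              # False
```

Note that nothing in the decision above asks whether the request was safe or idempotent; the server's declaration is what matters.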

   "The operation is "idempotent". This, in turn, allows software to use
    previously fetched copies of documents and it requires that the HTTP
    GET operation should have no side effects."

        "Idempotency" has very little to do with caching or with a lack
        of side-effects. HTTP idempotency means that the side-effects of
        N > 0 identical requests are the same as for a single request.
        Idempotency certainly does not require that GETs have no
        side-effects!
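To illustrate the distinction, here is a toy Python sketch of my own (the function names and the "store" are made up, not from any spec): one write that is idempotent and one that is not, both with side-effects.

```python
# Sketch: an idempotent write (set a value, PUT-like) vs a
# non-idempotent one (append, POST-like). Repeating the idempotent
# request leaves the server state exactly as after one request.

def put_name(store, value):      # idempotent: N requests == 1 request
    store["name"] = value
    return store

def post_comment(store, text):   # not idempotent: each request adds one more
    store.setdefault("comments", []).append(text)
    return store

s1, s2 = {}, {}
put_name(put_name(s1, "alex"), "alex")   # two requests
put_name(s2, "alex")                     # one request
print(s1 == s2)  # True: same end state

t1, t2 = {}, {}
post_comment(post_comment(t1, "hi"), "hi")
post_comment(t2, "hi")
print(t1 == t2)  # False: two comments vs one
```

Both functions have side-effects; only the repetition behavior differs, and that is all idempotency talks about.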

   "In HTTP, GET must not have side effects."

	In HTTP, GET "SHOULD NOT" have side-effects. That is, it is
	possible to be (conditionally) HTTP compliant while having
	side-effects for GET.

   "In HTTP, anything which does not have side-effects should use GET"

        HTTP actually defines several other methods that are safe (do not
	have side-effects): HEAD, OPTIONS, and TRACE.

   "[ repeatability of results ] leads to the whole concept of the Web as an
   information space rather than a computing program."

        Actually, computing programs usually produce repeatable results,
	given the same input. Information space can be created by
        computing programs, of course. Poor analogy?


2. http://www.w3.org/TR/html4/interact/forms.html#h-17.13


   "The "get" method should be used when the form is idempotent (i.e.,
   causes no side-effects)."

        Again, method idempotency is NOT a lack of side-effects. It is
        a lack of *additional* side-effects when the request is
        repeated. A very important distinction.

	For example, if requesting a GET URL causes a unique file to
	be deleted, the result is idempotent (several requests will
	result in the same state -- no file) but not safe (the file
	is gone).

        Another example is subscription to a mailing list. If the result
        of a GET request can subscribe me, but only once in a lifetime,
        the request is idempotent but not safe. The forms.html document
        above says that any subscription via GET is not idempotent.
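The subscription example can be written down directly. Again a toy sketch of my own; the names are invented for illustration:

```python
# Toy sketch of the once-in-a-lifetime subscription: the first request
# has a lasting side-effect (you are subscribed), so it is not safe;
# repeating it changes nothing further, so it is idempotent.

subscribers = set()

def subscribe(addr):
    """Returns True if this request actually changed the subscriber set."""
    changed = addr not in subscribers
    subscribers.add(addr)
    return changed

print(subscribe("alex@example.com"))  # True: side-effect happened
print(subscribe("alex@example.com"))  # False: no further effect
print(len(subscribers))               # 1
```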
	

In general, I find that the desire to make the Web safe and predictable
for the user has somehow resulted in rigid requirements on HTTP methods
and their interaction with HTML. The goal (a safe and predictable Web)
can be achieved in many ways. IMO, trying to severely restrict HTTP and
HTML use is a waste of effort because it handicaps Web site authors, and
they will simply ignore these kinds of tips. The Web sites I visit seem
to support this theory (well, at least they do not contradict it).

It is the _design_ of the Web site that should make that site safe and
predictable, and not whether the author uses GET or POST!

Many attribute success of the Internet to the original intention to
provide general communication mechanisms without really knowing (or
restricting) how those mechanisms would be used. This is related to the
famous end-to-end argument. It seems to me that tips like the above
are moving us in the opposite direction: restricting protocol use in
the hope that applications will become better.


Thanks,

Alex.


On Thu, 18 Apr 2002, Karl Dubost wrote:

> Olivier and I have worked on the page Quality Tips for Webmaster.
> 	http://www.w3.org/2001/06tips/
> 
> There's now a process to help people to submit new tips and a template.
> Participation is welcome.
> 
> The original idea has been developed by Dan Connolly.
> 	http://lists.w3.org/Archives/Public/www-qa/2001Sep/0025.html
> 
> 
> A change from Dan's version -> We would like the validator to
> take the list of Tips from the Overview page itself.
> 
> 	http://www.w3.org/2001/06tips/Overview.html
> 
> On this page, you have the list of tips with two statuses.
> 
> 	<li class="tip">...blabla...</li>
> 		approved tip
> 
> 	<li class="draft">...blabla...</li>
> 		draft tip (in discussion on www-qa)
> 
> Only items with class="tip" are usable by the HTML validator.
> 
> Thanks
> -- 
> Karl Dubost / W3C - Conformance Manager
> http://www.w3.org/QA/
> 
> --- Be Strict To Be Cool! ---
> 
> 
Received on Thursday, 18 April 2002 11:38:21 GMT
