
Re: whenToUseGet-7 counter-proposal

From: Mark Nottingham <mnot@mnot.net>
Date: Tue, 23 Apr 2002 22:57:46 -0700
Cc: <www-tag@w3.org>
To: "Joshua Allen" <joshuaa@microsoft.com>
Message-Id: <33330B0A-5748-11D6-A8C1-000A27836A68@mnot.net>
On Tuesday, April 23, 2002, at 10:28 PM, Joshua Allen wrote:

>> Regarding your proposed language, if systems cannot rely on HTTP GET
>> being safe, how will caching and crawling work at all?
>
> Most systems only cache and crawl URIs that don't have a querystring.
> You answered it yourself by saying "just don't submit forms".  If a
> form element says METHOD=GET, the parameters are going to be embedded
> in the querystring.  As a number of others have pointed out, the
> difference between METHOD=GET and METHOD=POST is irrelevant to most
> modern web server programming platforms (ASP, PHP, ColdFusion, JSP,
> Servlets, etc.).  When a developer decides to use a GET instead of a
> POST in his form, he has no idea that it should be idempotent.  In
> retrospect, it probably would have been smart for the tools to be
> designed to make this distinction clearer.

My point was that URIs with query strings don't need to be generated by 
a form.
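To illustrate the point: a URI with a query string is just a string, and a
cache or crawler cannot tell whether a form, a link, or a script produced
it. A minimal sketch (hypothetical example.com URI, not from the thread):

```python
from urllib.parse import urlencode, urlparse, parse_qs

# A METHOD=GET form submission and a hand-built link yield the same
# kind of URI: the parameters end up in the query string either way.
params = {"q": "whenToUseGet", "page": "2"}
uri = "http://example.com/search?" + urlencode(params)

parsed = urlparse(uri)
assert parsed.query == "q=whenToUseGet&page=2"

# An intermediary sees only the URI; nothing records how it was made.
assert parse_qs(parsed.query) == {"q": ["whenToUseGet"], "page": ["2"]}
```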


> This is just the way things are today; caching and crawling do not
> trust POST, and they do not trust querystrings.  Both are assumed to
> have potential side-effects.  It is possible that some edge caches
> will try to cache responses from URIs with querystrings, and maybe my
> experience with this is negative simply because pages that are
> dynamically created through server code and form fields (as with any
> METHOD=GET form) typically set the cache-control headers to no-cache.
> In fact, I once saw a situation with a prominent (non-Microsoft) ISP
> that was accidentally exposing customers' credit card numbers to one
> another because they were incorrectly caching dynamic content by
> *ignoring* the cache-control headers.  This was a fairly arcane bug,
> but you can bet credit cards would have been more widely compromised
> if this ISP had blindly cached any URI (they definitely did *not*
> cache URIs with querystrings unless the cache-control headers
> permitted it).

Interesting. My experience is completely different, and I wouldn't refer 
to that as an arcane bug at all.
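The heuristic Joshua describes -- cache a querystring URI only when the
cache-control headers explicitly permit it -- can be sketched as follows.
This is a hypothetical `is_cacheable` helper for illustration, not any
real cache's code:

```python
from urllib.parse import urlparse

def is_cacheable(uri: str, headers: dict) -> bool:
    """Hypothetical edge-cache heuristic: responses for URIs with a
    query string are cached only when Cache-Control explicitly allows
    it; other URIs are cacheable unless the headers forbid it."""
    cc = headers.get("Cache-Control", "").lower()
    if "no-store" in cc or "no-cache" in cc or "private" in cc:
        return False
    if urlparse(uri).query:
        # Conservative: require an explicit freshness grant.
        return "max-age" in cc or "public" in cc
    return True

# Plain URI: cacheable by default.
assert is_cacheable("http://example.com/page", {})

# Querystring URI: not cached without explicit permission...
assert not is_cacheable("http://example.com/search?q=x", {})

# ...but cached when the headers allow it.
assert is_cacheable("http://example.com/search?q=x",
                    {"Cache-Control": "public, max-age=600"})
```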


> And any crawlers I have used are deliberately designed to ignore URIs
> with querystrings.

See Paul's reference re: Google. I'd seen the same behaviour, but didn't 
have an example so handy. (Thanks, Paul!)


--
Mark Nottingham
http://www.mnot.net/
Received on Wednesday, 24 April 2002 01:57:50 GMT
