Re: Googlebot doing POST

From: David Booth <david@dbooth.org>
Date: Wed, 02 Nov 2011 10:43:22 -0400
To: Karl Dubost <karld@opera.com>
Cc: "www-tag@w3.org List" <www-tag@w3.org>
Message-ID: <1320245002.30661.1365.camel@dbooth-laptop>

On Wed, 2011-11-02 at 07:29 -0700, Karl Dubost wrote:
> FYI
> The key sentence
> 
> 	So, while GET requests remain 
> 	far more common, to surface more content on the 
> 	web, Googlebot may now perform POST requests when 
> 	we believe it’s safe and appropriate. 

. . . thus *encouraging* the inappropriate use of POST by rewarding
sloppy web publishers.

IMO, Googlebot should *not* perform POST requests.  As a consequence, if
a publisher's content does not appear in Google's search results, that
will give that publisher a strong incentive to fix their pages.

David

> 
> On Wed, 02 Nov 2011 14:23:18 GMT
> In Official Google Webmaster Central Blog: GET, POST, and safely surfacing more of the web
> At http://googlewebmastercentral.blogspot.com/2011/11/get-post-and-safely-surfacing-more-of.html
> 
> As the web evolves, Google’s crawling and indexing 
> capabilities also need to progress. We improved 
> our indexing of Flash, built a more robust 
> infrastructure called Caffeine, and we even 
> started crawling forms where it makes sense. Now, 
> especially with the growing popularity of 
> JavaScript and, with it, AJAX, we’re finding more 
> web pages requiring POST requests -- either for 
> the entire content of the page or because the 
> pages are missing information and/or look 
> completely broken without the resources returned 
> from POST. For Google Search this is less than 
> ideal, because when we’re not properly discovering 
> and indexing content, searchers may not have 
> access to the most comprehensive and relevant 
> results.
> 
> We generally advise to use GET for fetching 
> resources a page needs, and this is by far our 
> preferred method of crawling. We’ve started 
> experiments to rewrite POST requests to GET, and 
> while this remains a valid strategy in some cases, 
> often the contents returned by a web server for 
> GET vs. POST are completely different. 
> Additionally, there are legitimate reasons to use 
> POST (e.g., you can attach more data to a POST 
> request than a GET). So, while GET requests remain 
> far more common, to surface more content on the 
> web, Googlebot may now perform POST requests when 
> we believe it’s safe and appropriate. 
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Wednesday, 2 November 2011 15:43:56 GMT
