
Google Now Crawling Content Behind Forms

We should have seen this coming, given the number of reports that Google was submitting GET forms. Reports like that are often hard to validate, though, because people spoof Googlebot and use similar tactics. In any event, Google now admits to crawling through HTML forms. Here are some things to know about this announcement, in bullet form:

  • For select menus, check boxes, and radio buttons on the form, Google will choose from among the values in the HTML
  • After gaining access to content past the form, Google may or may not index that content
  • You can block Googlebot from crawling your forms by excluding them in your robots.txt file
  • Googlebot will only attempt to crawl GET forms
  • Googlebot tries to avoid forms requesting userids, login, passwords, contact information and so on
  • This should not impact PageRank
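To make the first and fourth bullets concrete, here is a rough sketch of how a crawler could turn a GET form into crawlable URLs by combining the values already present in the HTML. It is only an illustration, not Google's actual method: the class `GetFormParser`, the function `candidate_urls`, the `limit` parameter, and the example.com URLs are all made up for this example, and it ignores free-text inputs entirely.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode, urljoin


class GetFormParser(HTMLParser):
    """Collects select/radio/checkbox values from the first GET form on a page."""

    def __init__(self):
        super().__init__()
        self.action = None          # action of the first GET form, once found
        self.in_form = False        # True while inside that GET form
        self.current_select = None  # name of the <select> being read, if any
        self.choices = {}           # field name -> list of values from the HTML

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            # A form with no method attribute defaults to GET; POST forms are skipped.
            method = (a.get("method") or "get").lower()
            if method == "get" and self.action is None:
                self.action = a.get("action") or ""
                self.in_form = True
        elif not self.in_form:
            return
        elif tag == "select":
            self.current_select = a.get("name")
        elif tag == "option" and self.current_select and a.get("value"):
            self.choices.setdefault(self.current_select, []).append(a["value"])
        elif tag == "input" and a.get("type") in ("radio", "checkbox"):
            if a.get("name") and a.get("value"):
                self.choices.setdefault(a["name"], []).append(a["value"])

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False
        elif tag == "select":
            self.current_select = None


def candidate_urls(page_url, html, limit=10):
    """Return up to `limit` GET URLs built from the form's own values.

    Simplification: an action URL that already has a query string is not handled.
    """
    parser = GetFormParser()
    parser.feed(html)
    if parser.action is None or not parser.choices:
        return []
    names = list(parser.choices)
    base = urljoin(page_url, parser.action)
    urls = []
    for combo in product(*(parser.choices[n] for n in names)):
        urls.append(base + "?" + urlencode(list(zip(names, combo))))
        if len(urls) >= limit:
            break
    return urls
```

Every URL this produces is an ordinary GET request, which is why the resulting pages can be fetched and, possibly, indexed like any other link.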

Matt Cutts of Google explains how this meets the needs of the many webmasters who are clueless about SEO. In fact, for making the web more accessible, this new crawling technique rocks. But SEOs and webmasters who want to block Google from accessing content will need to make some changes; that is, they will have to restructure some of their sites to block Googlebot from crawling their forms.
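If you go the robots.txt route, it can help to sanity-check the rule before relying on it. The sketch below uses Python's standard `urllib.robotparser` as a stand-in; Google's own matching may differ in details. The `/search` path, the example.com URLs, and the helper `is_blocked` are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: /search is assumed to be the path your form submits to.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A URL produced by submitting the form, and an ordinary page for comparison.
    print(is_blocked(ROBOTS_TXT, "Googlebot", "http://example.com/search?cat=a"))
    print(is_blocked(ROBOTS_TXT, "Googlebot", "http://example.com/about.html"))
```

Because the rule is a path prefix, it covers every query string the form can generate, so you do not need one rule per drop-down value.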

The forum reaction is very mixed. We have threads at Sphinn, DigitalPoint Forums, Search Engine Watch Forums and WebmasterWorld.

Pros: Google can crawl places they haven't and index more of your content, which gives you more visibility.
Cons: Pages you do not want indexed might require more work on your part to block.

The big joke in WebmasterWorld is that Googlebot now has a credit card. For example, if it can submit forms, maybe Googlebot will start messing around with conversions. Obviously, it won't place orders, but what about submitting a simple form that you consider to be a conversion (e.g. a user agreement)? In fact, I found Googlebot filling out this Google Checkout form to buy itself some WD40 (kidding, of course):

Googlebot Gets a Credit Card

But you get the point.

Forum discussion at Sphinn, DigitalPoint Forums, Search Engine Watch Forums and WebmasterWorld.

posted rustybrick in Google Optimization at April 14, 2008 7:50 AM Comments (6)

Comments

what does this mean exactly?

 

When you mark up a web page and create a form with drop-down boxes and input fields, Google can index the name/id tags of those drop-downs and input fields.

 

How is this news? Years ago I used this as a technique: make sure there was a forms page with little content in it but related key phrases in the drop-down menu. As I recall, the page did well for the phrases. These were only extant in the drop-downs; I don't believe there were inbounds using anchors to the form.

BB

 

Bill, are you joking? This is news.

 

Ah. About a fortnight later and in another context, I begin to get it. Are we talking javascripted drop-downs and menus, the kind that were formerly untraversable to Google spiders? That IS different.
BB

 

My immediate (if apparently belated) reaction is that Google are chipping away at my income by making aspects of what I do slowly redundant :-( Further I can no longer trust my spider-checkers to advise me on what can and can't be crawled. Google need really, in light of this, to offer a Google Spider simulator, a completely up-to-date and honest one. Fat chance, eh? I think we'll all still have to carry on as we have regarding spidering for pure site navigation - usability - in general because there's Yahoo and MSN to think of. Keeping content unindexable so your credit card details aren't immortalised by accident in The Wayback Machine, that may well have got disturbingly harder. We'll all be back to using phones for everything confidential soon. This could well be a step forward that has us fleeing back into the past.

BB

 
