Verify The Bots Accessing Your Site: Is Google Sending That GoogleBot?

Mar 7, 2007 • 7:13 am | Filed Under Google Search Engine

There is no doubt that a ton of the bot activity on one's site comes from rogue spiders: spiders or bots that pretend to be legit bots but are really there to steal your content. We have covered several sessions on this in the past.

A new Cre8asite Forums thread asks how one can verify that GoogleBot is really from Google.

Matt Cutts posted a detailed "How to verify Googlebot" at the Webmaster Central Blog back on 9/20/2006, explaining how to do a reverse DNS lookup and then a forward DNS->IP lookup. In his words:

Telling webmasters to use DNS to verify on a case-by-case basis seems like the best way to go. I think the recommended technique would be to do a reverse DNS lookup, verify that the name is in the domain, and then do a corresponding forward DNS->IP lookup using that name; eg:

> host domain name pointer

> host has address

I don't think just doing a reverse DNS lookup is sufficient, because a spoofer could set up reverse DNS to point to

Of course, there are some ways to automate this. You can code it yourself, buy CrawlWall, or implement a solution similar to Ekstreme's PHP Search Engine Bot Authentication.
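For anyone going the code-it-yourself route, here is a minimal Python sketch of the reverse-then-forward check described above. It is only an illustration, not Google's or any vendor's implementation. The function name `is_verified_crawler`, its parameters, and the `.crawler.example` placeholder suffix are all made up for this example; you would pass in whatever domain the search engine documents for its crawler. The two lookup functions are parameters so the logic can be tried out without touching real DNS.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes,
                        reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Return True if ip passes a reverse DNS, then forward DNS, check.

    allowed_suffixes is a tuple of domain suffixes with a leading dot,
    e.g. (".crawler.example",). That value is a placeholder, not a real
    crawler domain.
    """
    try:
        # Step 1: reverse DNS lookup, IP -> host name.
        hostname = reverse_lookup(ip)[0].rstrip(".").lower()
    except OSError:
        # No reverse record (or lookup failure): cannot verify.
        return False

    # Step 2: the name must be inside one of the expected domains.
    # The leading dot keeps "evil-crawler.example" from matching.
    if not any(hostname.endswith(s.lower()) for s in allowed_suffixes):
        return False

    try:
        # Step 3: forward DNS lookup, host name -> IPv4 addresses.
        # (gethostbyname_ex is IPv4 only; IPv6 would need getaddrinfo.)
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # Step 4: the original IP must be among the forward results. This is
    # the part that catches a spoofer who only controls reverse DNS.
    return ip in addresses
```

A call such as `is_verified_crawler(remote_ip, (".crawler.example",))` would use live DNS. Since each check costs two lookups, you would probably want to cache the result per IP rather than run it on every request.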

Rogue spiders are no fun, as we have seen in cases with some forums.

Forum discussion at Cre8asite Forums.
