Is Google Crawling Your Content? Check Google Translate

Dec 12, 2012 • 8:58 am | Filed Under Google Search Engine Optimization
 

Typically, if you are unsure whether Google is able to crawl and index your content, you can log in to Google Webmaster Tools and use the Fetch as Googlebot tool.

But if you do not want to use that tool, you can simply load your page through Google Translate and see whether Google Translate can access your site.
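To try this quickly, you can build the Google Translate link for a page yourself. The snippet below is only a sketch: the `translate.google.com/translate?sl=...&tl=...&u=...` pattern is an assumption based on how Translate's web page links have commonly looked and may change, and `translate_proxy_url` is a made-up helper name, not anything Google provides.

```python
from urllib.parse import urlencode


def translate_proxy_url(page_url, target_lang="en"):
    """Build a Google Translate link that asks Translate to fetch page_url.

    Assumption: the /translate?sl=auto&tl=<lang>&u=<url> query pattern;
    Google may change or retire it.
    """
    query = urlencode({"sl": "auto", "tl": target_lang, "u": page_url})
    return "https://translate.google.com/translate?" + query


if __name__ == "__main__":
    # Open the printed link in a browser and check which page actually loads.
    print(translate_proxy_url("http://www.example.com/"))
```

If the link shows something other than your own page (for example, the Google home page), the server is probably treating requests from Google differently.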

The tip came up in a Google Webmaster Help thread, where John Mueller from Google explained to one webmaster that Google is not indexing his site because, when Googlebot tries to crawl it, the server redirects the bot to Google.com.

He demonstrated this by sending the webmaster a Google Translate link for the site, which loads the Google home page instead of the site itself.

Clearly this is a move by a server guy to block spiders from crawling the site.

Tony McCreath tipped me off to this thread and in a private discussion area John from Google added (I am pretty sure he won't mind me sharing this despite it being in a private forum):

Google Translate is generally based on a real-time fetch of the page, the cached-version is usually a bit older. I think the feature-phone proxy also does something similar, but I'm not sure if that's really still around :).

Some hacked sites react to all kinds of Google IP addresses, so you can use tricks like this to trigger them. Others react to the user-agent (sometimes you can trigger them by using the Googlebot user-agent), others explicitly watch for Googlebot IP addresses, which you can't really check unless you're a verified owner in Webmaster Tools.

Since the preview images are generated either by Googlebot (when cached) or similar to Google Translate (when fetched live), sometimes you can reproduce quirks there using these tricks. It's not guaranteed, but it's a good tool to have on your side :)

Many of you probably knew this already, but I am sharing it just in case.

Forum discussion at Google Webmaster Help.
