Google's Response On Browser Specific Cloaking

Nov 9, 2011 • 8:42 am | comments (4) | Filed Under Google Search Engine Optimization
 

Recently, Matt Cutts posted a detailed video on cloaking, but he did not touch on the specific issue of serving different content or pages based on the browser accessing the web page. He did discuss serving content based on geo-location and to mobile users.

One webmaster/developer asked in the Google Webmaster Help forums whether browser-specific cloaking is allowed.

Google's Matt Cutts provided a rare response in the forum, answering the question in short by saying, "the easiest explanation is to serve Googlebot a safe version of your page that would work in any browser that you support with the expectation that someone viewing the cached version of a page could be using any browser."

But he also provided a much longer answer, and I figure most of you have no idea he posted it. So here is his full answer:

In order to answer this question well, I need to talk a little bit about cloaking. The short definition of cloaking is showing different content to Googlebot than to users. Historically, Google has taken a very strong stance against cloaking because we believe it's a bad user experience. Google wants to fetch and judge the same page that users will see if they click on a site in search results. We also made a longer video if you’d like to learn more: http://www.youtube.com/watch?v=QHtnfOgp65Q .

Deceptive or malicious cloaking would be showing Googlebot a page of text about Disney cartoons while showing users pornography, for example. Typically to help site owners avoid cloaking, we recommend showing Googlebot the identical content that a site's typical desktop web browser would see. For example, if the most common web browser to a site is IE7, then provide Googlebot with the exact same page that IE7 would get.

That advice is less helpful for companies that provide performance optimizations at a granularity that varies from browser to browser. For example, Chrome supports things like data URIs (to provide inline data in web pages) or WebP images that other browsers don’t support. So websites could return those sorts of things for visitors surfing with Chrome. Then the question naturally emerges about how to treat Googlebot?

The main litmus test for cloaking is whether you are doing something special or different for Googlebot that you're not doing for other visitors or users. In the vast majority of typical cases, it's preferred to treat Googlebot like a typical not-too-cutting edge (think IE7, for example) browser. However, in the very specific case where you're offering specific performance improvements with browser-level granularity, it can be okay to optimize the page based on the user agent where Googlebot is just another user agent and you take into account the capabilities of the Googlebot "browser" in the same way that you do with other browsers. For example, one specific example would be to provide data URIs to Chrome browsers (which does support data URIs) but not to Googlebot (which currently doesn’t support data URIs).

The main questions that the webspam team (which is responsible for enforcing our quality guidelines about cloaking) will ask if we get a spam report concerning a page are:

- Is the content identical between user agents? (If the answer is yes here, you're fine.)
- If not, how substantial are the differences and what are the reasons for the differences?

If the reason is for spamming, malicious, or deceptive behavior--or even showing different content to users than to Googlebot--then this is high-risk. For example, we already provide ways for servers returning Flash to show the same page to users and to Googlebot, so don't do this for Flash; see http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=72746#1 for more info about Flash.

Remember, I'm talking about very small differences for performance optimization reasons, such as inlining images. The way that I would recommend implementing browser optimizations like this would be the following: rather than specifically targeting Googlebot by UA string, build an affirmative list of browsers that support a given capability and then just treat Googlebot as you would a browser that wasn't on the whitelist.

Again, this guidance applies only when you’re delivering really granular performance optimizations based on the capabilities of individual browser types. Anything beyond that quickly becomes high-risk, and it would be a pretty good idea to check with Google before doing anything too radical in this space. The easiest explanation is to serve Googlebot a safe version of your page that would work in any browser that you support with the expectation that someone viewing the cached version of a page could be using any browser.
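To make that whitelist approach a bit more concrete, here is a rough sketch in Python of what capability-based serving could look like. The browser patterns, the supports_data_uris() helper and the image_src() function are purely illustrative assumptions on my part, not anything Google has published; the point is simply that Googlebot is never targeted by UA string, it just isn't on the list of browsers known to handle inline data URIs:

    import base64
    import re

    # Hypothetical whitelist of user-agent patterns for browsers this site
    # has verified can handle inline data-URI images. Googlebot is
    # deliberately not listed, so it falls through to the plain markup just
    # like any other browser that isn't on the list.
    DATA_URI_BROWSERS = [
        re.compile(r"Chrome/\d+"),
        re.compile(r"Firefox/\d+"),
    ]

    def supports_data_uris(user_agent):
        # Affirmative capability check: only whitelisted browsers qualify.
        return any(p.search(user_agent) for p in DATA_URI_BROWSERS)

    def image_src(user_agent, image_path, image_url):
        # Whitelisted browsers get the image inlined as a data URI;
        # everyone else (including Googlebot) gets the ordinary image URL.
        if supports_data_uris(user_agent):
            with open(image_path, "rb") as f:
                encoded = base64.b64encode(f.read()).decode("ascii")
            return "data:image/png;base64," + encoded
        return image_url

Either way the visitor sees the same image and the same content; only how the bytes are delivered differs, which is exactly the narrow performance-optimization case Matt describes as acceptable.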

So there you have it: if you're unsure, ask in the Google forums.

Forum discussion at Google Webmaster Help.

Image credit: Sam Cornwell / ShutterStock.

 

Comments:

Webnauts

11/10/2011 01:51 am

But what if you hide portions of text or images from Googlebot, but not from users?

Affordable SEO Services

11/10/2011 08:35 am

We do a lot of web site acceleration, and for the most part we use techniques that degrade gracefully and can be sent to all browsers. But some optimizations will break pages in browsers that don't support them, and we were wondering how we should serve content to Googlebot without getting flagged for cloaking. I have a specific example below, but to be clear, the user experience is identical in all cases: the text and links are identical, and the loaded page is pixel-perfect across all versions; it's just the HTML structure (and the performance) that differs.

Webnauts

11/10/2011 10:11 am

Can you explain what is nice about the information?

Wayne

12/12/2011 02:36 pm

Sweet! Thank you for providing this! Wayne
