SEO Considerations For SOPA Website Blackouts

Jan 17, 2012 • 8:40 am | Filed Under Google Search Engine Optimization

Many sites are taking a stand with the anti-SOPA movement by blacking out their websites for a 24-hour period tomorrow, Wednesday.

Wikipedia is probably the most vocal site doing this. It wrote:

The blackout is a protest against proposed legislation in the United States - the Stop Online Piracy Act (SOPA) in the U.S. House of Representatives, and the PROTECT IP Act (PIPA) in the U.S. Senate - that, if passed, would seriously damage the free and open Internet, including Wikipedia.

This will be the first time the English Wikipedia has ever staged a public protest of this nature, and it’s a decision that wasn’t lightly made.

So you want to make a stand but don't want to lose your shirt by doing so? Google offers advice on how to black out (turn off) your website for a day without hurting your short-term and long-term rankings in the Google search results.

Pierre Far from Google posted the advice on his Google+ page, and John Mueller from Google posted it at Google Webmaster Help. Here it is; I suggest you follow it carefully if you are blacking out your website for SOPA.

tl;dr: Use a 503 HTTP status code but read on for important details.

Sometimes webmasters want to take their site offline for a day or so, perhaps for server maintenance or as political protest. We're currently seeing some recommendations being made about how to do this that have a high chance of hurting how Google sees these websites and so we wanted to give you a quick how-to guide based on our current recommendations.

The most common scenario we're seeing webmasters talk about implementing is to replace the contents on all or some of their pages with an error message ("site offline") or a protest message. The following applies to this scenario (replacing the contents of your pages) and so please ask (details below) if you're thinking of doing something else.

1. The most important point: Webmasters should return a 503 HTTP header for all the URLs participating in the blackout (parts of a site or the whole site). This helps in two ways:

a. It tells us it's not the "real" content on the site and won't be indexed.

b. Because of (a), even if we see the same content (e.g. the "site offline" message) on all the URLs, it won't cause duplicate content issues.

2. Googlebot's crawling rate will drop when it sees a spike in 503 headers. This is unavoidable but as long as the blackout is only a transient event, it shouldn't cause any long-term problems and the crawl rate will recover fairly quickly to the pre-blackout rate. How fast depends on the site and it should be on the order of a few days.

3. Two important notes about robots.txt:

a. As Googlebot is currently configured, it will halt all crawling of the site if the site's robots.txt file returns a 503 status code for robots.txt. This crawling block will continue until Googlebot sees an acceptable status code for robots.txt fetches (currently 200 or 404). This is a built-in safety mechanism so that Googlebot doesn't end up crawling content it's usually blocked from reaching. So if you're blacking out only a portion of the site, be sure the robots.txt file's status code is not changed to a 503.

b. Some webmasters may be tempted to change the robots.txt file to have a "Disallow: /" in an attempt to block crawling during the blackout. Don't block Googlebot's crawling like this as this has a high chance of causing crawling issues for much longer than the few days expected for the crawl rate recovery.

4. Webmasters will see these errors in Webmaster Tools: it will report that we saw the blackout. Be sure to monitor the Crawl Errors section particularly closely for a couple of weeks after the blackout to ensure there aren't any unexpected lingering issues.

5. General advice: Keep it simple and don't change too many things, especially changes that take different times to take effect. Don't change the DNS settings. As mentioned above, don't change the robots.txt file contents. Also, don't alter the crawl rate setting in WMT. Keeping as many settings constant as possible before, during, and after the blackout will minimize the chances of something odd happening.
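
To make points 1 and 3 concrete, here is a minimal sketch in Python using only the standard library. It is not taken from Google's post. The names BLACKOUT_PREFIXES, PROTEST_PAGE, ROBOTS_TXT, and choose_response are hypothetical, and a real site would more likely set this up in its web server or CMS configuration. The sketch returns a 503 with the protest message for every participating URL and keeps robots.txt at its usual 200 with unchanged contents, which point 3a calls for when only part of a site is blacked out.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical settings for illustration only; none of these names come from Google's advice.
BLACKOUT_PREFIXES = ("/",)  # "/" blacks out the whole site; use e.g. ("/blog/",) for a partial blackout
PROTEST_PAGE = b"<h1>This site is offline today in protest of SOPA and PIPA.</h1>"
ROBOTS_TXT = b"User-agent: *\nDisallow:\n"  # stand-in for the site's normal, unchanged robots.txt


def choose_response(path, prefixes=BLACKOUT_PREFIXES):
    """Return (status, content_type, body) for a request path during the blackout."""
    # Point 3: robots.txt keeps its usual 200 status and its usual contents.
    if path == "/robots.txt":
        return 200, "text/plain", ROBOTS_TXT
    # Point 1: every participating URL answers 503 with the protest message.
    if any(path.startswith(prefix) for prefix in prefixes):
        return 503, "text/html; charset=utf-8", PROTEST_PAGE
    # Non-participating URLs are served as usual (placeholder content in this sketch).
    return 200, "text/html; charset=utf-8", b"<p>Regular page content.</p>"


class BlackoutHandler(BaseHTTPRequestHandler):
    def _respond(self, include_body):
        # Ignore any query string when deciding whether the URL participates.
        status, content_type, body = choose_response(self.path.split("?", 1)[0])
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._respond(include_body=True)

    def do_HEAD(self):
        self._respond(include_body=False)


if __name__ == "__main__":
    # Local test server; the address and port are arbitrary choices.
    HTTPServer(("127.0.0.1", 8000), BlackoutHandler).serve_forever()
```

With the script running locally, requesting any page and then /robots.txt (for example with curl -I http://127.0.0.1:8000/robots.txt) is a quick way to confirm that the page returns 503 while robots.txt still returns 200. Whatever setup you use, the same check against your live site before the blackout starts is worth the minute it takes.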

Forum discussion at Google+ and Google Webmaster Help.
