How Does Load Balancing Figure Into SEO Efforts?

Aug 1, 2006 • 10:26 am | comments (4) by twitter | Filed Under SEO - Search Engine Optimization

Load balancing is a process used in the Internet world to help "ease the burden" on the servers of highly-trafficked sites. The idea is that when a visitor clicks on a URL, he or she may be redirected to a ww1 or a ww2 (identical) version of the website. As a surfer, you may have noticed this happening before and wondered why. So if you use load balancing, will it affect your search engine optimization efforts?

I actually started a thread on this subject at Search Engine Watch Forums recently, and have received some decent answers so far. Drawing on the research, knowledge and responses of one of our developers, I asked fellow SEWans whether Google Sitemaps might solve the following problem with a load-balanced website:

...duplicate content that has been indexed. We have both ww1 and ww2 pages indexed, although mysteriously the ww2 has more pages in the index (almost all of them).

There have been some useful responses and solid suggestions so far, including using "a reverse-proxy with url rewrite" and also:

Is it a scripted site? (If so) Modify the script so that it checks the requested URL...If the URL is www then add nothing....If the URL is www1 or www2 then add the [meta name="robots" content="noindex"] tag to the page...Eventually you will only have one set of URLs indexed.
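
The check described in that suggestion could be sketched roughly as follows. This is only an illustration in Python, not the forum poster's code: the host names under example.com and the function name robots_meta_for_host are assumptions made up for the example, and a real site would do the same thing in whatever scripting language it already uses.

```python
# Hypothetical host names, used only for illustration.
CANONICAL_HOST = "www.example.com"
MIRROR_HOSTS = {"www1.example.com", "www2.example.com"}


def robots_meta_for_host(host: str) -> str:
    """Return the tag to add to the page head for the requested host.

    The canonical www host gets nothing added; the load-balanced
    mirror hosts get a robots noindex tag.
    """
    # Drop any port and normalise case before comparing.
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    if hostname in MIRROR_HOSTS:
        return '<meta name="robots" content="noindex">'
    return ""
```

The page template would then print the returned string inside the head element, so only the www URLs stay eligible for indexing.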

I would love to get some more ideas from others who have dealt with this problem, especially if they have tried Sitemaps. Please join the discussion at Search Engine Watch Forums and feel free to link in the comments to any other discussion or article about using Sitemaps to help with load balancing issues with SEO. Another thread at SEW also discusses load balancing.



Chris Beasley

08/01/2006 04:10 pm

That seems like a poor man's load balancing solution to me. Load balance the backend, not the frontend; it shouldn't be visible to end users or search engines.

chris boggs

08/01/2006 04:29 pm

Thanks Chris...some good ideas already being added to the discussion at the thread...

Jaimie Sirovich

08/01/2006 06:13 pm

I'm with Chris Beasley here. This sort of load-balancing is problematic. Doing a robots noindex/nofollow is not ideal either. The only thing that would potentially work is detecting the spider and always serving the "www" version of the content. However, strictly speaking, that's cloaking. Just doing a noindex/nofollow will disturb your linking structure intermittently AFAIK.

No Name

06/06/2008 03:13 pm

So I imagine you can load balance the server in some way with no external changes (according to the comments above). This would be a good idea. The rewrite rule probably would have done the trick, but why bother going into the complexities of that if the URLs are correct in the first place? I still see the occasional ww1 URL on some sites, but it seems like a second-best solution if load balancing can be done just on the backend… I once observed someone’s solution to duplicated ww1 URLs: 301 all of the ww1 pages to the homepage, no matter where they originally pointed (product pages, contact pages and news pages). I did offer advice, but it fell on deaf ears, much like the internet marketing campaign for that particular site… Some people!
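
For contrast with the redirect-everything-to-the-homepage approach in the comment above, a 301 that keeps each page's path could be computed along these lines. This is a minimal sketch under stated assumptions: the example.com host names and the function name canonical_redirect are hypothetical, and in practice the same mapping would more likely be expressed as a rewrite rule in the web server or proxy.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host names, used only for illustration.
CANONICAL_HOST = "www.example.com"
MIRROR_HOSTS = {"ww1.example.com", "ww2.example.com"}


def canonical_redirect(url: str):
    """Return the URL a mirror page should 301 to, or None if no redirect applies.

    Only the host is swapped; path and query string are preserved, so a
    product page redirects to the same product page on the canonical host.
    """
    parts = urlsplit(url)
    if (parts.hostname or "") not in MIRROR_HOSTS:
        return None
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
```

A request for a ww1 product page would then be sent to the matching www product page rather than to the homepage.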
