Google's Network Topology

Jul 27, 2005 • 8:27 am | comments (3) | Filed Under Google Search Engine
 

We do not know exactly how Google currently sets up its datacenters. But a thread at WebmasterWorld asks the question, "How are Google's servers connected?"

lammert provides an extremely helpful and well-written response to the question. I cannot write it better myself, so I will quote it below.

Google operates a number of datacenters around the world. I am not sure about the exact number, but at the moment there are about 15. Each datacenter has one or more clusters, and each cluster consists of thousands of computers calculating the SERPs for your search query. When you do a query, you are connected with one of these datacenters. Which one is determined by the DNS settings of Google's nameservers, called ns1.google.com ... ns4.google.com.

The DNS servers play an important role in the load distribution and disaster recovery. When you request the IP address for www.google.com, the DNS server first replies with a canonical name. This name has the form www.X.google.com where X is a letter. At this moment the name www.l.google.com is returned from the location where I am working, but this can vary depending on location and time.

Then a second query is done to translate this canonical name to an IP address. Every canonical name of the form www.X.google.com returns 3 IP addresses which can be used by the browser to attach to the search engine.
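The two-step lookup described above can be sketched with a toy record table. This is only an illustration under stated assumptions: the `RECORDS` table and the `resolve` function are hypothetical, and the 192.0.2.x addresses come from a range reserved for documentation, not from Google.

```python
# Toy DNS table: hypothetical records, not real Google data.
# 192.0.2.0/24 is reserved for documentation examples.
RECORDS = {
    "www.google.com": ("CNAME", ["www.l.google.com"]),
    "www.l.google.com": ("A", ["192.0.2.10", "192.0.2.11", "192.0.2.12"]),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until A records are found.

    Returns (canonical_name, ip_addresses).
    """
    for _ in range(max_hops):
        if name not in records:
            raise KeyError(f"no record for {name}")
        rtype, values = records[name]
        if rtype == "A":
            return name, list(values)
        # CNAME: continue with the canonical name it points to.
        name = values[0]
    raise RuntimeError("CNAME chain too long")


canonical, addresses = resolve("www.google.com")
print(canonical)
print(addresses)
```

The first step yields the canonical name, the second yields the three addresses a browser could pick from.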

Throughout the day, you are not always connected to the same datacenter or cluster. This is because Google has decided to set an extremely short TTL (time to live) for the canonical name and IP address. They have a good reason for it: if a cluster is overloaded or breaks down, they can route requests to another cluster or datacenter. Within 5 minutes (the TTL of the IP addresses), all clients will request a new IP address for www.google.com and all traffic is rerouted.

Here is a test you can run yourself. This works on Windows 2000, but probably also on XP.
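The effect of a short TTL on clients can be sketched with a small cache simulation. Everything here is an assumption for illustration: the `TTLCache` class, the simulated clock, and the documentation-range addresses are hypothetical, and only the 300-second TTL is taken from the 5 minutes mentioned above.

```python
import time


class TTLCache:
    """Cache lookups and discard them once their TTL has elapsed."""

    def __init__(self, lookup, ttl_seconds=300, clock=time.monotonic):
        self._lookup = lookup          # function: name -> list of IPs
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # name -> (expires_at, ips)

    def get(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[0]:
            return entry[1]            # still fresh: no new query
        ips = self._lookup(name)       # expired or missing: ask again
        self._entries[name] = (now + self._ttl, ips)
        return ips


# Simulated clock and a hypothetical failover after the first lookup.
now = [0.0]
active = {"ips": ["192.0.2.10"]}
cache = TTLCache(lambda name: list(active["ips"]), clock=lambda: now[0])

first = cache.get("www.google.com")       # queried at t = 0
active["ips"] = ["198.51.100.20"]         # operator switches clusters
now[0] = 299.0
before_expiry = cache.get("www.google.com")
now[0] = 300.0
after_expiry = cache.get("www.google.com")
print(first, before_expiry, after_expiry)
```

Until the entry expires the client keeps using the old address. Once the TTL has passed, the next lookup picks up the new one, which is why a short TTL bounds how long rerouting takes.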

1. Start the command line program nslookup
2. Type the command set d2
3. Type www.google.com

The program will now query the Google nameservers for the canonical name and the IP addresses for www.google.com. Because debugging is switched on with the set d2 command, you will also see the TTL values for the canonical name and the IP addresses.
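A rough scripted counterpart to the nslookup test, as a sketch only: it uses Python's standard `socket.gethostbyname_ex`, which reports the canonical name, aliases, and IPv4 addresses. Unlike nslookup in debug mode, it does not expose TTL values. The `lookup` and `format_result` helpers are hypothetical names introduced for this example.

```python
import socket


def lookup(hostname, resolver=socket.gethostbyname_ex):
    """Return the canonical name, aliases, and IPv4 addresses for hostname."""
    canonical, aliases, addresses = resolver(hostname)
    return {"canonical": canonical, "aliases": aliases, "addresses": addresses}


def format_result(result):
    """Render a lookup result as one line per name or address."""
    lines = [f"Canonical name: {result['canonical']}"]
    lines += [f"Alias: {a}" for a in result["aliases"]]
    lines += [f"Address: {ip}" for ip in result["addresses"]]
    return "\n".join(lines)
```

On a networked machine, `print(format_result(lookup("www.google.com")))` would show whatever the local resolver returns at that moment. The `resolver` parameter exists only so the function can be exercised with a stand-in when no network is available.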

I really enjoyed reading that reply.

 

Comments:

atef

11/08/2006 01:34 pm

I don't know if this article is missing information, but it is definitely a BIG help and something USEFUL for me. Thank you for posting it. Atef

amit

11/27/2007 08:51 am

Thanks for the info, bro. Does the PC actually do two queries? I think it does only one query. The IP addresses come in that query's answer itself.

amit

11/27/2007 08:54 am

Thanks for the intel. How many queries are actually done? I believe it is only 1. All the IPs are returned in one query and answer.
