What is Behind the Speed of Google's Search Engine?

Nov 2, 2007 • 10:37 am | Filed Under Google Search Engine
 

When a search engine sifts through billions of web pages to find relevant results, it seems somewhat miraculous, to say the least. A WebmasterWorld thread asks, "how does Google do it?"

There are a few parts to the answer. The first is the Google File System, which Google describes as "a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients."

Additionally, one member suspects that Google's entire index is stored in RAM (!!!), though that sounds pretty hard to believe, and another member disputes the claim.

Other factors include datacenters throughout the world, a clean interface without too many graphics, and of course, the algorithm. In case you're not familiar with this element of computer science, a well-designed and well-implemented algorithm can locate a single entry among millions in just a handful of disk accesses.
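To give a feel for why a good lookup algorithm needs so few disk accesses, here is a minimal sketch in Python. It is not Google's algorithm; it assumes a generic tree-shaped index (in the spirit of a B-tree) where each disk read narrows the search by a fixed factor. The names index_levels, linear_scan_reads, and the fanout of 100 are made up for illustration.

```python
def index_levels(num_entries: int, fanout: int) -> int:
    """Estimate disk reads for one lookup in a tree-shaped index.

    Assumption: each index block read from disk points to `fanout`
    children, so every read narrows the search by that factor.
    """
    if num_entries < 1 or fanout < 2:
        raise ValueError("need at least 1 entry and a fanout of 2 or more")
    levels = 1
    capacity = fanout  # entries reachable with the current number of levels
    while capacity < num_entries:
        capacity *= fanout
        levels += 1
    return levels


def linear_scan_reads(num_entries: int, entries_per_block: int) -> int:
    """Worst-case disk reads when every block has to be scanned in turn."""
    return -(-num_entries // entries_per_block)  # ceiling division


if __name__ == "__main__":
    for n in (1_000_000, 1_000_000_000):
        print(n, "entries:",
              index_levels(n, 100), "indexed reads vs.",
              linear_scan_reads(n, 100), "reads for a full scan")
```

Under these assumptions, growing the data from a million to a billion entries adds only two extra reads per lookup, while a full scan grows a thousandfold.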

What's more, Google uses column-oriented databases (rather than the row-oriented databases most webmasters use), and according to another member, Google has eliminated middle-tier hosts and has its datacenters plugged directly into the Internet like powerful ISPs. These datacenters are strategically placed close to densely populated areas.
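For readers who haven't run into the row versus column distinction, here is a small sketch in Python. It says nothing about how Google's own storage is built; the sample records, field names, and the to_columns helper are invented purely to show the two layouts side by side.

```python
# Row-oriented layout: each record is stored together, one after another.
rows = [
    {"url": "a.example", "language": "en", "size_kb": 12},
    {"url": "b.example", "language": "de", "size_kb": 30},
    {"url": "c.example", "language": "en", "size_kb": 8},
]


def to_columns(records):
    """Regroup row-oriented records so each field is stored contiguously."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: all values of one field sit next to each other.
columns = to_columns(rows)

# Summing one field over a row layout has to walk through every full record...
total_from_rows = sum(row["size_kb"] for row in rows)

# ...while the column layout only needs to read the one column involved.
total_from_columns = sum(columns["size_kb"])

if __name__ == "__main__":
    print(columns["language"])
    print(total_from_rows, total_from_columns)
```

Both totals come out the same; the difference is how much unrelated data has to be read along the way, which matters when a query touches only a few fields across a very large number of records.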

It's certainly cool how all these factors are contributing to the Google experience.

Forum discussion continues at WebmasterWorld.
