Peeking Into Google Reveals More of Google's Architecture

Mar 13, 2006 - 7:55 am

An InternetNews.com article from March 2nd has Urs Hoelzle, Google's vice president of operations and vice president of engineering, giving a "behind-the-scenes tour of Google's architecture."

Bill Slawski at Cre8asite Forums started a thread about this article, titled "Google's architecture, Informative news story," where he pulled out a couple of quotes.

Google replicates the Web pages it caches by splitting them up into pieces it calls "shards." The shards are small enough that several can fit on one machine. And they're replicated on several machines, so that if one breaks, another can serve up the information. The master index is also split up among several servers, and that set also is replicated several times. The engineers call these "chunk servers."
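To make the shard-and-replica idea in that quote concrete, here is a minimal Python sketch. It is not Google's implementation: the hashing scheme, the round-robin placement, and the names shard_for, replicas_for, and pick_live_replica are all assumptions made up for illustration.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard number with a stable hash (assumed scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def replicas_for(shard: int, machines: list, copies: int) -> list:
    """Place `copies` replicas of a shard on consecutive machines (assumed policy).

    With more shards than machines, several shards end up on each machine.
    """
    return [machines[(shard + i) % len(machines)] for i in range(copies)]


def pick_live_replica(shard: int, machines: list, copies: int, down: set):
    """Return the first replica holder that is not down, or None if all are down."""
    for machine in replicas_for(shard, machines, copies):
        if machine not in down:
            return machine
    return None


if __name__ == "__main__":
    machines = ["m0", "m1", "m2", "m3"]  # hypothetical machine names
    shard = shard_for("http://example.com/page", num_shards=16)
    print(shard, replicas_for(shard, machines, copies=3))
    # If the first replica holder breaks, another one serves the shard.
    first = replicas_for(shard, machines, copies=3)[0]
    print(pick_live_replica(shard, machines, copies=3, down={first}))
```

The point of the sketch is only the failover behavior: because each shard lives on several machines, losing one machine leaves every shard it held reachable somewhere else.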

The company also is applying machine learning to its system to give better results. Theoretically, he said, if someone searches for "Bay Area cooking class," the system should know that "Berkeley courses: vegetarian cuisine" is a good match even though it contains none of the query words.

To do this, the system tries to cluster concepts into "reasonably coherent" subclusters that seem related. These clusters, some tiny and some huge, are named automatically. Then, when a query comes in, the system produces a probability score for the various clusters. This kind of machine learning has had little success in academic trials, Hoelzle said, because they didn't have enough data. "If you have enough data, you get reasonably good answers out of it."
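The article does not say how the probability scores are computed, so the following Python sketch swaps in a simple naive-Bayes-style word-count model purely as an illustration. The two clusters, their word counts, and the function names are invented for the example; a real system would learn its clusters from very large amounts of data.

```python
import math
from collections import Counter

# Hypothetical, hand-written clusters with made-up word counts.
CLUSTERS = {
    "bay_area_cooking": Counter({
        "bay": 3, "area": 3, "berkeley": 2, "cooking": 4,
        "cuisine": 3, "class": 3, "courses": 2, "vegetarian": 2,
    }),
    "car_repair": Counter({
        "car": 4, "repair": 4, "engine": 3, "brakes": 2, "class": 1,
    }),
}


def cluster_probabilities(text: str, clusters: dict = CLUSTERS) -> dict:
    """Return a probability score per cluster for `text`.

    Uses add-one smoothed word likelihoods, normalized across clusters.
    """
    words = text.lower().replace(":", " ").split()
    vocab = set().union(*clusters.values())
    log_scores = {}
    for name, counts in clusters.items():
        total = sum(counts.values())
        log_scores[name] = sum(
            math.log((counts[w] + 1) / (total + len(vocab))) for w in words
        )
    # Normalize in a numerically stable way so the scores sum to 1.
    peak = max(log_scores.values())
    weights = {name: math.exp(s - peak) for name, s in log_scores.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


def top_cluster(text: str) -> str:
    """Return the name of the highest-scoring cluster for `text`."""
    probs = cluster_probabilities(text)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    query = "Bay Area cooking class"
    page = "Berkeley courses: vegetarian cuisine"
    print(top_cluster(query), top_cluster(page))
```

In this toy setup the query and the page share no words, yet both score highest on the same cluster, which is the kind of match Hoelzle describes.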

The article is definitely worth a read; afterwards, join the forum discussion at Cre8asite Forums.

 
