Google Search Console Uses Bloom Filters For Faster Reporting

Sep 7, 2023 - 7:31 am


Gary Illyes from Google was asked why filtered data can be higher than the overall data within Google Search Console. In his answer, Gary explained how the filtering works: specifically, it uses a "Bloom filter."

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.
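To make the definition concrete, here is a minimal textbook sketch of a Bloom filter in Python. It is not Google's implementation, which has not been published; the class name, the method names, the bit-array size and the use of salted SHA-256 hashes are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Textbook Bloom filter: a bit array plus k hash functions (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting a SHA-256 hash with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit that the item hashes to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in the set"; True means "probably in the set".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical usage: remember which queries have been seen.
seen_queries = BloomFilter(num_bits=1024, num_hashes=3)
seen_queries.add("bloom filter")
print(seen_queries.might_contain("bloom filter"))  # True: added items are never missed
```

The filter stores only bits, never the items themselves, which is where the space saving comes from. The cost is that an item that was never added can occasionally be reported as "probably present" when its bits were already set by other items.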

Gary said Google uses Bloom filters because they are a fast and storage-efficient way to handle a very large amount of data.

Gary said at the 1:13 mark into the Google SEO office hours video, "The short answer is that we make heavy use of something called Bloom filters because we need to handle a lot of data and Bloom filters can save us lots of time and basically storage."

He added, "The long answer is still that we make heavy use of Bloom filters because, again, we need to handle a lot of data but I also want to say a few words about Bloom filters. When you handle a large number of items in a set, and I mean billions of items if not trillions, sometimes looking up things fast becomes super hard. This is where Bloom filters come in handy. They allow you to consult a different set that contains a hash of possible items in the main set, and you look up the data there in your smaller set since you are looking up hashes first."

"It’s pretty fast, but hashing sometimes comes with data loss, either purposefully or not. And this missing data is what you're experiencing. Less data to go through means more accurate predictions about whether something exists in the main set or not. Basically, Bloom filters to speed up lookups by predicting if something exists in a data set but at the expense of accuracy, and the smaller the data set is, the more accurate the predictions are," he added.
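The trade-off Gary describes, where smaller data sets give more accurate predictions, can be illustrated with the standard approximation for a Bloom filter's false-positive rate, (1 - e^(-kn/m))^k, where m is the number of bits, k the number of hash functions and n the number of items stored. The sketch below assumes a generic Bloom filter; the function name and the sizes used are hypothetical and say nothing about how Search Console is actually configured.

```python
import math


def false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation of a Bloom filter's false-positive probability."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Same filter size, growing number of stored items (illustrative numbers only).
for n in (1_000, 10_000, 100_000):
    rate = false_positive_rate(num_bits=1_000_000, num_hashes=4, num_items=n)
    print(f"{n:>7} items -> estimated false-positive rate {rate:.6f}")
```

With the filter size held fixed, the estimated error rate rises as more items are stored, which matches the point that less data to go through means more accurate answers.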


Oh, the jokes on the Google Bloom filter have begun on Mastodon.

Forum discussion at Twitter.

 
