Google Search Console Uses Bloom Filters For Faster Reporting

Sep 7, 2023 - 7:31 am


Gary Illyes from Google was asked why filtered data can be higher than the overall data within Google Search Console. In response, Gary explained how the filtering works: specifically, it uses a "Bloom filter."

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.
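
To make that definition concrete, here is a minimal sketch of a Bloom filter in Python. It is only an illustration of the general data structure, not how Google Search Console implements it; the class name, the salted SHA-256 hashing, and the bit-array size are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: a bit array plus k hash functions."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit for readability; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive k positions by salting SHA-256 with the hash index
        # (an illustrative choice, not a requirement of Bloom filters).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(num_bits=1024, num_hashes=3)
for url in ["/home", "/pricing", "/blog/bloom-filters"]:
    bf.add(url)

print(bf.might_contain("/pricing"))  # added items are always reported as present
print(bf.might_contain("/missing"))  # usually False, but a false positive is possible
```

The key property is the asymmetry: an item that was added is never reported as missing, but an item that was never added can occasionally be reported as present.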

Gary said Bloom filters are used because they are a fast and efficient way to look things up in a very large amount of stored data.

Gary said at the 1:13 mark into the Google SEO office hours video, "The short answer is that we make heavy use of something called Bloom filters because we need to handle a lot of data and Bloom filters can save us lots of time and basically storage."

He added, "The long answer is still that we make heavy use of Bloom filters because, again, we need to handle a lot of data but I also want to say a few words about Bloom filters. When you handle a large number of items in a set, and I mean billions of items if not trillions, sometimes looking up things fast becomes super hard. This is where Bloom filters come in handy. They allow you to consult a different set that contains a hash of possible items in the main set, and you look up the data there in your smaller set since you are looking up hashes first."

"It’s pretty fast, but hashing sometimes comes with data loss, either purposefully or not. And this missing data is what you're experiencing. Less data to go through means more accurate predictions about whether something exists in the main set or not. Basically, Bloom filters speed up lookups by predicting if something exists in a data set but at the expense of accuracy, and the smaller the data set is, the more accurate the predictions are," he added.

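Gary's point that a smaller data set gives more accurate predictions lines up with the standard textbook approximation for a Bloom filter's false-positive rate, (1 - e^(-kn/m))^k, where m is the number of bits, k the number of hash functions, and n the number of items stored. The sketch below just evaluates that formula; the sizes used are hypothetical, since Google has not shared the parameters it uses.

```python
import math


def false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Hypothetical sizes: same filter, progressively more items stored in it.
for n in (1_000, 10_000, 100_000):
    rate = false_positive_rate(num_bits=1_000_000, num_hashes=7, num_items=n)
    print(f"{n:>7} items -> estimated false-positive rate {rate:.6f}")
```

With the filter size held fixed, the estimated error rate climbs as more items are stored, which is the trade-off between speed, storage, and accuracy that Gary describes.
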

Oh, the jokes on the Google Bloom filter have begun:

Bloom Filter Google Jokes Mastodon

Forum discussion at Twitter.

 
