Google Search Console Uses Bloom Filters For Faster Reporting

Sep 7, 2023 - 7:31 am


Gary Illyes from Google was asked why filtered data can be higher than the overall data in Google Search Console. In response, Gary explained how the filter works: specifically, it uses a "Bloom filter."

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.
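
To make that definition concrete, here is a minimal Python sketch of a Bloom filter. It is only an illustration of the general data structure, not Google's implementation: the class name `BloomFilter`, the method `might_contain`, the default sizes, and the choice of salted SHA-256 for the hash functions are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array plus several hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive several positions by salting a single hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely not added"; True only means "probably added".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for query in ["bloom filter", "search console", "office hours"]:
    seen.add(query)

print(seen.might_contain("search console"))  # True: added items always match
print(seen.might_contain("core update"))     # usually False; a false positive is possible
```

The filter never stores the items themselves, only the bits their hashes point to. That is where the space savings come from, and it is also why a lookup can occasionally answer "probably yes" for something that was never added.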

Gary said Google uses the filter because it is a fast and efficient way to handle a huge amount of data while saving both time and storage.

Gary said at the 1:13 mark of the Google SEO office hours video, "The short answer is that we make heavy use of something called Bloom filters because we need to handle a lot of data and Bloom filters can save us lots of time and basically storage."

He added, "The long answer is still that we make heavy use of Bloom filters because, again, we need to handle a lot of data but I also want to say a few words about Bloom filters. When you handle a large number of items in a set, and I mean billions of items if not trillions, sometimes looking up things fast becomes super hard. This is where Bloom filters come in handy. They allow you to consult a different set that contains a hash of possible items in the main set, and you look up the data there in your smaller set since you are looking up hashes first."

"It’s pretty fast, but hashing sometimes comes with data loss, either purposefully or not. And this missing data is what you're experiencing. Less data to go through means more accurate predictions about whether something exists in the main set or not. Basically, Bloom filters speed up lookups by predicting if something exists in a data set but at the expense of accuracy, and the smaller the data set is, the more accurate the predictions are," he added.

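Gary's point that a smaller data set gives more accurate predictions lines up with the standard textbook approximation for a Bloom filter's false-positive rate, (1 - e^(-k*n/m))^k, where n is the number of items, m the number of bits, and k the number of hash functions. The Python sketch below plugs a few numbers into that formula. The function name and the filter sizes are hypothetical values picked for illustration and say nothing about how Search Console is actually configured.

```python
import math


def false_positive_rate(num_items: int, num_bits: int, num_hashes: int) -> float:
    """Standard approximation: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


NUM_BITS = 1_000_000  # hypothetical filter size in bits
NUM_HASHES = 3        # hypothetical number of hash functions

# Same filter, growing data set: the estimated error rate climbs with each step.
for num_items in (10_000, 100_000, 1_000_000):
    rate = false_positive_rate(num_items, NUM_BITS, NUM_HASHES)
    print(f"{num_items:>9} items -> estimated false-positive rate {rate:.4%}")
```

With the filter size held fixed, every added item sets more bits, so an unrelated lookup is more likely to hit bits that are already set. Fewer items in the same filter means fewer such collisions.
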

Oh, and the jokes about the Google Bloom filter have already begun on Mastodon.

Forum discussion at Twitter.

 
