Google: Machine Learning Takes Care Of Most Obvious Spam

Jul 6, 2021 • 7:51 am | Filed Under Google Search Engine Optimization

On the latest Search Off The Record podcast, John Mueller, Gary Illyes and Martin Splitt hosted a special guest, Duy Nguyen from the Google Search Quality team. Nguyen said that Google has built a "very effective and comprehensive machine-learning model that basically took care of most of the obvious spam."

He said that this machine learning model deals with most of the spam, which lets his team spend more of its time to "focus on more important work." He added that the spam machine learning model "basically took over all the heavy lifting" in tackling the most obvious spam.

At about the 3:45 mark, here is what Duy Nguyen said:

So for such low quality or spammy content, it's relatively easy. If you're a person and you look at a page that's full of gibberish, or in this case, guest books with spammy posts, you should be able to say that emphatically, "Yes, this is spam," within seconds. Even if it's more complicated, with a trained eye, it should take less than a minute to determine something is spammy or not. And as Google, we have all these signals and all this data that we've accumulated and analyzed and studied over the years. So, you know, it's entirely possible to collect those datas to study it and build things like machine-learning models to tackle spam.

Machine learning model is interesting because it has so many use cases. It recommends music for you, you trust it enough to drive cars around so you don't have to drive. So building machine-learning models for spams turns out to be a pretty natural step for us.

So, yeah, we have so many data around, not just search result, but specially spam. So we were able to build a very effective and comprehensive machine-learning model that basically took care of most of the obvious spam. It basically took over all the heavy lifting so we can focus on more important work.

I wonder how the two spam updates from last week are related to this. Were they updates to the machine learning model, or something new?

More Spam Topics

Later on in the podcast, Duy Nguyen said that hacked spam is still a problem for the ecosystem, and that many sites run outdated platforms and are easy targets. The hacked spam they see today does not involve much real hacking; it mostly exploits easy loopholes. If you are worried about hacked spam, you can sign up for Google Search Console, and Google will notify you when it is detected.

Duy Nguyen said one of the things that keeps him up at night is online scams, like Gmail customer support spam. Google is working hard on this, but education for consumers is important.

They then talked about how you shouldn't copy spammers just because they may be ranking well; the spam is likely not the reason. Plus, Duy Nguyen said that he hates to see webmasters focusing on external metrics (he didn't mention it, but I suspect DA) and working to improve those instead of spending the time thinking about the overall site. Don't focus on one thing or one factor, and do not focus on external signals.

