Google's Matt Cutts: That Scraper Isn't Hurting Your Mom's Site

Oct 23, 2013 • 8:40 am | comments (27) | Filed Under Google Search Engine Optimization
 

Google's head of search spam, Matt Cutts, responded to a Hacker News thread in which Dan Goldin was upset with spammer SEOs after he noticed a site scraping the content and images from his mother's site.

In short, Matt Cutts responded that the scraper site is not benefiting from the stolen content. Matt said the site is not getting any traffic from Google, from what he can tell.

Matt wrote:

Hey Dan, it's annoying when someone scrapes your website, and it's even more annoying that this site tried to claim your Mom's images with Getty. But as far as I can tell, the site that you mentioned isn't having any success at all. For all the work that this spammer put into copying your Mom's site, it doesn't look like they're getting even a single digit number of visitors from Google. So they may be annoying, but their attempts at spamming didn't do them any good at all.

Feel free to do a DMCA request, but I'm already passing this to my team as a spam report and we'll dig into it. I don't know how Getty deals with scrapers though, so you'll need to look into that on your side.

I made a video with advice about dealing with scraping a few years ago that might be useful for you: https://www.youtube.com/watch?v=5CosWAVLCZg

It is interesting to see Matt say, "it doesn't look like they're getting even a single digit number of visitors from Google." How does he know the site's analytics? The site is indexed by Google, yet Matt knows it isn't getting any traffic from Google. I guess he can tell by looking at the click-through data that one would find in Webmaster Tools?

Either way, Matt's video on scrapers is the one linked in his comment above.


Forum discussion at Hacker News.

 

Comments:

Michael Merritt

10/23/2013 01:55 pm

The problem is that there are plenty of anecdotes from webmasters of scrapers successfully ranking above the original content. Maybe it won't happen with this guy's mom's site, but perhaps it will with another.

Paid Liar Cutts

10/23/2013 02:30 pm

Matt Cutts gets paid to lie and to defend Google. If people believe that siteB can copy siteA and crash their ranking then it's a problem, so Matt jumps in to lie. Scrapers absolutely hurt you, even if they don't get much traffic. Barry, give Matt my IP.

Jenny Halasz

10/23/2013 02:36 pm

I find it interesting that he says "the site may link back to you and then that won't hurt you." Yet I have seen plenty of instances of scrapers hurting a site by linking back to it. If A=B and B=C, then A=C. If a spammy domain linking to you can hurt you, and if a site is scraping your content and linking back to you, and spamming that domain to try and rank better than you, then you have thousands of links from a "bad" domain. So how does that not hurt you? Google makes no sense sometimes.

Donna Fontenot

10/23/2013 02:38 pm

He fails to mention in the video that those links in scraped articles will fall into the category of Penguin-ized low quality links, and that the dup content might now be a Panda factor against the original site. He made the video "a few years" ago...before Penguin and Panda started making scraped content a bigger problem than before.

Michael Martinez

10/23/2013 02:56 pm

Scrapers are a problem but the links don't always hurt. If Google has enough data available to analyze they can see what's going on. Popular sites get scraped all the time but they don't lose their search visibility. I would say it's the smaller sites that are more vulnerable to the negative side effects of scraping.

JustConsumer

10/23/2013 03:27 pm

"Yet I have seen plenty of instances of scrapers hurting a site by linking back to it." I'm curious: how did you conclude that the website was hurt by scrapers linking back to it, when hundreds of factors could influence the website? How can you be sure this particular factor influenced the website when you don't even know the whole list of factors Google uses to rank? Re "thousands of links from a "bad" domain": it can't hurt a website in good condition. That has been proved already.

JustConsumer

10/23/2013 03:49 pm

Exactly. These are anecdotes. I can even tell you how they are born. The other day, plumber John read an article about how constructor Bob earns side income running his blog. John wants side income too. After reading some more articles (mostly from SEO gurus), he starts his own blog on a popular blog host or a popular CMS. He writes 10 articles (helpful articles, by the way) about plumbing. AdSense is placed. It's time to check the income. Hm ... nothing. Three months pass, but only pennies are earned. It's time to search for where the problem is. And this is where the anecdote starts. The problems appear to be everywhere: scrapers, negative SEO, greedy Google, the government not acting to protect plumbers on the Internet, etc. Anything except the only real problem: John is a PLUMBER. The same anecdotes arise when WEB DEVELOPERS do the plumber's job.

Fedor

10/23/2013 04:13 pm

We'll never be able to get rid of scrapers. Most of them are off-shore and DMCA won't do a thing. We just have to put up with it and hope search engines devalue the scraper sites so they don't rank well.

Michael Merritt

10/23/2013 04:34 pm

Huh? I'm talking about people who have content scraped from their existing site and then the scraper ranks above them.

JustConsumer

10/23/2013 05:03 pm

And I'm saying that the scraper ranks higher not because of the scraped content, but because the content owner has no clue how to run a business. I have been a publisher for almost ten years and have never had problems with scrapers. But I surely would have problems if I did a plumber's job )

JustConsumer

10/23/2013 05:08 pm

Why hope, when two easy steps can eliminate this problem forever? 1. Terminate the session when the number of requests per period of time exceeds the limit; 2. Delay the RSS feed (if one is used). That's it )
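The two steps in the comment above could look roughly like the sketch below. It is only an illustration under assumptions that are not in the thread: the class name `SlidingWindowLimiter`, the helper `delayed_feed_items`, the per-client key, and the limits are all hypothetical, and a real site would still have to decide what counts as a "session" and how long to delay the feed.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client key."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)  # client key -> timestamps of recent requests

    def allow(self, client_key):
        """Record a request and return False once the client exceeds the limit."""
        now = self.clock()
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # the caller would terminate the session here
        hits.append(now)
        return True


def delayed_feed_items(items, delay_seconds, now):
    """Return only feed items published at least `delay_seconds` before `now`.

    `items` is a list of (published_timestamp, title) tuples (a hypothetical shape).
    """
    return [item for item in items if now - item[0] >= delay_seconds]
```

A request handler would call `limiter.allow(client_ip)` before serving a page and end the session when it returns False. As the replies in this thread point out, a patient scraper can simply slow down, so this raises the cost of scraping rather than preventing it.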

Jenny Halasz

10/23/2013 07:20 pm

Fair point, but you're assuming the website was in good condition. I work all the time with sites that have been penalized for links (it's become sort of a specialty of mine) and I see this often. While it's true it's not possible to say 100% that a network of bad links from scrapers is what's hurting you, we've seen positive results from either getting those scraper sites taken down or disavowing them if our takedown efforts fail.

Fedor

10/23/2013 07:21 pm

That will accomplish absolutely nothing. They'll just keep going and going until they have everything. They'll even set up a custom timer just for you. These are people who have no morals and all the time in the world.

JustConsumer

10/23/2013 07:52 pm

Then analyze your logs and deny access to your site by certain parameters. Sure, they can use fake IPs, headers, etc., but make their life hard. Make it pricey for them to deal with you. Once it took me almost two weeks of manual work fighting the bad guys. It was even fun, like a brain competition ) I can't say that I won, but I certainly made it too expensive for them to deal with me, and they left. The point is that there are ways to deal with such situations, but one has to have certain knowledge. If one considers online activity a business, then the appropriate knowledge is a must. If it is not a business, then scrapers are best friends, because they help share the information one wants to share.
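A first pass at the log analysis mentioned above might be as simple as counting requests per client address. The sketch below assumes the server writes Common Log Format lines. The function name `heavy_clients` and the threshold are hypothetical, and the comment does not say which parameters were actually used.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line: client address, identity,
# user, timestamp, request line, and status code.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
)


def heavy_clients(log_lines, threshold):
    """Return {ip: request_count} for clients with more than `threshold` requests."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:  # skip lines that are not in the assumed format
            counts[match.group("ip")] += 1
    return {ip: n for ip, n in counts.items() if n > threshold}
```

The addresses it returns are candidates for a deny rule, not proof of scraping. Scrapers that rotate through proxies will spread their requests across many addresses, so other fields such as the user agent or the request pattern would need the same kind of counting.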

DarthVader

10/23/2013 07:55 pm

Google itself is just one big scraper site. What is Matt talking about? How does Google get content snippets? The big scraper says to the little unknown scraper "Thou Shalt Not Scrape and get thy own satellite from scraper profits!"

JustConsumer

10/23/2013 08:16 pm

If your clients are happy with your service and can afford it, then it's good. Everyone wastes their own money as they wish. I'm just against the spread of myths. If you agree that "it's not possible to say 100% that a network of bad links from scrapers is what's hurting you", then obviously it would be correct to say "Yet I have seen plenty of instances of scrapers POSSIBLY hurting a site by linking back to it." Just so as not to mislead others ) Too many myths and speculations in this industry )

CaptainKevin

10/23/2013 09:08 pm

Google's appspot proxies are used to scrape and they do outrank the original content. Scraping may not work unless the scraped content resides on a Google owned property like an appspot proxy or blogspot. Pure theft IMO.

Boycottgoogle

10/24/2013 10:16 am

What if for whatever reason your website goes down for a bit.... you go on holiday not realizing because everyone needs a holiday for a couple weeks.. come back oh no.. will the scrapers now take your place as the originals? That is the question.

Oscar Bannister

10/24/2013 10:18 am

They might be able to stop scrapers, but they can't stop sites with more authority re-writing popular content and outranking the smaller sites who initially had the idea.

Rob jH

10/24/2013 11:56 am

Is Authorship not meant to be the solution for this?

Jenny Halasz

10/24/2013 02:31 pm

Correlation vs. causation. I see a strong correlation that I'm ready to say is causation. You may not agree. I resent the implication that my clients are wasting their money with me and don't wish to discuss things with you any further if you can't be nice. Taking my ball and going home.

JustConsumer

10/24/2013 03:16 pm

I'm nice ) I nicely see SEO as an evil ) .... Can I still play your ball ? )

Fedor

10/24/2013 04:52 pm

Most of them have proxies built into the scrapers, so it doesn't cost them anything. Dealing with massive logs to find patterns is not practical, except maybe for a site that only gets 5k/day. Either way, whoever you were dealing with was an amateur at best. There is absolutely nothing you can do to prevent someone from scraping your site if they really want to.

Jenny Halasz

10/24/2013 06:24 pm

haha. I guess we can agree to disagree.

JustConsumer

10/24/2013 07:52 pm

"There is absolutely nothing you can do" That's for sure, anything can be scraped or hacked, but this is a different story ... or, to be more precise, a different level )

Graciousstore

10/25/2013 04:42 am

The truth is that people copy other people's content and simply rewrite it so that it doesn't look exactly like the content it was copied from.

Ratan Bhutnai

10/25/2013 10:22 am

Don't know what Google is doing; my traffic is down 85%. I haven't got any mail from Google. I am very sure my content is authentic. Earlier I was leading the market; now I don't have any presence in Google.
