Who Is Better At Finding Duplicate Content?

Oct 4, 2007 • 12:38 pm | Filed Under Search Technology

So who likes finding duplicate content more, and who has the most to gain by finding it: Google or Copyscape? Forum members at Digitalpoint are having an interesting conversation about the differences in how Google and Copyscape find duplicate content. Most agree that the two use different algorithms, though some wonder how hard the problem can really be. Remarkably, the consensus is that each company has its own distinct way of finding duplicate content.

WebmasterWorld is also having some discussion on duplicate content. Namely, how do you deal with heavily copied content? Some of the members have excellent advice on dealing with duplicate content these days.

1. Don't Go After the Content Scrapers

I rarely go directly after the copiers these days - instead I focus on strengthening the website itself. It's harder for copied content to beat a strong website - but from time to time it happens.
2. Go After the Web Hosting Company Instead
First, file DMCA notices against all US-based web hosts or server companies involved. Personally, I skip informing the webmaster first, as he might not be US-based and it seldom works. Hosts and server companies normally take down whole sites or even servers (not just individual infringing pages), meaning potentially crippling losses for a webmaster who rents a dedicated server and runs multiple scrapers from it.
3. Strengthen Your Website
The older a site is, and the more pages and trust it gains (along with measures that help deter scraping, like using full URLs), the less likely it is that scraped copies will cause any harm.
4. Don't Abandon The Original Content
Second, if you do re-write, don't abandon your original content. There's obviously a market for it, so arrange for it to be used on other sites, by agreement and with appropriate links.
5. Insert Your Website or Company Name Into The Content
I generally try to include a link or two in my content to another content page on my site, since most people who copy do so by way of automation and pick up your links.
6. Consider That Eventually Someone Will Steal Your Content
I've personally given up on any rights to anything on the internet. I don't publish or put anything on the internet that I don't want to be re-distributed on a massive scale, edited, laughed at, cried about, and never quoted.
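
One way to apply the "full URLs" measure from tip 3, together with the in-content links from tip 5, is to rewrite relative links as absolute ones before publishing, so that an automated copy still points back to the original site. The following is a minimal Python sketch and not a method described by the forum members. The base URL, the `absolutize_links` function name, and the regex-based matching are all assumptions made for illustration; a real HTML parser would handle edge cases more robustly.

```python
import re
from urllib.parse import urljoin

# Hypothetical base URL of the page being published (an assumption).
BASE_URL = "https://www.example.com/articles/"

# Matches href="..." or src='...' attribute values.
# Simplification: assumes quoted attribute values and no spaces around "=".
ATTR_RE = re.compile(
    r"""(?P<attr>\b(?:href|src))=(?P<q>["'])(?P<url>.*?)(?P=q)""",
    re.IGNORECASE,
)


def absolutize_links(html: str, base_url: str = BASE_URL) -> str:
    """Return html with relative href/src values rewritten as full URLs."""

    def repl(match: re.Match) -> str:
        # urljoin leaves already-absolute URLs unchanged and resolves
        # relative ones against base_url.
        absolute = urljoin(base_url, match.group("url"))
        quote = match.group("q")
        return f"{match.group('attr')}={quote}{absolute}{quote}"

    return ATTR_RE.sub(repl, html)


if __name__ == "__main__":
    sample = '<p>See <a href="/about.html">about</a> and <a href="tips.html">tips</a>.</p>'
    print(absolutize_links(sample))
```

With links written this way, a scraper that copies the markup verbatim carries working links to the source site along with the text.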

Continued discussion at Digitalpoint and WebmasterWorld
