Scrape Bots Vs. Search Bots :: Fighting the Battle

Sep 12, 2006 - 7:06 am
Filed Under Cloaking

A Search Engine Watch Forums thread asks how one can prevent unauthorized spiders from scraping a site's content without hurting the site's rankings in search engines.

This is a serious issue, serious enough that SES San Jose 2006 had a session on it named The Bot Obedience Course. In that session, Bill Atchison from CrawlWall.com gave an excellent presentation.

Robert Charlton notes in the thread that Bill will be releasing a software tool that helps do just that. He said there is a "Beta version coming soon." The crawlwall.com/technology.html page has details of the technology developed by CrawlWall.com.

CrawlWall uses the following technology to secure your website and protect your content. All of the various methods are designed to work together in harmony to make sure that all of the spiders with permission and legitimate visitors get into your website without issue and all of the rogue crawlers get stopped and never gain admission.

The tactics include dynamic robots.txt files, whitelist opt-in permissions, "second pass filters," IP and/or address banning, proxy blocking, creating certain obstacles, and a quarantine list for IPs whose status is uncertain.
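
To make a few of these tactics concrete, here is a rough Python sketch of how a whitelist, an IP ban list, a quarantine list, and a dynamic robots.txt might fit together. This is not CrawlWall's code, which has not been published. The class name `BotPolicy`, its method names, the example addresses, and the "browser-like user agent" rule are all assumptions made up for illustration. User agent strings are easy to fake, so a real tool would need further checks beyond this.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network

# Two robots.txt bodies; which one is served depends on who is asking.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


@dataclass
class BotPolicy:
    """Hypothetical policy object, not CrawlWall's actual design."""
    allowed_agents: set                                   # lowercase tokens of opted-in crawlers
    banned_networks: list = field(default_factory=list)  # ip_network objects
    quarantined: set = field(default_factory=set)        # IPs awaiting review

    def is_whitelisted(self, user_agent: str) -> bool:
        ua = user_agent.lower()
        return any(token in ua for token in self.allowed_agents)

    def classify(self, ip: str, user_agent: str) -> str:
        addr = ip_address(ip)
        # IP banning wins over everything else.
        if any(addr in net for net in self.banned_networks):
            return "block"
        # Whitelist opt-in: only crawlers we named get in as crawlers.
        if self.is_whitelisted(user_agent):
            return "allow"
        # Assumption: treat browser-like agents as human visitors.
        if user_agent.lower().startswith("mozilla/"):
            return "allow"
        # Anything else is uncertain, so park the IP for a later look.
        self.quarantined.add(ip)
        return "quarantine"

    def robots_txt(self, user_agent: str) -> str:
        # Dynamic robots.txt: whitelisted crawlers see an open site,
        # every other requester is told to stay out.
        return ALLOW_ALL if self.is_whitelisted(user_agent) else DENY_ALL


policy = BotPolicy(
    allowed_agents={"googlebot"},
    banned_networks=[ip_network("203.0.113.0/24")],  # documentation-only range
)
print(policy.classify("198.51.100.9", "curl/8.0"))
```

In this sketch the ban list is checked first, so a banned address stays blocked even if it claims to be a whitelisted crawler. Unknown, non-browser agents land in quarantine instead of being blocked outright, which leaves room for a later review step.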

I am looking forward to seeing how it works in the real world.

Forum discussion at Search Engine Watch Forums.

 
