The Invisible Spider: Covert Crawler

Jan 17, 2006 - 8:52 am

A thread at Cre8asite Forums titled "New kind of spider is in town" links to a Wired article titled "Covert Crawler Descends on Web." In short, the article describes a new kind of spider designed to crawl the Web in as human-like a way as possible.

How does it work?

The program comes from different internet addresses, simulates different browsers and throttles itself to human-like speeds... Hoffman's program downloads everything that comes with a page -- images, JavaScript and components like ActiveX and Flash -- instead of just hitting the page itself like traditional spiders do. It also simulates a full web browser, keeping a cache and requesting only new material... To select which links to click on, Hoffman has settled on a solution somewhere between a masterful AI and completely random selection. "In some ways it's a very simplified Turing test -- you can assign the different threads a personality. This crawler, you're the slow reader, you read the entire page." Another thread may spend less time on a page before it starts clicking on different links. "Each individual crawler has its own browser habits," he added.
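To make the "personality" idea a little more concrete, here is a minimal, offline sketch of how per-thread browsing habits might be modeled. It is not Hoffman's code: the `Personality` class, the `plan_visit` function, and every number in it are assumptions made up for illustration, since the Wired article gives no implementation details. It only simulates the decisions (how long to "read", which links to follow) and makes no network requests.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Personality:
    """Hypothetical browsing habits for one crawler thread."""
    name: str
    min_dwell_s: float   # shortest time spent "reading" a page
    max_dwell_s: float   # longest time spent "reading" a page
    follow_prob: float   # chance of following any given link on the page


# Illustrative values only; the article gives no numbers.
SLOW_READER = Personality("slow reader", 45.0, 180.0, 0.2)
SKIMMER = Personality("skimmer", 3.0, 15.0, 0.6)


def plan_visit(personality, links, rng):
    """Return (dwell_seconds, links_to_follow) for one simulated page view."""
    # Pick a reading time somewhere inside this personality's range.
    dwell = rng.uniform(personality.min_dwell_s, personality.max_dwell_s)
    # Follow each link independently with the personality's probability.
    chosen = [link for link in links if rng.random() < personality.follow_prob]
    return dwell, chosen


if __name__ == "__main__":
    rng = random.Random(2006)  # seeded so repeated runs match
    page_links = ["/about", "/products", "/contact", "/news"]
    for p in (SLOW_READER, SKIMMER):
        dwell, chosen = plan_visit(p, page_links, rng)
        print(f"{p.name}: reads for {dwell:.0f}s, then follows {chosen}")
```

The point of the sketch is simply that two threads given different parameters produce visibly different traffic patterns, which is what makes this kind of spider hard to tell apart from people in a server log.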

Barry Welford calls this spider "somewhat scary," and I agree. Ron Carnell has it right: "any robot that doesn't ask for and then follow robots.txt is, by definition, unethical." Ron also gives you a technique you can use to track and then block this type of bot.
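Ron's exact technique is in the forum thread and is not reproduced here. As a rough illustration of the general idea, one common approach is a robots.txt "trap": disallow a directory that no human-visible link points to, then flag any client that requests it anyway. The sketch below assumes a hypothetical `/trap/` path and a simplified log of `(ip, path)` pairs; the `find_violators` name is made up for this example. It uses Python's standard `urllib.robotparser` to decide which requests break the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /trap/ is off limits to every robot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def find_violators(requests, robots_txt=ROBOTS_TXT, agent="*"):
    """Return the set of IPs that requested a path robots.txt disallows.

    `requests` is an iterable of (ip, path) pairs, a simplified stand-in
    for parsed access-log lines.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violators = set()
    for ip, path in requests:
        # can_fetch() is False when the path is disallowed for this agent.
        if not parser.can_fetch(agent, path):
            violators.add(ip)
    return violators


if __name__ == "__main__":
    # Addresses come from the reserved documentation ranges.
    log = [
        ("192.0.2.10", "/index.html"),
        ("198.51.100.7", "/trap/secret.html"),
        ("192.0.2.10", "/about.html"),
    ]
    for ip in sorted(find_violators(log)):
        print("candidate for blocking:", ip)
```

Keep in mind that the crawler described above arrives from many different addresses, so blocking by IP is at best a partial defense; the trap is more useful for tracking than for stopping it outright.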

Forum discussion at Cre8asite Forums.

 
