Most search spiders have been known to get a bit crawl-happy from time to time. But over the years, the most complaints have been about MSNBot, which tends to get out of hand and send its spiders to individual sites...
With all the ongoing issues with MSNBot not behaving, I am not too surprised to see more complaints about the little spider. New confirmed reports from the Bing Forums show that MSNBot is hiding itself under the user agent Mozilla/4.0....
A Google Webmaster Help thread has an interesting discussion around blocking your site from coming up for both visitors and search engine crawlers on Shabbat (the Jewish Saturday). This is not a new topic; we discussed using cloaking for religious...
Shawn Hogan, DigitalPoint's founder, has posted a thread at DigitalPoint Forums clearly showing his frustration with MSNBot, Microsoft Bing's search crawler. He is upset that the bot is crawling too much, too fast - causing an unnecessary spike in load...
There is a thread I have been watching at the Bing Community where one member said that he had log files that show MSNBot (Microsoft Bing's crawler) is clicking on Microsoft adCenter search ads, possibly charging him for those clicks....
There are several reports around the web about a new search bot by Microsoft that is causing major issues for web servers. The bot is named adidxbot and the user agent looks like this: adidxbot/1.1 (+http://search.msn.com/msnbot.htm). This bot has been on...
Back in the day, tracking how bots accessed your site was a bit of a craze. Now, you don't hear about it much. The old Google Analytics, aka Urchin, had a section for displaying bot activity on your site. It...
A HighRankings Forum thread asks why some people use more than one robots.txt file to instruct search spiders on how to crawl and access their content. That is a good question. Typically, the spiders will only listen...
incrediBILL, moderator at WebmasterWorld, noticed that one of Live Search's bots was crawling through his JavaScript. The bot is named MSNBOT-MEDIA and he noticed that it was accessing JavaScript files and AJAX functions. He noticed that the bot was triggering...
A WebmasterWorld thread reports that Ask.com's crawler seems to have slowed to a halt. Some webmasters are reporting zero crawling activity from Ask.com, while others are reporting extremely limited crawling activity. WebmasterWorld moderator jdMorgan noticed the slowdown too; he...
Last night, I had a nice chat with Googler JohnMu. I joked around with John, asking if he has messed up yet in terms of Google's communication with webmasters. He said not really - which I agree with. But he...
We should have seen this coming, based on the number of reports that Google was submitting GET forms. But often, it is hard to validate those types of reports, due to people spoofing Googlebot and similar tactics. In any event,...
A DigitalPoint Forums thread has dozens of reports that Yahoo Search's crawler, Yahoo Slurp, took some bad medicine recently. Many are reporting that they see the crawler spidering their sites like never before. Sometimes they have seen the spider...
Just a tidbit based on a Google Groups thread: using the Google Remove URLs feature will only remove the content from Google for 90 days. After 90 days, if you do not block the page from crawlers or tell crawlers...
In August, Yahoo announced a new crawl behavior for Slurp, Yahoo's web crawler. The new crawl behavior was supposed to tame the crawler to go through your site in a more relaxed and efficient manner for both the crawler and...
A detailed Google Groups thread collects various reports from webmasters claiming GoogleBot is timing out before reaching their pages. First, these webmasters are noticing a drop in GoogleBot activity on their server. So they log in to Google Webmaster Tools...
A Cre8asite Forums thread asks how to generate a unique robots.txt file for each domain when all of those sites share the same local files through a form of IIS mirroring. There are several ways to do...
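One possible approach to the shared-files question above is to generate robots.txt dynamically from the request's Host header rather than serving a static file. The following is only a minimal sketch in Python using a plain WSGI callable; it is not the method discussed in the thread and is not IIS-specific. The host names, the rules, and the names `ROBOTS_BY_HOST`, `robots_for_host`, and `app` are all hypothetical.

```python
# Hypothetical per-domain rules; every host name and rule here is an example only.
ROBOTS_BY_HOST = {
    "www.example.com": "User-agent: *\nDisallow: /private/\n",
    "mirror.example.net": "User-agent: *\nDisallow: /\n",
}

# Fallback for hosts not listed above: allow everything.
DEFAULT_ROBOTS = "User-agent: *\nDisallow:\n"


def robots_for_host(host_header):
    """Pick the robots.txt body for the requested host (any port is ignored)."""
    host = host_header.split(":", 1)[0].strip().lower()
    return ROBOTS_BY_HOST.get(host, DEFAULT_ROBOTS)


def app(environ, start_response):
    """Minimal WSGI app: serves /robots.txt per host and 404s everything else."""
    if environ.get("PATH_INFO") == "/robots.txt":
        body = robots_for_host(environ.get("HTTP_HOST", "")).encode("utf-8")
        status = "200 OK"
    else:
        body = b"Not found\n"
        status = "404 Not Found"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

Because the lookup is keyed on the host name, every mirrored site can share the same code and files while still answering with its own rules.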
Last week we reported on a Yahoo update and a new method of crawling. The new crawl behavior is supposed to help the Yahoo bot, Slurp, be more efficient on your site. It seems that many SEOs and Webmasters are...
The Yahoo! Search Blog has announced that webmasters will now see Yahoo's spider, Yahoo! Slurp, returning a new domain name in their logs. The same IP addresses now resolve to the domain name crawl.yahoo.net and no longer return the domain...
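Since the crawler's IP addresses map to a known domain name, a webmaster can check whether a visitor claiming to be the crawler really is one by doing a reverse DNS lookup followed by a confirming forward lookup. The following is a minimal sketch of that general technique in Python, not something taken from the announcement. The function names are hypothetical, and the lookups are passed in as parameters so the logic can be exercised without network access.

```python
import socket


def hostname_in_domain(hostname, domain):
    """True if hostname equals domain or is a subdomain of it."""
    hostname = hostname.rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    return hostname == domain or hostname.endswith("." + domain)


def is_verified_crawler(ip, domain="crawl.yahoo.net",
                        reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then confirm with a forward lookup."""
    try:
        # Step 1: IP -> host name (first element of the returned tuple).
        hostname = reverse_lookup(ip)[0]
        if not hostname_in_domain(hostname, domain):
            return False
        # Step 2: host name -> IP list (third element); must contain the original IP.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror: treat lookup failure as unverified.
        return False
    return ip in addresses
```

The forward lookup matters because anyone who controls reverse DNS for their own IP range can make an address claim any host name; only a match in both directions ties the IP to the domain.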
Threads at WebmasterWorld and Search Engine Watch Forums are both reporting issues with Yahoo! Slurp (Yahoo!'s crawler) indexing pages it should not be, and in quantities that may be harmful. It appears that only specific bots are not obeying the...
A Cre8asite Forums thread links to a blog post named GoogleBot Requested a CSS File. This is not the first time I have seen threads where people suspect GoogleBot is crawling their CSS files. But this one has the most discussion...
A WebmasterWorld thread asks "How does Google determine which pages to crawl?" Google didn't always crawl and index pages as it does now. With the Big Daddy update, around April 2006, Google adapted its crawl priorities. Google now...
An Adam Lasnik post in Google Groups sparked a post at Cre8asite Forums explaining that if you have bad HTML, Google will be OK with it. Yes, that is the case; your code does not need to be 100% validated...
The Google cache typically only stored about 100KB of your page. So if you had a heavy page with lots of content, not all of that page would be seen in the Google cache. That seems to have changed at...
We all know about PPC fraud and that some of the fraud is caused by bots (robots) that click on the ads, driving up your bill and sending unwanted traffic. But it gets more serious than that. Bots are also...
A Search Engine Watch Forums thread asks how one can prevent unauthorized spiders from scraping a site's content without hurting its rankings in search engines. This is a serious issue, serious enough that there was a session...
How cute, seriously, MSN has finally given names to their baby crawlers. You know, Google names their crawlers, e.g. GoogleBot, MediaBot, etc. Yahoo has Slurp, etc. Now MSN has named their crawlers. The MSN Shopping bot is msnbot-products. The MSN...