Most search spiders have been known to get a bit crawl-happy from time to time. But the most complaints over time are about MSNBot, which often gets out of hand and sends its spiders on individuals' sites...
With all the ongoing issues with MSNBot not behaving, I am not too surprised to see more complaints about the little spider. New confirmed reports from Bing Forums show that MSNBot is hiding itself under the user agent of Mozilla/4.0....
A Google Webmaster Help thread has an interesting discussion around blocking your site from coming up for both visitors and search engine crawlers on Shabbat (the Jewish Sabbath). This is not a new topic; we discussed using cloaking for religious...
Shawn Hogan, DigitalPoint's founder, has posted a thread at DigitalPoint Forums clearly showing his frustration with MSNBot, Microsoft Bing's search crawler. He is upset that the bot is crawling too much, too fast - causing an unnecessary spike in load...
There is a thread I have been watching at the Bing Community where one member said that he had log files showing that MSNBot (Microsoft Bing's crawler) is clicking on Microsoft adCenter search ads, possibly charging him for those clicks....
There are several reports around the web about a new search bot by Microsoft that is causing major issues for web servers. The bot is named adidxbot and the user agent looks like this: adidxbot/1.1 (+http://search.msn.com/msnbot.htm). This bot has been on...
Back in the day, tracking how bots accessed your site was a bit of a craze. Now, you don't hear about it much. The old Google Analytics, aka Urchin, had a section for displaying bot activity on your site. It...
A HighRankings Forum thread asks why some people use more than a single robots.txt file to control search spiders and instruct them on how to crawl and access their content. That is a good question. Typically, the spiders will only listen...
incrediBILL, moderator at WebmasterWorld, noticed that one of Live Search's bots was crawling through his JavaScript. The bot is named MSNBOT-MEDIA and he noticed that it was accessing JavaScript files and AJAX functions. He noticed that the bot was triggering...
There are threads at Google Groups and DigitalPoint Forums with multiple reports of Google not crawling Blogger-hosted blogs that are on custom or private domains (i.e. not on blogspot.com domains). Many have reported that the Googlebot crawling has stopped...
A WebmasterWorld thread reports that Ask.com's crawler seems to have slowed to a halt. Some webmasters are reporting zero crawling activity from Ask.com, while others are reporting extremely limited crawling activity. WebmasterWorld moderator jdMorgan noticed the slowdown too, he...
On the first day of this month, we reported that Google and Yahoo were to begin indexing Flash files. According to the pertinent Google Webmaster Central blog post, Google is able to crawl the contextual elements in these blog posts....
Yesterday, we reported that Google's John Mueller said that if you block a whole region from accessing your site, it would be considered cloaking and thus be against Google's Webmaster guidelines. Since then, we have seen many comments on that...
A Google Groups thread has a webmaster who has been receiving a lot of rogue spider attacks from the Africa region. He wants to go as far as banning the whole continent of Africa. But he is concerned that by...
An unusual question came up at WebmasterWorld, asking if you can request a site to be completely removed from a country-specific Google search engine. For example, the site owner wants to remove his site from Google Netherlands, because the...
Last night, I had a nice chat with Googler JohnMu. I joked around with John, asking if he has messed up yet in terms of Google communication with webmasters. He said not really - which I agree with. But he...
We should have seen this coming, based on the number of reports that Google was submitting GET forms. But often, it is hard to validate those types of reports, due to people spoofing Googlebot and similar tactics. In any event,...
A WebmasterWorld thread has an advertiser complaining that Google's AdWords spider appears to be lowercasing the destination URLs they have. The thing is, the lowercase URLs for this webmaster don't work with the site and they don't have the time...
A DigitalPoint Forums thread has dozens of reports that Yahoo Search's crawler, Yahoo Slurp, took some bad medicine recently. Many are reporting that they see the crawler spidering their sites like never before. Sometimes they have seen the spider...
Last December, we reported that MSNBot was failing a reverse DNS lookup. Well, guess what, folks - MSNBot is failing again on some IP addresses. An updated WebmasterWorld thread brought this to my attention and I verified it myself. Here...
Just a tidbit based on a Google Groups thread: using the Google Remove URLs feature will only remove the content from Google for 90 days. After 90 days, if you do not block the page from crawlers or tell crawlers...
A WebmasterWorld thread reports that new installations of the popular blogging software, WordPress, are by default blocking all search engines. The member said that when you go to the Privacy Options section in the administration panel, by default, it is set to...
A year ago, Microsoft promised to give webmasters a method of verifying MSNBot. Way too often, rogue spiders mask themselves as official spiders from Google, Yahoo, Live Search or Ask.com. The search engines have enabled methods to conduct reverse DNS...
Is there any mileage to the claim that it is possible to get your site spidered faster if you integrate the Google Custom Search Engine into your website? This is the question a new webmaster is asking on the High...
There is no doubt that a ton of bot activity on one's sites is from rogue spiders: spiders or bots that pretend to be legit bots but are there to steal your content. We have covered several sessions on this...
A WebmasterWorld member reports that his AdWords account has been terminated, after being in good standing for four years, due to "cookie spidering." He said he has spent about $100,000 per year for the past four years, and all of a...
A Search Engine Watch Forums thread asks how one can prevent scraping of his site's content by an unauthorized spider while not hurting his rankings in search engines. This is a serious issue, serious enough that there was a session...