Meet The Crawlers

Dec 8, 2005 • 12:54 pm | Filed Under Search Engine Strategies 2005 Chicago
 

Ramez Naam from MSN Search
- New kid on the block
- Launched Feb. 1, 2005
- Web, Local, Toolbar, Desktop
- Windows Live
- Get external links or submit your URL
- Ensure pages are internally linked
- Link to your most important pages
- Use robots.txt
- Keep URLs human readable (reduce query parameters, beware of session IDs)
- Understand redirects (301s and 302s)
- Don't rely on JavaScript
- Unique, high-quality content
- Good organic links (descriptive text, links that a person would click on; the more natural a link, the better)
- Beware of using images for text, Flash, and any deceptive optimization techniques
- Windows Live at live.com: he shows how you can customize the page, search, add the search to your home page, and subscribe to search feeds
- Windows Live Local at local.live.com, with a new feature named "bird's eye view" that shows actual aerial views taken from planes; you can change the angle and so on. Very, very impressive.
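To make the redirect point concrete: a 301 tells a crawler the move is permanent, while a 302 says it is temporary. Below is a minimal Python sketch of choosing between the two. The `MOVED` table, its paths, and the `redirect_for` helper are hypothetical names made up for illustration, not anything shown in the session.

```python
from http import HTTPStatus

# Hypothetical table: old path -> (new path, is the move permanent?).
MOVED = {
    "/old-products.html": ("/products/", True),   # page moved for good
    "/sale": ("/holiday-sale/", False),           # seasonal, will change back
}

def redirect_for(path):
    """Return (status_code, location) for a moved path, or None if it has not moved."""
    target = MOVED.get(path)
    if target is None:
        return None
    location, permanent = target
    # 301 Moved Permanently for permanent moves, 302 Found for temporary ones.
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), location
```

A web server or framework would send the returned status code together with a `Location` header carrying the new path.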

Kaushal Kurapati from Ask Jeeves
- About Ask slide
- Follows the robots.txt rules
- Efficiency tips
- Freshness determines crawl rate
- Completeness (PDFs, HTML, Flash, MS Office, XML)
- Date stamp content
- Simplify site organization and navigation
- Watch out for infinite pages
- Have patience when it comes to getting indexed
- JavaScript is a challenge
- Dynamic pages can be an issue
- URLs within images can't be followed
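Since every engine on the panel says it follows robots.txt, it can help to check what your own rules actually block. Here is a rough Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleBot` name, and the example.com URLs are assumptions for illustration; blocking a `/calendar/` section is just one guess at how a site might keep crawlers out of "infinite pages".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (not from the session).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /calendar/
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)
```

For example, `is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/calendar/2005/12/")` reports whether a well-behaved crawler would skip that URL under these rules.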

Tim Mayer from Yahoo
- Mission statement slide
- Link new URLs from existing URLs in the index
- Make sure all URLs have an inbound link
- Get good authoritative links into a site to encourage deep crawls
- Don't make site depth too extreme (3-4 levels is recommended)
- Use the free addurl service if all else fails
- Unique content (page titles, meta tags, unique pages; multiple domains only when there are distinct businesses)
- Avoid excessive doorway pages, keyword stuffing, keyword repetition, hidden text/links, link farms, cloaking
- Yahoo has many crawlers (they are exclusive to each service)
- Site Explorer slide (covered in earlier posts)
- Local & Navigational Active Abstracts (he shows local vertical integration into SERPs, Quick Links, and so on)
- My Web product: social search, saving searches and sharing results with friends ("Save to My Web" buttons can be put on your pages)
- Yahoo Search Blog and Next.Yahoo.Com
- He mentions Answers.Yahoo.Com
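The depth and inbound-link advice can be checked against your own link graph. The sketch below is one way to do it in Python: a breadth-first search from the home page that counts clicks to each page, then lists pages deeper than a limit and pages that cannot be reached at all. The function names and the default limit of 4 (taken from the "3-4 levels" recommendation) are my assumptions; nothing here reflects how Yahoo's crawler actually works.

```python
from collections import deque

def click_depths(links, home):
    """Breadth-first search: fewest clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def audit(links, home, max_depth=4):
    """Return (pages deeper than max_depth, pages unreachable from home)."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    too_deep = sorted(p for p, d in depths.items() if d > max_depth)
    unreachable = sorted(all_pages - set(depths))
    return too_deep, unreachable
```

Here `links` is a dictionary mapping each page to the list of pages it links to, which you would build from your own site crawl or CMS export.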

Charles Martin from Google
- Freshness, comprehensiveness, different crawl rates
- Google Sitemaps: he shows it off for crawl and error checking
- Shows how to remove content from Google at webmasters/remove.html
- Webmaster Guidelines slides
- "What if my site is moving?" Use 301 redirects
- "Googlebot uses too much bandwidth": respond with 304 Not Modified when a page has not changed
- "I want Googlebot to stay away": use robots.txt
- Gmail, Personalized Search, new Froogle homepage, images on Google News, numrange, Google Local, Google Deskbar
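The 304 tip relies on conditional requests: the crawler sends an `If-Modified-Since` header with the date of its cached copy, and the server can answer 304 Not Modified with no body if the page has not changed since then. Below is a minimal Python sketch of that decision, assuming you know each page's last-modified time. The `conditional_status` helper is a hypothetical name, and real servers and frameworks usually handle this for you.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_status(last_modified, if_modified_since):
    """Return 304 if the page is unchanged since the crawler's copy, else 200.

    last_modified: timezone-aware datetime of the page's last change.
    if_modified_since: raw header value from the request, or None.
    """
    if not if_modified_since:
        return 200
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: send the full page
    if cached_at.tzinfo is None:
        cached_at = cached_at.replace(tzinfo=timezone.utc)  # treat as UTC
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= cached_at:
        return 304
    return 200
```

Answering 304 lets the crawler keep its cached copy, so the page body is not sent again.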

Q & A:

Q: On rogue spiders, what are they? A: Danny answers it; if you want to hear what he said, leave a comment and I'll add the info. It's basic.

Some really basic questions...

Q: Google Sandbox, how do I get out of it? A: Charles said that he learned about the "Jagger update" while listening to the podcast. Internally there is no Jagger update; there are just improvements to the site. He said that whatever the marketers said, "I am behind it." They said that there is no Google sandbox per se; it is more a series of filters that try to figure out whether a site is good or bad.

Q: Is it cloaking to strip out session IDs just for the bots? A: They all said it is "no problem". Google added, "in fact, please do that."
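For anyone wondering what stripping session IDs for bots might look like, here is a rough Python sketch. The parameter names in `SESSION_PARAMS` and the user-agent substrings in `CRAWLER_TOKENS` are assumptions for illustration; check your own framework's session parameter and the engines' published user-agent strings before relying on them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; use whatever your framework emits.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}
# Assumed user-agent substrings for the crawlers on the panel.
CRAWLER_TOKENS = ("googlebot", "msnbot", "slurp", "teoma")

def is_crawler(user_agent):
    """Rough check: does the user-agent contain a known crawler token?"""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)

def strip_session_id(url):
    """Remove session parameters from the query string, keeping the rest."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

def url_for_visitor(url, user_agent):
    """Give crawlers the clean URL; leave URLs for everyone else untouched."""
    return strip_session_id(url) if is_crawler(user_agent) else url
```

The page content stays the same for bots and people; only the session parameter in the URL differs, which is the case the panel said was fine.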

Q: Are linked images followed? A: MSN follows them; Google follows them and recommends adding reasonable alt text.
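If you want to find linked images on your pages that lack alt text, something like the following Python sketch could work. It uses the standard library's `html.parser`; the class and function names are hypothetical, and it only checks for the presence of alt text, not whether that text is "reasonable".

```python
from html.parser import HTMLParser

class LinkedImageAudit(HTMLParser):
    """Collect the src of every <img> inside an <a href> that lacks alt text."""

    def __init__(self):
        super().__init__()
        self.open_anchors = []  # one bool per open <a>: does it have an href?
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.open_anchors.append(bool(attrs.get("href")))
        elif tag == "img" and any(self.open_anchors):
            # Attribute values may be None (bare attribute) or whitespace only.
            if not (attrs.get("alt") or "").strip():
                self.missing_alt.append(attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "a" and self.open_anchors:
            self.open_anchors.pop()

def linked_images_missing_alt(html):
    """Return the src values of linked images that have no alt text."""
    audit = LinkedImageAudit()
    audit.feed(html)
    return audit.missing_alt
```

Images that are not inside a link are ignored, since the question was only about linked images.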

 
