Coffee Talk with Senior Google Engineer: Matt Cutts

Nov 16, 2005 • 12:56 pm | comments (13) | Filed Under WebmasterWorld 2005 Las Vegas
 

Brett introduces the day and this session.

This session is where we pound Matt Cutts with questions, or not... we will see.

Brett explained that in '99 or so, he posted a thread telling the search engines to come talk to us (the webmasters). Matt Cutts came into the forums and posted, and it shocked the forum. It has completely changed the industry. Matt is known for writing the adult content filter at Google. Then Brett called Matt up. He was drinking a Red Bull at 9 am.

Q: How do you like working for Google? A: It is a lot of fun, it is still a lot of fun.

Q: What is your employee number? A: Within the first 100.

Q: How does Google feel about SEOs, SEMs, Webmasters? A: At times there is an element of conflict. In Matt's mind, it's best to work with Webmasters. He thinks of SEO and spam as two different things. Spam is outside of their guidelines and they don't like that. Anyone who is whitehat, or tweaking keywords, or making a site's navigation more crawlable is fine. SEO is not spam; it is only spam when you go against the guidelines. There was a large online publisher that wasn't doing well in Google. Its robots.txt file said that no search engines could crawl the site; that is why. They changed it. Changing your robots.txt file is not spam.

Q: Can we get a tag that bans all search engines except for Google? There are so many exceptions that can be put in. A: The wonderful thing about SEO is that you can test so many things. He thinks that if you put in a Disallow for all user agents (*) and then add an Allow for Googlebot, Googlebot may (he thinks) still crawl, since it may look for the more specific rule. Google allows wildcards as well.
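
The precedence question above is easy to try out locally. The sketch below is only an illustration, not a statement of how Googlebot itself behaves: it feeds a hypothetical robots.txt (disallow everything for `*`, allow everything for `Googlebot`) to Python's standard-library `urllib.robotparser` and asks which crawlers may fetch a page. The `example.com` URL, the `OtherBot` name, and the `can_crawl` helper are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block every crawler, then carve out Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if Python's parser would let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/page.html"  # made-up URL
    for agent in ("Googlebot", "OtherBot"):
        print(agent, can_crawl(agent, page))
```

This particular parser consults the group that names a specific crawler before it falls back to the `*` group, so the named crawler is let in and the unnamed one is kept out. Other crawlers may resolve the same file differently, which is why Matt's advice to test it applies.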

Q: 301/302 redirect issues, sandbox, supplemental results... Where are we with all that? A: We are better off today, we are making progress. We brought 20 engineers to New Orleans and we got your feedback. Same at SES Google Dance. We are working towards a framework where we are indexing the destination. He compares it to the Yahoo slides (I'll try to bring them up). They are testing this at a datacenter, not sure which IP it's at.

Q: Is that what is going on with Jagger? A: No, that is something else.

Q: Does the sandbox exist? A: Matt said, here comes the audience part: How many feel there is a sandbox? How many feel there is no such thing as a sandbox? SEOs normally split down the line. There are some things in the algorithm that may be perceived as a sandbox, and they don't apply to all industries. He knows it works to keep some spam out.

Q: DMOZ; are you guys going to take it over? A: Matt doesn't want to predict the future and he is just an engineer. If he had to predict, he would think no.

Q: Duplicate content, stolen content. What can we do to protect ourselves? A: We watch what people are saying about this. They have projects under way to determine who first wrote a given text. It's not 100% done, but it's on the radar.

Q: Blogs... It's the internet version of the vast wasteland. Is Google doing anything specific to clean up this index? A: There is a lot of stuff we are looking at. Splogs are bad. The Web spam team has been working with Blogger, and has made lots of progress with that. The volume of spam has decreased.

Q: Do you guys ever do hand tweaks of the results? A: For the most part, we let the algorithm do all the work. However, Google News uses editor trust. PageRank uses hyperlinks made by humans. Google does not have the ability to hand boost any site, or hand boost any PageRank. They can manually penalize sites if they are spam. Sites are penalized for legal reasons and spam reasons (also viruses). They try not to differentiate large sites versus small sites; they remove both. Our goal is to return the most relevant results.

Q: Microsoft introduced Smart Tags and there was a loud outcry. Google came out with AutoLink, which is essentially the same. A: He brings up an example of how it is useful. They did not want to do Smart Tags, but the public did not perceive it that way. So it backfired, in a sense. He gave examples of how to make it better.

Q: What is a day like for you at the plex? Has it changed? A: A typical day is that he goes in thinking he will work on something and always works on something else. Either there is a fire or something new comes out and he needs to look into it. Since August of last year, he still goes in and works with top notch people. He still works with nice people, but the perception has changed from the outside. People think Google is going to be the next Microsoft. It's almost like they want Google to become less personal. So what can they do? They give more products, i.e. Google Analytics.

Q: When are you going to let Larry and Sergey out of their box? A: They are still working hard. He will pass it on.

Q: Google is in the process of building the largest data storage out there. Where do you see all this going? A: Matt wouldn't work at a company that he feels would use the data to abuse users or their trust. If you talk to the chief data officer at Yahoo, they collect 10 terabytes of data every day. Google knows a lot less about the specific user than Yahoo or MSN. Google does its very best to protect user privacy. He states the broad mission statement. If you want to take relevancy to the next level you need to know more about the user, not at the specific user level but on a more general level. They want to return the most relevant results, period. The nice thing is, if you have people sign in, you can give more personalized results (i.e. remove result).

Q: New features (Gmail, Maps, etc.) didn't all work with all alternative browsers. Has there been a change of Google policy on that? A: Matt doesn't know. Matt uses ancient versions of Netscape, which helps him spot more spam and CSS. You want to support every platform as much as you can.

Q: Google is launching Google Base, what is it all about? A: It's a searchable data store. You can specify fields in this data source and search them. You should be able to upload any data you want to make it searchable (like recipes and so on). You can upload via RSS, CSV, etc.

Q: There is an embargo being released soon, can you spill the beans? A: He said come to the Smack Down session, it's something for the Webmaster. He said he wants to make it easier for SEOs and harder for spammers.

Audience Questions:

Q: Aging delay? Is there one? A: It's like the sandbox question. Just because a patent application is released, it doesn't mean they are using it.

Q: CSS positioning? How does it affect ranking? A: Good question, I don't know. If you're doing an include, it probably won't matter either way. In his mind, positioning text at top or bottom is overrated. But try it.

Q: Do you use the toolbar to figure out what to crawl and how often? A: Nope. It's all pretty much based on PageRank.

UPDATE: Does the toolbar change the priority of something to be crawled? No -- I messed up on this Q & A

Q: Can you talk about Google Analytics and costs with AdWords not using it? A: Matt is trying it out on his blog. It used to be Urchin software. They made it free. It's free until you get 5 million page views per month, then you need to sign up with AdWords but you do not have to spend money with AdWords. He is not sure if there are issues outside of the US.

Q: Google Analytics, can you confirm that Google will be using that data in the search engine? A: He can't confirm, but he can deny it. :) Matt, as a Web spam team member, does not have access to this data. He won't even ask for it. If it becomes a concern, he will post it on his blog. People will always be concerned, so don't use it.

Q: Do you guys feel affiliate sites with good content are spam? A: He said that they think of spam in terms of what the value add of the site is. He explained how some sites make unique tools that are a value add. Just slapping up content from a feed doesn't do it. Reviews, etc. need it.

Q: How do you think going public changed Google? And how has the quadrupling of the stock changed Matt's net worth? A: He said it has quadrupled his net worth. :) There are people who had fun and who have left the company. But not many. Now, whenever he finds a book he likes, he buys it at Amazon, he doesn't think about it. His day to day life hasn't changed much. As for Google as a whole, he doesn't think it hurt the company.

Q: Let's go back to text links. A: Best links are earned, not sold or traded. You may not get what you pay for. He said, if someone is selling text links, they should give you a free test trial to make sure it works. They have both manual and algorithmic approaches to detect paid links. He said Google.com gets emails asking to trade links. The guy who came up with the pixel homepage thing, that was creative.

 

Comments:

Erik

11/16/2005 07:25 pm

Thanks Barry - great summary for those of us who couldn't make it.

Ben Pfeiffer

11/16/2005 08:38 pm

Man I wish I was at this session. Great coverage

Angel Serrano

11/17/2005 12:53 am

How do you know the type of text links ? A free link , a paid link or an interchange link seem very similar ... We have got that get paid links are punished by google then Google adwords/adsense have to be penalized by google too ... A paid text link with relation with your topic affect search engines results but a free relationed link too. How do you difference the type of the link ? : position (home page, subpage, sitewide, article , directory, etc ) or the text aroun ( paid links, interesting links, etc) ... this stuff can be manipulated ... google then purchase people to evaluate search engine results and they will discover for example a site with five text links ...How hell they know what type of link is it ?... I can go to a blog and I can see a link in all pages then it is a paid link ... I think that it is not correct ... I can go to a home page and view for example 8 links ... they are paid or interchange or gratuitous links ... i dont know : a text link is a text link ... People who buy press release and then they get a text links ... this people get penalized ... People who pay to be listen in download between 80 or 300$ and then they get a link ... they will be penalized too ... People who write and published articles for a link in return ... this people will be penalizaed too .. There are people who pay to talk in forums and then they put a link in the firms ... this links will be penalized too ... What solution offer google to promote a site? : write content ... but if you begin with your site then all the filters attack to you (sand box, age filter, etc )... nobody will see your content and nobody will link to you ... Regards Angel S.P.

Angel Serrano

11/17/2005 09:48 pm

It is possible that the time factor will be useful to detect an interchange of links, but it can be manipulated too ... I can put a link now in a site and in two months I put a text link in the other to close the link interchange ...

Aaron Pratt

11/19/2005 08:51 pm

I am no expert but logically there should be nothing wrong with paying for a link to get a site not yet found in google some good holiday traffic yes? If I was a search engine guru I would make it a relationship thing. If the "paid link" is not related to the website it is pointing to it will not get a good strong pr result, BUT will still get the traffic which new sites so desperately need. Oh well, that is my perfect world, I am going to pay for my first link soon (wish me luck), it's a scary pr8 but as you guys know most sites with high pr have great traffic. If I was a search engine guru I also would require a manual check for high pr/low pr increases, hope Google doesn't destroy me if this is how it works and they are not cool with paying for traffic. There is no reason why I shouldn't try to get some visits...grrr.

Shawn Hogan

12/13/2005 01:50 am

Regarding Google Analytics: "Matt as a Web spam team member, does not have access to this data. He wont even ask for it. If it becomes a concern, he will post it on his blog. People will always be concerned, so don't use it." Maybe the web search team doesn't have access to the data directly, but they certainly would be able to cross-reference all your sites setup to use it because you put your Google Analytics account info on all pages of all sites using it. For example "UA-xxxxx-y", where xxxxx is your unique account number and y is the site number. Personally I don't care, but for those that have something to hide, tagging all your pages gives any spider the ability to cross reference you to all the sites you own (or at least track under a single Google Analytics account). So they really don't NEED access to the underlying data (at least for purposes of cross-referencing).

pauline

01/29/2006 03:19 pm

why won't google let me into gmail i have signed up again thinking that would solve my problem now i have to gmails and still can't get into them Pauline.vannes@bigpond.com

No Name

03/31/2008 05:11 pm

if you are using nofollow tag for your outgoing links, i dont think google will have problem with that. if you are passing PR,then google may ban you. Adsense ads are nofollow, they just pass traffic not PR. We can not break the rule created by google, because google governs the internet

Dinesh Choudhry

10/26/2008 07:49 pm

Still, the sandbox controversy is at its peak. Nobody except Google knows what exactly it is and why websites are placed in the Sandbox.

No Name

03/31/2009 07:32 pm

It's not a SandBox, it's just another parameter (Maturity Period) in the pagerank algorithm

Michael

05/25/2009 07:10 am

Hi, I just bumped into this. It is amazing how good stuff can be invisible for so long. Anyway it is the first time that I have read about what goes on at Google. Getting inside the head of Google is fascinating and very important for SEO. A lot of the time we are simply guessing. Thanks for the article.

Richard

08/04/2009 10:12 pm

Very cool interview, but very, very bad written!

Ric

08/21/2010 04:26 pm

My new site was in the sandbox for 3 weeks. But it already got on the top position after some friendly SEO I have done.
