Moz Builds That Spam Identification Tool In Site Explorer

Mar 31, 2015 • 8:38 am | Filed Under Search Engine Optimization Tools
 

Almost two years ago, Rand Fishkin from Moz posted about looking into building a spam detection tool as a way for sites to (1) figure out who not to get links from, (2) find bad links to remove and (3) see how Google may determine if a site is spammy and why.

The industry was torn, thinking this may be just Rand's way of doing automated "outing," but the truth is, a tool like this can be useful to SEOs who do link forensics (I used that word).

Rand announced the new paid tool on the Moz blog yesterday. Here is a quick video of how it works:

In short, it looks at 17 different factors, or flags. The more flags a site trips, the more spammy the tool scores it.
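To make the counting idea concrete, here is a minimal Python sketch. It assumes each flag has already been evaluated to true or false for a site; the flag names and the `count_flags` helper are hypothetical, and it does not try to reproduce how Moz turns a flag count into a spam likelihood.

```python
from typing import Dict


def count_flags(flag_results: Dict[str, bool]) -> int:
    """Return how many spam flags were triggered for a site."""
    return sum(1 for hit in flag_results.values() if hit)


# Hypothetical results for one site; the names are illustrative,
# not identifiers used by Moz.
example_site = {
    "low_link_diversity": True,
    "thin_content": False,
    "no_contact_info": True,
    "domain_contains_numerals": False,
}

print(count_flags(example_site))  # 2
```

A plain count treats every flag as equally important, which is the simplest reading of "more flags, more spammy."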

Here are the flags Moz uses:

  • Low mozTrust to mozRank ratio: Sites with low mozTrust compared to mozRank are likely to be spam.
  • Large site with few links: Large sites with many pages tend to also have many links and large sites without a corresponding large number of links are likely to be spam.
  • Site link diversity is low: If a large percentage of links to a site are from a few domains it is likely to be spam.
  • Ratio of followed to nofollowed subdomains/domains (two separate flags): Sites with a large number of followed links relative to nofollowed are likely to be spam.
  • Small proportion of branded links (anchor text): Organically occurring links tend to contain a disproportionate amount of branded keywords. If a site does not have a lot of branded anchor text, it's a signal the links are not organic.
  • Thin content: If a site has a relatively small ratio of content to navigation chrome it's likely to be spam.
  • Site mark-up is abnormally small: Non-spam sites tend to invest in rich user experiences with CSS, JavaScript and extensive mark-up. Accordingly, a large ratio of text to mark-up is a spam signal.
  • Large number of external links: A site with a large number of external links may look spammy.
  • Low number of internal links: Real sites tend to link heavily to themselves via internal navigation and a relative lack of internal links is a spam signal.
  • Anchor text-heavy page: Sites with a lot of anchor text are more likely to be spam than those with more content and fewer links.
  • External links in navigation: Spam sites may hide external links in the sidebar or footer.
  • No contact info: Real sites prominently display their social and other contact information.
  • Low number of pages found: A site with only one or a few pages is more likely to be spam than one with many pages.
  • TLD correlated with spam domains: Certain TLDs are more spammy than others (e.g. .pw).
  • Domain name length: A long subdomain name like "bycheapviagra.freeshipping.onlinepharmacy.com" may indicate keyword stuffing.
  • Domain name contains numerals: Domain names with numerals may be automatically generated and therefore spam.
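The last three flags can be read straight off a hostname, so they are easy to sketch in Python. This is only an illustration under stated assumptions: the `domain_flags` helper, the one-entry TLD set and the 30-character cutoff are all made up here, and Moz's actual thresholds are not given in the list above.

```python
from typing import Dict

# Assumption: only the TLD named in the list above; a real list would be longer.
SPAMMY_TLDS = {"pw"}
# Assumption: arbitrary cutoff chosen for illustration.
MAX_HOST_LENGTH = 30


def domain_flags(hostname: str) -> Dict[str, bool]:
    """Evaluate three hostname-based spam flags (hypothetical thresholds)."""
    host = hostname.lower().rstrip(".")
    tld = host.rsplit(".", 1)[-1]
    return {
        "spammy_tld": tld in SPAMMY_TLDS,
        "long_domain_name": len(host) > MAX_HOST_LENGTH,
        "contains_numerals": any(ch.isdigit() for ch in host),
    }


print(domain_flags("bycheapviagra.freeshipping.onlinepharmacy.com"))
print(domain_flags("win123.pw"))
```

With these cutoffs the first hostname trips only the length flag, while the second trips the TLD and numerals flags. The other flags in the list need link and page data, so they cannot be checked from the name alone.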

What do you think? I have yet to play directly with the tool.

Forum discussion at Twitter.

Image credit to BigStockPhoto for taekwondo fighter
