Research Paper On Bing Determining Web Page Credibility

Mar 22, 2011 • 9:10 am | Filed Under Bing SEO

A WebmasterWorld thread cites a new research paper published by Microsoft Research named Augmenting Web Pages & Search Results For Improved Credibility (PDF).

In short, how does, or could, Bing determine the credibility of a web page based on its content and the signals surrounding it? And if Bing can determine the credibility of a web page, how could it use that for rankings in Bing search results?

Here is the abstract:

The presence (and, sometimes, prominence) of incorrect and misleading content on the Web can have serious consequences for people who increasingly rely on the internet as their information source for topics such as health, politics, and financial advice. In this paper, we identify and collect several page features (such as popularity among specialized user groups) that are currently difficult or impossible for end-users to assess, yet provide valuable signals regarding credibility. We then present visualizations designed to augment search results and Web pages with the most promising of these features. Our lab evaluation finds that our augmented search results are particularly effective at increasing the accuracy of users' credibility assessments, highlighting the potential of data aggregation and simple interventions to help people make more informed decisions as they search for information online.

I was going to go through and pull out the metrics Microsoft covered in the paper, but I do not have to do that. Bill Slawski already did that at his blog, SEO By The Sea. He summarized:

Here are some of the "credibility" signals that they looked at:

On-Page Features

  • Spelling Errors
  • Number of Advertisements on a page
  • Domain Type (.com, .gov, etc.)
Off-Page Features
  • Awards and Certifications, such as the Webby Award, Alexa Rank, Health on the Net (HON) awards.
  • Toolbar PageRank, and Rankings for Queries used in generating their data set
  • Sharing information from URL-shortening sites; Likes, shares, comments, and clicks from Facebook; clicks on shortened URLs on Twitter; bookmarks on Delicious.
User Aggregated Non-Public Data from Toolbar Usage
  • General Popularity – unique visitors from users
  • Geographic Reach – number of visitors from different geographic regions
  • Dwell Time – amount of time users kept a URL open in their browser (as an estimate of how long they might have viewed a page)
  • Revisitation patterns – how often people revisited a page, on average
  • Expert Popularity – the behavior of people who have been shown to have expertise in a particular field, and user data about their visits to pages in that field.

Very nicely done, and Bill goes into even more depth at his blog.
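
For anyone who wants to play with these signals in their own site audits, here is a rough sketch of how the list above could be grouped into a single record. To be clear, this is only an illustration: the class name, the field names, and the `top_level_domain` helper are my own assumptions, and nothing here comes from the paper or reflects how Bing actually stores or uses these signals.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CredibilitySignals:
    """Hypothetical container for the signals listed above (all names are assumptions)."""
    # On-page features
    spelling_errors: int
    ad_count: int
    domain_type: str                      # e.g. "com" or "gov"
    # Off-page features
    awards: List[str] = field(default_factory=list)
    toolbar_pagerank: Optional[int] = None
    share_count: int = 0
    # Aggregated toolbar usage data
    unique_visitors: int = 0
    regions_reached: int = 0
    avg_dwell_seconds: float = 0.0
    avg_revisits: float = 0.0
    expert_visitors: int = 0


def top_level_domain(hostname: str) -> str:
    """Return the last label of a hostname, e.g. 'gov' for 'www.example.gov'."""
    return hostname.rstrip(".").rsplit(".", 1)[-1].lower()


# Made-up example values, purely to show the shape of the record.
page = CredibilitySignals(
    spelling_errors=2,
    ad_count=5,
    domain_type=top_level_domain("www.example.gov"),
)
```

The record deliberately stops at collecting the signals. How they might be weighted or combined is not something the summary above spells out, so I have left that part out.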

Forum discussion at WebmasterWorld.
