Microsoft Bing Patent On Web Site/Content Reliability Scores

Nov 7, 2023 - 7:31 am
Filed Under Bing Search

Microsoft has published a new patent application named Web Content Reliability Classification (US 20230350956 A1). The application appears to describe how to compute a reliability score for a website, or for a portion of a website's content, for use in Bing Search.

Of course, you need to keep in mind that just because a search company has a patent, it does not mean that patent is in use today or ever.

I am no patent writer like the late, great Bill Slawski, so I won't pretend to be one. But here is the abstract:

Technology described herein assigns a reliability score to web content, such as a web site or portion of a website. In one aspect, an output of the technology is a high reliability score and a low reliability score for a web content. The high reliability score represents conformance to high reliability sites, while the low reliability score represents conformance to low reliability sites. The high reliability score may be generated by first identifying high reliability online content within a compressed web graph. In a first iteration, the high reliability score of the seeds is used to score online content that is linked to the seed sites. At a high level, the more links that originate from high reliability sources, the higher the reliability score for the linked content. The low reliability score is similar, but uses outgoing links to low reliability sites instead of incoming links from high reliability sites.

Glenn Gabe spotted this and posted it on X:

What can this reliability score do? "The reliability score can be used to block content, rank content, provide a content warning, and select a source to answer a question, along with other uses."
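To make those uses concrete, here is a minimal sketch of how a search system might act on a pair of reliability scores. Everything in it is an assumption for illustration: the `handle`, `rank`, and `pick_answer_source` helpers, the 0.8 and 0.5 thresholds, and the "high minus low" ordering do not come from the patent application, which only lists the uses.

```python
# Hypothetical policy layer; names and thresholds are illustrative only.

def handle(result, block_at=0.8, warn_at=0.5):
    """Return "block", "warn", or "show" based on the low reliability score."""
    if result["low"] >= block_at:
        return "block"
    if result["low"] >= warn_at:
        return "warn"
    return "show"


def rank(results):
    """Drop blocked results, then order the rest by high minus low score."""
    visible = [r for r in results if handle(r) != "block"]
    return sorted(visible, key=lambda r: r["high"] - r["low"], reverse=True)


def pick_answer_source(results):
    """Choose the top-ranked result as the source for a direct answer."""
    ranked = rank(results)
    return ranked[0] if ranked else None


results = [
    {"url": "a.example", "high": 0.9, "low": 0.1},
    {"url": "b.example", "high": 0.4, "low": 0.6},
    {"url": "c.example", "high": 0.1, "low": 0.9},
]
for r in results:
    print(r["url"], handle(r))
print("answer source:", pick_answer_source(results)["url"])
```

A real system would likely blend these scores with relevance and other signals rather than use them alone.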

How does it determine if something is reliable? Here are some quotes:

  • "Traffic data can indicate whether a source is popular, but popular is not the same thing as reliable."
  • "Natural language processing can be used to determine whether online content is grammatical, but grammatical is also not the same thing as reliable."
  • "The present technology identifies reliable content by leveraging expert scoring for a small amount of web content by iteratively extending these scores to other content based on how web content is linked."
  • "User interactions may also be leveraged."

This patent also talks about "seed sites" used to help determine what is reliable. "The high reliability score is generated by first identifying high reliability online content within a web graph. These initially scored sites may be described as seed sites. Ratings for the seed sites may be taken from authoritative lists of known reliable content providers," it says.
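The seed-and-propagate idea from the abstract can be sketched in a few lines of Python. This is only a rough illustration, not the method in the application: the `propagate` and `reverse` helpers, the `damping` factor, the saturating formula, the fixed iteration count, and the example domains are all assumptions, and the "compressed web graph" the abstract mentions is not modeled. The one property it tries to capture is that more links from high reliability sources yield a higher score, and that the low score follows outgoing links instead.

```python
from collections import defaultdict


def propagate(links, seeds, iterations=3, damping=0.5):
    """Spread seed scores along `links` (source -> list of targets).

    A target's score grows with each scored source linking to it.
    Seed scores stay fixed. The formula is an illustrative assumption.
    """
    incoming = defaultdict(list)
    nodes = set(seeds)
    for source, targets in links.items():
        nodes.add(source)
        for target in targets:
            nodes.add(target)
            incoming[target].append(source)

    scores = {node: seeds.get(node, 0.0) for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            if node in seeds:
                updated[node] = seeds[node]
                continue
            miss = 1.0  # shrinks with every scored source that links in
            for source in incoming[node]:
                miss *= 1.0 - damping * scores[source]
            updated[node] = 1.0 - miss
        scores = updated
    return scores


def reverse(links):
    """Flip every edge so scores flow from link targets back to linkers."""
    flipped = defaultdict(list)
    for source, targets in links.items():
        for target in targets:
            flipped[target].append(source)
    return dict(flipped)


# Made-up example graph: each key links out to the sites in its list.
links = {
    "trusted-news.example": ["blog-a.example", "blog-b.example"],
    "trusted-ref.example": ["blog-a.example"],
    "blog-b.example": ["junk.example"],
    "blog-c.example": ["junk.example"],
}
high_seeds = {"trusted-news.example": 1.0, "trusted-ref.example": 1.0}
low_seeds = {"junk.example": 1.0}

# High score: driven by incoming links from high reliability seeds.
high = propagate(links, high_seeds)
# Low score: driven by outgoing links to low reliability seeds.
low = propagate(reverse(links), low_seeds)

for site in sorted(high):
    print(f"{site}: high={high[site]:.2f} low={low[site]:.2f}")
```

In this toy graph, blog-a.example is linked from both seed sites and ends up with a higher high score than blog-b.example, which is linked from only one. blog-b.example also picks up a low score because it links out to the low reliability seed.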

That is just a taste of this patent; I hope you enjoy reading through it.

Forum discussion at X.
