Google Awarded Patent for Duplicate Content "Similarity Estimation"

Jan 3, 2007 - 7:57 am

Another duplicate content thread at WebmasterWorld comes at the topic from a different perspective. Google has been awarded a patent, from an application submitted in 2001, titled "Methods and apparatus for estimating similarity." The patent is a relatively short read compared to others. Here is the abstract:

A similarity engine generates compact representations of objects called sketches. Sketches of different objects can be compared to determine the similarity between the two objects. The sketch for an object may be generated by creating a vector corresponding to the object, where each coordinate of the vector is associated with a corresponding weight. The weight associated with each coordinate in the vector is multiplied by a predetermined hashing vector to generate a product vector, and the product vectors are summed. The similarity engine may then generate a compact representation of the object based on the summed product vector.
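To make the abstract more concrete, here is a minimal Python sketch of the steps it describes. It is an illustration under stated assumptions, not the patent's actual implementation: the function names (`hashing_vector`, `sketch`, `similarity`), the 64-bit sketch size, the use of word counts as weights, the SHA-256-derived +1/-1 hashing vectors, and the final step of keeping only the sign of each summed coordinate are all choices made for this example. The abstract itself does not specify them.

```python
import hashlib
from collections import Counter

BITS = 64  # sketch length; an assumption, the abstract gives no size


def hashing_vector(feature: str, bits: int = BITS) -> list[int]:
    """Derive a fixed +1/-1 vector for a feature from a hash of its name."""
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    value = int.from_bytes(digest[: bits // 8], "big")
    return [1 if (value >> i) & 1 else -1 for i in range(bits)]


def sketch(text: str, bits: int = BITS) -> int:
    """Build a compact sketch of a text as a `bits`-bit integer."""
    # The object's vector: one coordinate per word, weighted by its count
    # (an assumed weighting; any per-coordinate weight would work).
    weights = Counter(text.lower().split())

    # Multiply each weight by its hashing vector and sum the product vectors.
    summed = [0] * bits
    for feature, weight in weights.items():
        for i, h in enumerate(hashing_vector(feature, bits)):
            summed[i] += weight * h

    # Compact representation: keep one bit per coordinate (assumed step).
    result = 0
    for i, total in enumerate(summed):
        if total > 0:
            result |= 1 << i
    return result


def similarity(a: int, b: int, bits: int = BITS) -> float:
    """Estimate similarity as the fraction of sketch bits that agree."""
    differing = bin(a ^ b).count("1")
    return 1.0 - differing / bits
```

With this approach, two pages can be compared by calling `similarity(sketch(page_a), sketch(page_b))`. Identical word counts always produce identical sketches, and documents that share most of their words tend to agree on most bits, which is what makes short sketches usable for spotting near-duplicate content.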

The application dates back to 2001, but it can still be a fun read for some.

Forum discussion at WebmasterWorld.

 
