Twitter Revamps Search Engine Backend

Oct 7, 2010 - 7:50 am
Filed Under Social Search

Twitter announced they have "launched a new backend for search on twitter.com." In short, they moved from the original Summize technology they bought years ago to an infrastructure and system that is completely new and homegrown.

Tedster at WebmasterWorld pulls out the key differences:

  • Twitter's real-time search engine was, until very recently, based on the technology that Summize originally developed.
  • [Now we have] a new, modern search architecture based on a highly efficient inverted index instead of a relational database.
  • With over 1,000 TPS (Tweets/sec) and 12,000 QPS (queries/sec) = over 1 billion queries per day (!) we already put a very high load on our machines.
  • We estimate that we're only using about 5% of the available backend resources, which means we have a lot of headroom. Our new indexer could also index roughly 50 times more Tweets per second than we currently get!
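To sanity-check the numbers quoted above, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, and it assumes the quoted rates hold constant around the clock, which real traffic does not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def queries_per_day(qps: int) -> int:
    """Daily query volume, assuming a constant per-second rate."""
    return qps * SECONDS_PER_DAY


def headroom_factor(utilization: float) -> float:
    """How many times the current load would fit into total capacity."""
    return 1.0 / utilization


# Figures taken from the quoted bullets
daily = queries_per_day(12_000)      # 12,000 QPS -> 1,036,800,000 per day
headroom = headroom_factor(0.05)     # 5% utilization -> roughly 20x headroom
indexer_capacity_tps = 1_000 * 50    # 50x the current 1,000 TPS -> 50,000
```

So 12,000 QPS does work out to just over 1 billion queries per day, and running at 5% of capacity implies roughly 20 times the current load before the backend is saturated.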

Regarding the 1 billion queries per day, they are not human searches. I strongly recommend you read Danny's piece on that.

Twitter said they chose Lucene, a search engine library written in Java, as a starting point, but not without modifications. The changes Twitter made include significantly improved garbage collection performance, lock-free data structures and algorithms, posting lists that are traversable in reverse order, and efficient early query termination.
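To give a feel for why reverse-traversable posting lists and early termination matter for real-time search, here is a toy sketch in Python. It is not Twitter's code and does not use Lucene's API; the `TinyIndex` class and its methods are hypothetical, and it assumes tweet IDs increase over time so that the end of each posting list holds the newest tweets.

```python
from collections import defaultdict
from typing import Dict, List


class TinyIndex:
    """Toy in-memory inverted index: term -> tweet IDs in arrival order."""

    def __init__(self) -> None:
        self.postings: Dict[str, List[int]] = defaultdict(list)

    def add(self, tweet_id: int, text: str) -> None:
        # Assumption: IDs grow over time, so appending keeps each list sorted
        for term in set(text.lower().split()):
            self.postings[term].append(tweet_id)

    def search(self, term: str, limit: int) -> List[int]:
        hits: List[int] = []
        # Walk the posting list backwards so the newest tweets come first
        for tweet_id in reversed(self.postings.get(term.lower(), [])):
            if len(hits) >= limit:
                break  # early termination: enough results, skip older tweets
            hits.append(tweet_id)
        return hits


index = TinyIndex()
index.add(1, "new search backend")
index.add(2, "lucene search")
index.add(3, "search is fast")
recent = index.search("search", limit=2)  # [3, 2]
```

Because the newest tweets sit at the end of each list, a query that only needs the latest few results can stop after touching a handful of entries instead of scanning the whole list.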

Forum discussion at WebmasterWorld.

 
