System Design Study: Twitter's Recommendation Algorithm
This article summarizes how Twitter’s recommendation algorithm is designed.
In 2023, Twitter published a blog post explaining how it builds a user’s timeline. This article summarizes that post and covers the essential details. It will help you understand how social media timelines are designed. You can use it as a reference if you are working on a similar product or preparing for system design interviews.
The recommendation pipeline consists of four main stages:
Candidate Sourcing: Find the best tweets to show to the end user.
Ranking: Rank the tweets in the best order, personalized for the end user.
Heuristics and Filtering: Apply heuristics and filters to prepare a well-balanced feed that is personalized for the end user.
Mixing and Serving: Mix the tweets with other non-tweet content and then display them to the end user.
Let’s take a closer look at each of these stages.
1. Candidate Sourcing
Twitter selects the best 1500 tweets from a pool of hundreds of millions for the user’s timeline. On average, these are split 50:50 between In-Network and Out-of-Network tweets. In-Network tweets come from people you follow, and Out-of-Network tweets come from people you don’t follow. Let’s discuss one specific problem related to each of them.
In-network tweets: The main problem is how to rank all in-network tweets so that the timeline is most relevant to you.
Twitter ranks the in-network tweets using a machine-learning model called Real Graph, which predicts the likelihood of engagement between two users. The higher the Real Graph score between you and the author of a tweet, the more of their tweets Twitter will include.
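To make the idea concrete, here is a minimal sketch of how a higher Real Graph score could translate into more candidate tweets per author. The proportional split, the function name allocate_slots, and the sample scores are all assumptions for illustration; the blog does not describe the exact rule Twitter uses.

```python
def allocate_slots(real_graph_scores, total_slots):
    """Split total_slots among authors in proportion to their score.

    Hypothetical helper: proportional allocation is an assumption,
    not Twitter's documented behavior.
    """
    total = sum(real_graph_scores.values())
    if total <= 0:
        return {author: 0 for author in real_graph_scores}

    # Each author's ideal (fractional) share of the slots.
    shares = {a: s / total * total_slots for a, s in real_graph_scores.items()}
    # Start from the floor of each share.
    slots = {a: int(share) for a, share in shares.items()}

    # Hand the leftover slots to the largest fractional remainders.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(shares, key=lambda a: shares[a] - slots[a], reverse=True)
    for author in by_remainder[:leftover]:
        slots[author] += 1
    return slots


# Made-up scores between the viewer and three followed authors.
scores = {"alice": 0.5, "bob": 0.25, "carol": 0.25}
print(allocate_slots(scores, 8))  # {'alice': 4, 'bob': 2, 'carol': 2}
```

The author with the highest score gets the largest share of the in-network candidate slots, which is the behavior the paragraph above describes.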
Out-of-network tweets: The main problem is how to tell if a certain tweet is relevant to you if you don’t follow the author.
To be honest, this is a trickier problem and involves a lot of prediction-based algorithms. Twitter solves it using two approaches:
Social Graph: Twitter developed GraphJet, a graph processing engine that maintains a real-time interaction graph between users and Tweets. The algorithm traverses this graph of engagements to find the tweets that the people you follow recently engaged with.
Embedding Spaces: A more general approach that computes content similarity between other people’s tweets and your interests.
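Both approaches can be sketched in a few lines. This is a toy illustration, not GraphJet or Twitter's embedding models: the dictionaries, the function names, and the use of cosine similarity as the similarity measure are assumptions made for the example.

```python
import math
from collections import Counter


def out_of_network_by_social_graph(user, follows, engagements, tweet_author):
    """Count how many of the user's followees engaged with each tweet.

    Tweets written by the user or by someone they already follow are
    skipped, since those are in-network.
    """
    followees = set(follows.get(user, ()))
    counts = Counter()
    for followee in followees:
        for tweet_id in engagements.get(followee, ()):
            author = tweet_author[tweet_id]
            if author != user and author not in followees:
                counts[tweet_id] += 1
    # Most-engaged tweets first.
    return counts.most_common()


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical data: "me" follows ann and ben.
follows = {"me": ["ann", "ben"]}
engagements = {"ann": ["t1", "t2"], "ben": ["t2", "t3"]}
tweet_author = {"t1": "zoe", "t2": "yan", "t3": "ann"}

print(out_of_network_by_social_graph("me", follows, engagements, tweet_author))
# [('t2', 2), ('t1', 1)]  (t3 is dropped because its author, ann, is followed)

# Made-up interest and tweet embeddings; a value closer to 1.0 means more similar.
print(cosine_similarity([0.9, 0.1, 0.0], [0.8, 0.2, 0.0]))
```

The first function answers "what did the people I follow engage with?", while the second answers "how close is this tweet to my interests?" without needing any follow relationship at all.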
Overall, In-Network and Out-of-Network tweets are blended and the best 1500 candidates are collected.
2. Ranking
At this point, candidate sourcing is complete and the 1500 tweets are treated equally, regardless of which source they came from. These tweets are ranked using a ~48M parameter neural network that is continuously trained on tweet interactions to optimize for positive engagement, which means more likes, replies, and retweets.
The ranking model outputs ten labels for each tweet, where each label represents the probability of a particular engagement.
As per the source code from Twitter’s GitHub, the ten labels produced are:
scored_tweets_model_weight_fav: The probability the user will favorite the Tweet.
scored_tweets_model_weight_retweet: The probability the user will Retweet the Tweet.
scored_tweets_model_weight_reply: The probability the user replies to the Tweet.
scored_tweets_model_weight_good_profile_click: The probability the user opens the Tweet author profile and Likes or replies to a Tweet.
scored_tweets_model_weight_video_playback50: The probability (for a video Tweet) that the user will watch at least half of the video.
scored_tweets_model_weight_reply_engaged_by_author: The probability the user replies to the Tweet and this reply is engaged by the Tweet author.
scored_tweets_model_weight_good_click: The probability the user will click into the conversation of this Tweet and reply or Like a Tweet.
scored_tweets_model_weight_good_click_v2: The probability the user will click into the conversation of this Tweet and stay there for at least 2 minutes.
scored_tweets_model_weight_negative_feedback_v2: The probability the user will react negatively (requesting "show less often" on the Tweet or author, block or mute the Tweet author).
scored_tweets_model_weight_report: The probability the user will click Report Tweet.
Each label has an associated weight:
scored_tweets_model_weight_fav: 0.5
scored_tweets_model_weight_retweet: 1.0
scored_tweets_model_weight_reply: 13.5
scored_tweets_model_weight_good_profile_click: 12.0
scored_tweets_model_weight_video_playback50: 0.005
scored_tweets_model_weight_reply_engaged_by_author: 75.0
scored_tweets_model_weight_good_click: 11.0
scored_tweets_model_weight_good_click_v2: 10.0
scored_tweets_model_weight_negative_feedback_v2: -74.0
scored_tweets_model_weight_report: -369.0
Then, a final score is calculated using the following formula:
score = sum_i { (weight of engagement i)*(probability of engagement i) }
The higher the overall score, the higher the rank of the tweet in the overall timeline.
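Here is a small worked example of the formula using the weights listed above. The label names are shortened by dropping the scored_tweets_model_weight_ prefix, and the probabilities for the two tweets are made up for illustration. Labels that are not supplied are treated as probability 0, which is an assumption of this sketch.

```python
# Weights from the list above, with the common prefix removed.
WEIGHTS = {
    "fav": 0.5,
    "retweet": 1.0,
    "reply": 13.5,
    "good_profile_click": 12.0,
    "video_playback50": 0.005,
    "reply_engaged_by_author": 75.0,
    "good_click": 11.0,
    "good_click_v2": 10.0,
    "negative_feedback_v2": -74.0,
    "report": -369.0,
}


def tweet_score(probabilities):
    """Weighted sum of predicted engagement probabilities."""
    return sum(WEIGHTS[label] * p for label, p in probabilities.items())


# Hypothetical model outputs for two candidate tweets.
candidates = {
    "a": {"fav": 0.2, "retweet": 0.05},
    "b": {"fav": 0.05, "reply_engaged_by_author": 0.01},
}

ranked = sorted(candidates, key=lambda t: tweet_score(candidates[t]), reverse=True)
for tweet_id in ranked:
    print(tweet_id, round(tweet_score(candidates[tweet_id]), 3))
# b 0.775
# a 0.15
```

Tweet b wins even though it is four times less likely to be liked: a 1% chance of a reply that the author engages with contributes 0.75, far more than a 20% chance of a like (0.1). The large negative weights work the same way in reverse, so even a small probability of a report can sink a tweet.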
3. Heuristics and Filtering
This stage consists of applying different kinds of heuristics and filters such as:
Filter tweets based on your preferences and do not show any tweets from users that you have blocked or muted.
Avoid too many consecutive tweets from a single author.
Ensure a fair balance of in-network and out-of-network tweets.
Lower the score of a particular tweet if the user has provided negative feedback on the same tweet.
Apply quality safeguards, such as including only those out-of-network tweets that someone you follow engaged with or whose author is followed by someone you follow.
These are a few examples of the rules that keep the feed well balanced and make scrolling through the timeline as relevant as possible.
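A few of these rules can be sketched as a single pass over the ranked list. The Tweet fields, the function name apply_filters, and the limit of two consecutive tweets per author are assumptions for this example. It also simply drops tweets that break a rule, which is a simplification: a real system could demote them instead.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    tweet_id: str
    author: str
    score: float
    in_network: bool
    # True if someone the viewer follows engaged with the tweet
    # or follows its author (only relevant for out-of-network tweets).
    social_proof: bool = False


def apply_filters(ranked, blocked_or_muted, max_consecutive=2):
    """Filter an already-ranked list of tweets, preserving its order."""
    feed = []
    run_author, run_length = None, 0
    for tweet in ranked:
        # Visibility filter: never show blocked or muted authors.
        if tweet.author in blocked_or_muted:
            continue
        # Social proof: out-of-network tweets need a second-degree connection.
        if not tweet.in_network and not tweet.social_proof:
            continue
        # Author diversity: cap consecutive tweets from the same author.
        if tweet.author == run_author:
            if run_length >= max_consecutive:
                continue
            run_length += 1
        else:
            run_author, run_length = tweet.author, 1
        feed.append(tweet)
    return feed


ranked = [
    Tweet("t1", "ann", 0.9, True),
    Tweet("t2", "ann", 0.8, True),
    Tweet("t3", "ann", 0.7, True),          # third in a row from ann
    Tweet("t4", "spam", 0.6, True),         # blocked author
    Tweet("t5", "bob", 0.5, False),         # out-of-network, no social proof
    Tweet("t6", "cat", 0.4, False, True),   # out-of-network, with social proof
    Tweet("t7", "ann", 0.3, True),
]
feed = apply_filters(ranked, blocked_or_muted={"spam"})
print([t.tweet_id for t in feed])  # ['t1', 't2', 't6', 't7']
```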
4. Mixing and Serving
As the last step in the whole process, the Home Mixer service (which is responsible for constructing and serving the For You timeline) mixes the tweets with non-tweet content like Ads, Follow Recommendations, and Onboarding prompts.
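As a rough picture of what mixing means, the sketch below slots one non-tweet item into the feed after every few tweets. The fixed interval and the function name mix_feed are assumptions; the blog does not say how Home Mixer decides where ads or recommendations are placed.

```python
def mix_feed(tweets, extras, every=5):
    """Insert one non-tweet item after every `every` tweets, while extras last."""
    mixed = []
    extras_iter = iter(extras)
    for position, tweet in enumerate(tweets, start=1):
        mixed.append(tweet)
        if position % every == 0:
            extra = next(extras_iter, None)
            if extra is not None:
                mixed.append(extra)
    return mixed


tweets = ["t1", "t2", "t3", "t4", "t5", "t6"]
extras = ["ad_1", "follow_recommendation_1"]
print(mix_feed(tweets, extras, every=3))
# ['t1', 't2', 't3', 'ad_1', 't4', 't5', 't6', 'follow_recommendation_1']
```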
As per Twitter’s blog, this pipeline runs approximately 5 billion times per day and completes in under 1.5 seconds on average. This shows the massive amount of data processing it takes to construct a single user’s timeline.
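To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It assumes the load is spread evenly over the day and that every run ranks roughly 1500 candidates, so treat the results as averages rather than peak figures.

```python
RUNS_PER_DAY = 5_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60
CANDIDATES_PER_RUN = 1500

# Average number of timeline requests served every second.
runs_per_second = RUNS_PER_DAY / SECONDS_PER_DAY
# Average number of candidate tweets ranked every second.
candidates_per_second = runs_per_second * CANDIDATES_PER_RUN

print(round(runs_per_second))                    # 57870
print(round(candidates_per_second / 1e6, 1))     # 86.8 (million)
```

That is roughly 58 thousand timeline builds and close to 87 million tweet rankings per second on average.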
That’s it, folks, for this edition of the newsletter. In future editions, I’ll try to cover the algorithms of other popular social media platforms to build a more holistic picture of how social media timelines are constructed.
Resources
Twitter’s Recommendation Algorithm
Twitter’s Algorithm Source Code