The keywords newsfeed and timeline are used interchangeably. Some services similar to social media newsfeeds are the following:

- Facebook newsfeed
- Twitter Timeline
- Instagram feed
- Google podcast feed
- Google news timeline
- Etsy feed
- Feedly
- Reddit feed
- Medium feed
- Quora feed

Requirements

- The user newsfeed must be generated in near real-time based on the feed activity of the people that the user follows
- The feed items contain text and media files (images, videos)

Data storage

Database schema

- The primary entities of the database are the Users table, the FeedItems table, and the Follows table
- The relationship between the Users and the FeedItems tables is 1-to-many
- The relationship between the Users and the Follows tables is many-to-many
- The Follows table is a join table that represents the relationship (follower-followee) between the users

Type of data store

- The media files (images, videos) are stored in a managed object storage such as AWS S3
- A SQL database such as Postgres stores the metadata of the user (followers, personal data)
- A NoSQL data store such as Cassandra stores the user timeline
- A cache server such as Redis stores the pre-generated timeline of a user

High-level design

- The server stores the feed items in cache servers and the NoSQL store
- The generated newsfeed is stored on the cache server
- There is no feed publishing for inactive users; their newsfeed is built with a pull model (fanout-on-load)
- The feed publishing for active non-celebrity users is based on a push model (fanout-on-write)
- The feed publishing for celebrity users is based on a hybrid push-pull model
- The client fetches the newsfeed from the cache servers

Write path

- The client creates an HTTP connection to the load balancer to create a feed item
- The load balancer delegates the client connection to a web server with free capacity
- The write requests to create feed items are rate limited
- The feed item is stored on the message queue for asynchronous processing and the client receives an immediate response
- The fanout service distributes the feed item to multiple services to generate the newsfeed for followers of the client
- The object store persists the video or image files embedded in the feed item
- The NoSQL store persists the timeline of users (feed items in reverse chronological order)
- The SQL database stores the metadata of the users (user relationships) and the feed items
- A limited number of feed items for users with a certain threshold of followers are stored on the cache server
- The IDs of feed items are stored on the user timeline cache server for deduplication
- The feed generation service subscribes to the fanout service for any updates
- The feed generation service queries the in-memory user info service to identify the followers of a user and the category of a user (active non-celebrity, inactive, celebrity)
- The feed generation service creates the home timeline for active non-celebrity users using a push model (fanout-on-write) in linear time O(n), where n is the number of followers
- The feed items are ranked, sorted, and merged to generate the home timeline for a user
- The home timeline for active users is stored on the cache server for quick lookups
- There is no feed publishing for inactive users; their home timeline is built with a pull model (fanout-on-load)
- The feed publishing for celebrity users is based on a hybrid push-pull model (the celebrity feed items are merged into the home timeline of a user on demand)
- As an alternative, the feed publishing for celebrity users can use a push model only for the online followers, in batches (not an optimal solution)

Read path

- The client executes a DNS query to resolve the domain name
- The client queries the CDN to check if the feed items for the home timeline are cached on the CDN
- The client creates an HTTP connection to the load balancer
- The load balancer delegates the client connection to a web server with free capacity
- The read requests to fetch the newsfeed are rate limited
- The web server queries the timeline service to fetch the newsfeed
- The timeline service queries the user info service to get the list of followees and identify the category of the user (active, inactive, following celebrities)
- The home timeline cache is queried to fetch the list of feed item IDs
- The feed items are fetched from the feed items cache server by executing an MGET operation on Redis
- When the client executes a request to fetch the timeline of another user, the timeline service queries the user timeline cache server
- The SQL database follower (replica) is queried on a cache miss
- The media files embedded in feed items are fetched from the object store
- The NoSQL data store is queried to fetch the user timeline on a cache miss
- The inactive users fetch the home timeline using a pull model (fanout-on-load)
- The active users following celebrity users use a hybrid model to fetch the home timeline (the feed items from celebrities are merged on demand)
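
The schema described under Database schema can be made concrete with a small sketch. The source names only the three tables and their relationships, so every column name below is an assumption. SQLite from the Python standard library stands in for Postgres purely so the example runs without a database server.

```python
import sqlite3

# Column names are hypothetical; only the table names and relationships
# come from the text above.
SCHEMA = """
CREATE TABLE Users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE FeedItems (
    item_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES Users(user_id),  -- one user, many items
    content    TEXT NOT NULL,
    media_url  TEXT,                -- pointer to the file in the object store
    created_at INTEGER NOT NULL     -- Unix timestamp
);
CREATE TABLE Follows (
    follower_id INTEGER NOT NULL REFERENCES Users(user_id),
    followee_id INTEGER NOT NULL REFERENCES Users(user_id),
    PRIMARY KEY (follower_id, followee_id)  -- join table, one row per follow edge
);
"""


def create_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def followers_of(conn, user_id):
    """Return the IDs of all users who follow user_id, in ascending order."""
    rows = conn.execute(
        "SELECT follower_id FROM Follows WHERE followee_id = ? ORDER BY follower_id",
        (user_id,),
    )
    return [row[0] for row in rows]


conn = create_db()
conn.executemany("INSERT INTO Users VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO Follows VALUES (?, ?)",
                 [(2, 1), (3, 1), (1, 3)])
print(followers_of(conn, 1))  # prints [2, 3]
```

The composite primary key on Follows prevents duplicate follow edges, and the same table answers both "who follows this user" and "whom does this user follow" by filtering on the other column.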
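
The publishing rules on the write path can be illustrated with an in-memory sketch. It is not the article's implementation: the class name `FanoutService`, the dictionaries standing in for the cache and NoSQL store, and the celebrity threshold of 3 followers are all hypothetical. It shows a push (fanout-on-write) to the active followers of a non-celebrity, which costs O(n) in the number of followers, and no push for celebrity authors or inactive followers.

```python
from collections import defaultdict

# Assumed cutoff for illustration only; the source does not give a number.
CELEBRITY_THRESHOLD = 3


class FanoutService:
    """Hypothetical in-memory stand-in for the fanout and feed generation services."""

    def __init__(self, followers, active_users):
        self.followers = followers              # author -> set of follower ids
        self.active_users = active_users        # users considered active
        self.user_timeline = defaultdict(list)  # author -> own item ids, newest first
        self.home_timeline = defaultdict(list)  # reader -> pushed item ids, newest first

    def is_celebrity(self, user):
        return len(self.followers.get(user, ())) >= CELEBRITY_THRESHOLD

    def publish(self, item_id, author):
        """Store the item on the author's timeline and push it where the rules allow.

        Returns the number of home timelines that received the item.
        """
        self.user_timeline[author].insert(0, item_id)
        if self.is_celebrity(author):
            return 0  # celebrity items are merged at read time instead
        pushed = 0
        for follower in self.followers.get(author, ()):  # O(n) in followers
            if follower in self.active_users:            # inactive users pull on load
                self.home_timeline[follower].insert(0, item_id)
                pushed += 1
        return pushed


followers = {"ann": {"bob", "cat"}, "star": {"bob", "cat", "dan"}}
service = FanoutService(followers, active_users={"bob"})
print(service.publish(1, "ann"))      # prints 1: only bob is active
print(service.publish(2, "star"))     # prints 0: star is over the threshold
print(service.home_timeline["bob"])   # prints [1]
```

Skipping the push for celebrity authors keeps a single post from triggering a write to every follower's cached timeline; those items are merged in when the follower reads.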
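
The read path can be sketched under similar assumptions. Plain dictionaries stand in for the Redis caches and the backing store, `mget` only imitates the behaviour of the Redis MGET command (one value per key, `None` on a miss), and all function and parameter names are hypothetical. The `pull_from` mapping lists the feeds merged on demand: the celebrities an active user follows, or every followee for an inactive user.

```python
import heapq
from itertools import islice


def mget(cache, keys):
    """Imitate Redis MGET: one value per key, None where the key is missing."""
    return [cache.get(key) for key in keys]


def fetch_items(item_ids, item_cache, item_store):
    """Resolve item IDs to items, falling back to the store on a cache miss."""
    items = []
    for item_id, item in zip(item_ids, mget(item_cache, item_ids)):
        if item is None:
            item = item_store[item_id]    # cache miss: read from the backing store
            item_cache[item_id] = item    # repopulate the cache for later reads
        items.append(item)
    return items


def home_timeline(user, home_cache, user_timelines, pull_from,
                  item_cache, item_store, limit=10):
    """Merge pre-generated items with feeds pulled on demand, newest first."""
    pushed = fetch_items(home_cache.get(user, []), item_cache, item_store)
    pulled = [
        fetch_items(user_timelines.get(followee, []), item_cache, item_store)
        for followee in pull_from.get(user, [])
    ]
    # Every input list is already newest first, so a k-way merge suffices.
    merged = heapq.merge(pushed, *pulled,
                         key=lambda item: item["created_at"], reverse=True)
    return list(islice(merged, limit))


item_store = {
    1: {"id": 1, "author": "ann", "created_at": 100},
    2: {"id": 2, "author": "star", "created_at": 105},
    3: {"id": 3, "author": "ann", "created_at": 110},
}
item_cache = {1: item_store[1], 3: item_store[3]}   # item 2 is not cached yet
home_cache = {"bob": [3, 1]}                        # pushed on the write path
user_timelines = {"star": [2]}                      # celebrity's own timeline
pull_from = {"bob": ["star"]}                       # feeds merged on demand

timeline = home_timeline("bob", home_cache, user_timelines, pull_from,
                         item_cache, item_store)
print([item["id"] for item in timeline])  # prints [3, 2, 1]
```

An inactive user has no entry in `home_cache`, so the same function builds the whole timeline from the `pull_from` feeds, which corresponds to the pull model (fanout-on-load).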

Designing a Social Media News Feed System
