Designing a News Feed System

A system design interview guide to building the home feed of a social application: how a post travels from one author to the screens of millions of followers, and how the right blend of push and pull keeps both writes and reads fast.

A news feed is the continuously updating list of posts a user sees when they open a social application — status updates, photos, links, and so on, drawn from the accounts they follow. It looks simple from the outside, but it hides a sharp tension. The system must serve feeds to active users with very low latency, yet a single popular post may need to reach millions of followers. The whole design comes down to deciding when the work of assembling a feed happens: eagerly at the moment of posting, lazily at the moment of reading, or some combination of the two. This guide walks through the two core operations, the fanout models that resolve that tension, the data flow that ties them together, and the caching layers that keep the common path fast.

Contents

  1. Two Core Operations
  2. System Architecture
  3. Fanout Models
  4. The Publish & Fanout Flow
  5. Caching Layers
  6. The Celebrity Problem
  7. Summary

1. Two Core Operations

Before any architecture, it helps to separate the feed into the two distinct things it actually does. They have opposite performance profiles, and almost every design decision is a tradeoff between them.

| Operation | What happens | Who triggers it |
| --- | --- | --- |
| Feed publishing | A user creates a post. The system stores it and then has to make it visible in the feeds of everyone who follows that user. | The author, once per post. |
| Feed building / retrieval | A user opens the app. The system has to assemble an ordered list of recent posts from the people that user follows and return it. | Every reader, every time they open or refresh. |

The asymmetry is the key insight. Publishing is comparatively rare — a person posts a handful of times a day at most. Retrieval is constant — the same person may open and refresh their feed dozens of times a day, and there are far more readers than authors active at any instant. A naive feed reads, for every refresh, the recent posts of every account the user follows, sorts them, and returns the top slice. That is correct but expensive on the hot path. The art of feed design is moving as much work as possible off the read path without paying for work nobody will ever look at.

A useful framing: feed building is a join between "who I follow" and "their recent posts," ordered by time or relevance. The question is whether you precompute that join (do the work at write time) or compute it on demand (do it at read time).
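The "compute it on demand" side of that join can be sketched in a few lines. This is a minimal illustration, not a production design: the `follows` and `posts_by_author` dictionaries are hypothetical in-memory stand-ins for the social graph and the post store, and ordering is by timestamp only.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the social graph and the post store.
follows = {"alice": ["bob", "carol"]}
# Each author's posts are kept newest first as (timestamp, post_id) pairs.
posts_by_author = {
    "bob": [(105, "b2"), (100, "b1")],
    "carol": [(103, "c1")],
    "dave": [(104, "d1")],  # not followed by alice, so never in her feed
}

def build_feed_on_read(user_id, limit):
    """Join 'who I follow' with 'their recent posts', newest first."""
    per_author = [posts_by_author.get(a, []) for a in follows.get(user_id, [])]
    # Every input list is already sorted newest first, so a k-way merge
    # with reverse=True yields one globally newest-first stream.
    merged = heapq.merge(*per_author, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

feed = build_feed_on_read("alice", limit=2)
```

The work done here grows with the number of followed accounts and repeats on every refresh, which is exactly the cost that precomputing the join at write time tries to remove.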

2. System Architecture

At a high level the system is split into a request-handling tier, a write path for storing posts, and a separate fanout path that distributes those posts into per-user feeds. A client request first resolves through DNS to a load balancer, which spreads traffic across a pool of stateless web servers. Those servers handle cross-cutting concerns — authentication to identify the caller and rate limiting to protect downstream services — before forwarding to the appropriate internal service.

Figure: News feed system architecture. A client request flows through DNS, a load balancer, and authenticating, rate-limiting web servers. Publishing writes to the Post Service (cache + DB) and triggers the Fanout Service, which reads the social graph and user data, queues work, and lets fanout workers populate each follower's news feed cache. A notification service runs alongside.

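From there the request takes one of two paths, matching the two core operations. A publish request goes to the Post Service, which stores the post in the post DB and post cache and then triggers the Fanout Service. A read request is served from the reader's news feed cache, with the post content hydrated from the post cache.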

A notification service runs alongside the fanout path, so that delivering a post can also trigger a push notification or badge update where appropriate. Decoupling fanout behind a queue is what keeps the author's publish call fast: the post is acknowledged as soon as it is durably stored, and the potentially huge job of distributing it happens asynchronously.

3. Fanout Models

"Fanout" is the act of delivering one post to many follower feeds. There are two pure strategies, and the choice between them is the single most important decision in the whole design. Each optimizes one of the two core operations at the expense of the other.

| Model | How it works | Writes | Reads | Weakness |
| --- | --- | --- | --- | --- |
| Fanout on write (push) | When a user posts, immediately write the post id into the feed cache of every follower. Each follower's feed is a precomputed list, ready to serve. | Heavy: one write per follower, per post. | Very fast: the feed is already assembled; a read is a single cache lookup. | Wasteful for inactive followers who never read; a write explosion (the hotkey problem) when an author has millions of followers. |
| Fanout on read (pull) | Write only the post itself. Build each feed lazily at read time by pulling the recent posts of everyone the reader follows and merging them. | Light: one write, regardless of follower count. | Heavy: every refresh fans out across all followed accounts, then fetches and merges their posts. | Slow, expensive reads on the hot path; cost scales with how many accounts each user follows. |
| Hybrid (recommended) | Use push for ordinary users, and pull for celebrities and accounts with very large follower counts. A reader's feed merges their precomputed push feed with freshly pulled celebrity posts. | Bounded: push for the many, no explosive fanout for the few. | Fast: mostly a precomputed lookup, plus a small pull for a handful of followed celebrities. | More moving parts; needs a rule to classify accounts as "celebrity." |

Fanout on write makes reads trivial: when a follower opens the app, their feed already exists as a list in cache, so retrieval is a single lookup. The price is paid at write time. Posting once may fan out to every follower, which is fine for someone with a few hundred followers but catastrophic for someone with tens of millions — a single post becomes tens of millions of cache writes. It also does work for followers who may not log in for weeks, spending write capacity on feeds nobody reads.

Fanout on read inverts the tradeoff. Publishing is cheap because you store the post and stop; no fanout happens. The cost moves entirely to the read path, where every feed open must gather the recent posts from each followed account and merge them on the fly. This naturally handles celebrities — a popular author causes no write storm — but it makes the most frequent operation, reading, the most expensive one.

The hybrid model is the recommended approach because it lets each kind of account use the strategy that suits it. The overwhelming majority of users have modest follower counts, so pushing their posts is cheap and gives their followers instant reads. The small number of accounts with enormous follower counts are exactly the ones that would cause a write explosion, so their posts are pulled instead. At read time, a user's feed is the precomputed push feed merged with a small live pull of the handful of celebrities they follow.
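A back-of-the-envelope cost model makes the comparison concrete. The functions and every number below are illustrative assumptions rather than measurements: the model counts feed-cache writes caused by an author and per-author fetches caused by a reader, and ignores everything else.

```python
def push_cost(author_followers, posts_per_day):
    """Feed-cache writes per day caused by one author under fanout on write."""
    return author_followers * posts_per_day

def pull_cost(accounts_followed, reads_per_day):
    """Per-author fetches per day caused by one reader under fanout on read."""
    return accounts_followed * reads_per_day

def hybrid_author_cost(author_followers, posts_per_day, threshold):
    """Writes per day for one author under hybrid: celebrities are not pushed."""
    if author_followers >= threshold:
        return 0
    return push_cost(author_followers, posts_per_day)

def hybrid_reader_cost(celebrities_followed, reads_per_day):
    """Lookups per day for one reader: one cached feed plus each followed celebrity."""
    return reads_per_day * (1 + celebrities_followed)

# Assumed figures: 3 posts and 30 feed opens per day, threshold of 1M followers.
ordinary_push = push_cost(author_followers=300, posts_per_day=3)
celebrity_push = push_cost(author_followers=20_000_000, posts_per_day=3)
pure_pull_reader = pull_cost(accounts_followed=200, reads_per_day=30)
celebrity_hybrid = hybrid_author_cost(20_000_000, 3, threshold=1_000_000)
hybrid_reader = hybrid_reader_cost(celebrities_followed=5, reads_per_day=30)
```

Under these assumptions the ordinary author costs 900 writes a day, the celebrity would cost 60 million under pure push and none under hybrid, and the reader drops from 6,000 fetches a day under pure pull to 180 lookups under hybrid.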

4. The Publish & Fanout Flow

Tracing a single post through the system makes the fanout path concrete. Once the Post Service has durably stored a post and the Fanout Service picks it up, the numbered steps from the architecture diagram are:

  1. Get friend ids from the graph DB. The Fanout Service asks the social graph for the list of accounts that follow the author. This list is the set of feeds the post must reach.
  2. Get friend data from the user cache / DB. For each follower the service may need lightweight user data — for example to skip inactive accounts, respect privacy or block settings, or apply per-user ranking signals. This comes from the user cache, falling back to the user DB on a miss.
  3. Push to a message queue. Rather than writing millions of feed entries inline, the service enqueues distribution work — typically batches of (post id, follower id) deliveries. The queue absorbs bursts and decouples the author's request from the heavy work.
  4. Fanout workers consume. A horizontally scalable pool of workers drains the queue. Because the work is in a queue, the system can add workers to drain a backlog without touching the publish path, and a slow spell simply means the queue grows rather than the author's request stalling.
  5. Write into the news feed cache. Each worker writes the post reference into the target follower's news feed cache, prepending it so the most recent posts sit at the top of the list. After this step, that follower's next feed open is a fast cache read.

# publish: fast, synchronous part
def publish(author_id, content):
    post = post_service.store(author_id, content)   # Post DB + Post cache
    fanout_service.enqueue_fanout(post)             # asynchronous hand-off, return now
    return ack(post.id)                             # author's call is done

# fanout: heavy, asynchronous part (runs in the Fanout Service)
def enqueue_fanout(post):
    followers = graph_db.get_followers(post.author_id)   # step 1
    for batch in chunk(followers, size=1000):
        batch = enrich(batch, user_cache)                # step 2: e.g. drop inactive or blocked followers
        message_queue.push(post.id, batch)               # step 3

def fanout_worker():                                     # step 4
    while True:
        post_id, batch = message_queue.pop()
        for follower_id in batch:
            news_feed_cache.prepend(follower_id, post_id)   # step 5

Celebrity posts are the exception to this flow. Under the hybrid model an account above a follower threshold is not fanned out at all; its posts stay in the post store and are pulled in at read time, sidestepping steps 1 through 5 entirely for that author.
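The flow, including the celebrity exception, can be exercised end to end with in-memory stand-ins. This is a sketch under stated assumptions: `CELEBRITY_THRESHOLD`, `FEED_LENGTH`, `BATCH_SIZE`, and the dictionaries are hypothetical, the threshold is tiny so the demo stays small, and `publish` enqueues batches itself where a real system would hand that job to a separate fanout service.

```python
from collections import defaultdict, deque
from queue import Queue

CELEBRITY_THRESHOLD = 3   # assumed; real thresholds are far larger
FEED_LENGTH = 100         # assumed cap on each cached feed
BATCH_SIZE = 2            # assumed batch size for queued work

followers_of = {"ann": ["u1", "u2"], "star": ["u1", "u2", "u3", "u4"]}
post_store = {}                                            # post id -> post
feed_cache = defaultdict(lambda: deque(maxlen=FEED_LENGTH))
work_queue = Queue()

def publish(author_id, content):
    """Store the post, then queue fanout work unless the author is a celebrity."""
    post_id = f"p{len(post_store) + 1}"
    post_store[post_id] = {"author": author_id, "content": content}
    followers = followers_of.get(author_id, [])
    if len(followers) < CELEBRITY_THRESHOLD:               # ordinary account: push
        for i in range(0, len(followers), BATCH_SIZE):
            work_queue.put((post_id, followers[i:i + BATCH_SIZE]))
    return post_id                                         # celebrity posts stay pull-only

def drain_queue():
    """Stand-in for the worker pool: process every queued batch once."""
    writes = 0
    while not work_queue.empty():
        post_id, batch = work_queue.get()
        for follower_id in batch:
            feed_cache[follower_id].appendleft(post_id)    # newest first
            writes += 1
    return writes

p1 = publish("ann", "hello")
p2 = publish("star", "big news")
writes = drain_queue()
```

Only the ordinary author's post produces feed-cache writes here. The post from `star` is stored but never queued, so it would have to be pulled in when a follower reads their feed. The bounded `deque` also shows one way to keep each cached feed from growing without limit.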

5. Caching Layers

Caching is what makes the read path cheap, and the important design choice is what to store where. The two caches play different roles.

| Cache | What it stores | Why |
| --- | --- | --- |
| News Feed cache | An ordered list of ⟨post id, user id⟩ pairs per follower (references, not full posts). | Keeps each feed list small and cheap to write during fanout. A million followers means a million tiny list updates, not a million copies of the content. |
| Post cache | The full content of recent posts, keyed by post id. | Lets the read path turn a list of ids into displayable posts with fast lookups, without hitting the Post DB for hot content. |

The news feed cache stores only id pairs rather than whole posts for two reasons: space and consistency. References are tiny, so a post that appears in millions of feeds costs millions of small list entries rather than millions of copies of its content. Storing references also means the post itself exists in exactly one place, the post store. If the post is edited or deleted, the single canonical copy changes and every feed reflects it on the next read; there are no stale duplicates to chase down. The cost is one extra step at read time: after reading the list of references from the news feed cache, the system hydrates them by fetching the full content for each id from the post cache (falling back to the post DB on a miss).

def get_feed(user_id, limit):
    refs = news_feed_cache.get(user_id, limit)   # ⟨post id, user id⟩ list
    if hybrid_enabled(user_id):
        # hybrid read: merge in pulled celebrity posts, keep the newest `limit`
        refs = merge(refs, pull_celebrity_posts(user_id))[:limit]
    posts = []
    for ref in refs:
        post = post_cache.get(ref.post_id) or post_db.get(ref.post_id)  # hydrate full content
        if post is not None:                     # skip posts deleted since fanout
            posts.append(post)
    return posts                                 # ordered, ready to render
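The consistency benefit of storing references can be checked with a small runnable version of the hydration step. Everything here is a hypothetical in-memory stand-in, and invalidating the post cache on edit or delete is an assumption of this sketch; the text above only says that the canonical copy changes.

```python
post_db = {
    "p1": {"author": "bob", "text": "first draft"},
    "p2": {"author": "carol", "text": "hello"},
}
post_cache = {}                                   # starts cold
feed_cache = {"alice": ["p2", "p1"], "dan": ["p1"]}   # references only

def hydrate(post_id):
    """Return the full post from the cache, falling back to the DB on a miss."""
    post = post_cache.get(post_id)
    if post is None:
        post = post_db.get(post_id)
        if post is not None:
            post_cache[post_id] = post            # warm the cache for next time
    return post

def get_feed(user_id, limit):
    refs = feed_cache.get(user_id, [])[:limit]
    # Posts deleted since fanout hydrate to None and are skipped.
    return [p for p in (hydrate(r) for r in refs) if p is not None]

def edit_post(post_id, text):
    post_db[post_id] = {**post_db[post_id], "text": text}
    post_cache.pop(post_id, None)                 # assumed: invalidate on edit

def delete_post(post_id):
    post_db.pop(post_id, None)
    post_cache.pop(post_id, None)                 # assumed: invalidate on delete

before = [p["text"] for p in get_feed("alice", 10)]
edit_post("p1", "final version")
alice_after_edit = [p["text"] for p in get_feed("alice", 10)]
dan_after_edit = [p["text"] for p in get_feed("dan", 10)]
delete_post("p2")
alice_after_delete = [p["text"] for p in get_feed("alice", 10)]
```

One edit to `p1` shows up in both feeds without touching either feed list, and the deleted `p2` simply drops out of the next read.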

6. The Celebrity Problem

The hotkey or celebrity problem is the failure mode that breaks a pure push design. In fanout on write, the cost of a post is proportional to the author's follower count. For most accounts that is a non-issue. But an account with tens of millions of followers turns one post into tens of millions of cache writes, all triggered by a single action. Worse, when several such accounts post around the same time — say during a live event — the fanout system is flooded, the message queue backs up, and ordinary users' posts get stuck behind the celebrity write storm. Delivery latency spikes for everyone.

The hybrid model addresses this directly by not pushing celebrity posts at all. An account whose follower count crosses a threshold is flagged, and its posts are excluded from the fanout path. Instead, those posts are fetched at read time: when a user opens their feed, the system reads their precomputed push feed and then pulls the latest posts from the small set of celebrities that particular user follows, merging the two into the final ordered list.
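The read-time merge described above might look like the following sketch. It assumes timestamp ordering and newest-first lists on both sides; `push_feed`, `recent_posts`, and `hybrid_feed` are hypothetical names, and a real feed would likely rank by relevance as well.

```python
import heapq
from itertools import islice

# Precomputed push feed for one reader: (timestamp, post_id), newest first.
push_feed = [(110, "friend-3"), (104, "friend-2"), (101, "friend-1")]

# Recent posts per celebrity account, also newest first (never fanned out).
recent_posts = {
    "star_a": [(108, "a-2"), (90, "a-1")],
    "star_b": [(112, "b-1")],
}

def hybrid_feed(push_feed, celebrities_followed, limit):
    """Merge the cached push feed with live pulls from followed celebrities."""
    pulled = [recent_posts.get(c, []) for c in celebrities_followed]
    # All inputs are sorted newest first, so reverse=True keeps that order.
    merged = heapq.merge(push_feed, *pulled, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

feed = hybrid_feed(push_feed, ["star_a", "star_b"], limit=4)
```

The pull touches only the two celebrity lists this reader follows, so its cost does not depend on how many followers those accounts have.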

This works because the numbers fall on the right side of each tradeoff. There are very few celebrity accounts, so the extra pull at read time is bounded — a user follows only a handful of them, not millions. And those few accounts are precisely the ones whose push would be ruinously expensive, so removing them from fanout eliminates the write storm. Ordinary accounts, which are cheap to push and benefit most from instant reads, keep the push path. Each account ends up on the strategy that fits it, which is the whole point of the hybrid approach.

The threshold itself is a tuning knob, not a fixed law. Set it too low and too many accounts are pulled, making reads heavier than they need to be; set it too high and you risk write storms from accounts just under the line. It is chosen — and adjusted — by measuring real fanout cost and read latency.
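One way to reason about the knob is to sweep candidate thresholds over a sample of follower counts and look at what each choice costs. The sample, the thresholds, and the `evaluate_threshold` helper are all made up for illustration; a real evaluation would use measured fanout cost and read latency, as the text says.

```python
def evaluate_threshold(follower_counts, threshold):
    """Summarize the push/pull split if every account posts once."""
    pushed = [n for n in follower_counts if n < threshold]
    pulled = [n for n in follower_counts if n >= threshold]
    return {
        "threshold": threshold,
        "feed_writes": sum(pushed),                    # total fanout writes
        "largest_single_fanout": max(pushed, default=0),
        "pulled_accounts": len(pulled),                # extra work on reads
    }

# Assumed sample of follower counts, one entry per account.
sample = [50, 200, 800, 5_000, 40_000, 2_000_000, 30_000_000]

results = [evaluate_threshold(sample, t) for t in (1_000, 100_000, 10_000_000)]
for row in results:
    print(row)
```

Raising the threshold shrinks the set of pulled accounts but lets larger single fanouts through, which is the tradeoff described above.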

7. Summary

A news feed design is a series of decisions about moving work between the write path and the read path:

| Concern | Approach |
| --- | --- |
| What are the two operations? | Feed publishing (an author posts) and feed building/retrieval (a reader opens their feed): rare heavy-distribution work versus constant read work. |
| How does a post become durable? | The Post Service stores it in the Post DB and warms the Post cache before any distribution happens. |
| How does a post reach followers? | Fanout: push (write into every follower's feed at post time), pull (build the feed at read time), or a hybrid of both. |
| Which fanout model to choose? | Hybrid: push for ordinary users (fast reads), pull for celebrities (no write storm). |
| How is fanout kept off the publish path? | The Fanout Service reads the graph, enriches with user data, and queues work for fanout workers that write into the news feed cache asynchronously. |
| What lives in the feed cache? | ⟨post id, user id⟩ references, not full posts; content is hydrated from the post cache/DB at read time. |
| How is the celebrity hotkey solved? | Skip fanout for very-high-follower accounts and pull their posts in at read time, merging them into the precomputed feed. |

The recurring theme: there is no single right place to do the work. Publishing and reading pull in opposite directions, and a good feed lets each account, ordinary or celebrity, use the strategy that keeps both the write storm and the read latency under control.