Published on
·5 min read

Cache Made Consistent: Meta's Cache Invalidation Solution

Authors

Original Article

The Problem Everyone Ignores

"There are only two hard things in Computer Science: cache invalidation and naming things." This quote gets laughed at in interviews but represents a real problem at Meta's scale.

Meta runs one of the largest caching infrastructures in the world. TAO (their social graph cache) handles billions of reads per second. When a user updates their profile or posts content, every cached copy of that data must be invalidated. Miss one cache and users see stale data.

The naive approach is to invalidate all caches when data changes. But with thousands of cache servers across multiple regions, invalidation messages can be delayed, lost, or arrive out of order. A cache might serve stale data for minutes or hours.

Consistency Model

Meta wanted a consistency model where cache reads always return data at least as fresh as the most recent write the client has observed. This is called "read-your-writes" consistency.

Formally, if a client performs write WW at time twt_w and later reads at time tr>twt_r > t_w, the read must return either:

  • The result of WW, or
  • The result of a write that happened after WW

This seems obvious but is surprisingly hard to guarantee when:

  • Caches are distributed across continents
  • Database replication is asynchronous
  • Network partitions happen regularly

Version Tracking

Polaris assigns a monotonically increasing version number to each cached object. When data is written to the database, it gets a version vv. Any cache holding version v<vv' < v is stale.

The version is a logical timestamp derived from the database's write-ahead log (WAL) position. Every write to MySQL increments the log sequence number (LSN). Polaris uses this LSN as the version.

vcache<vdatabase    cache is stalev_{\text{cache}} < v_{\text{database}} \implies \text{cache is stale}

Invalidation Pipeline

When a write occurs:

  1. The database records the write and its version vv
  2. An invalidation event is published: (key, vv)
  3. Polaris delivers the event to all cache regions
  4. Each cache server compares vv with its local version
  5. If local version is less than vv, the entry is invalidated

The challenge is ensuring events are delivered reliably. Polaris uses a combination of persistent queues and polling. Caches that miss an invalidation event will eventually discover staleness through periodic version checks.

Handling Out-of-Order Delivery

Network delays can cause invalidation events to arrive out of order. Consider:

  1. Write W1W_1 with version 100 happens
  2. Write W2W_2 with version 101 happens
  3. Cache receives invalidation for version 101
  4. Cache receives invalidation for version 100 (delayed)

If the cache blindly processes events in arrival order, the second event might incorrectly invalidate data that was already updated by the first event.

Polaris solves this by comparing versions before invalidating. An invalidation for version vv only takes effect if the cache's current version is less than vv. The delayed event for version 100 is a no-op because the cache already knows about version 101.

invalidate(k,v)={evict(k)if vcache(k)<vno-opotherwise\text{invalidate}(k, v) = \begin{cases} \text{evict}(k) & \text{if } v_{\text{cache}}(k) < v \\ \text{no-op} & \text{otherwise} \end{cases}

Cross-Region Consistency

Meta's infrastructure spans multiple geographic regions. A user in Europe might write data that is read by a user in Asia. The European write must propagate to Asian caches.

Database replication is asynchronous. The Asian replica might lag behind the European primary by several seconds. If the Asian cache queries its local replica, it might get stale data and cache it.

Polaris tracks replication lag per region. When serving a read, the system checks if the local replica has caught up to the version the client expects. If not, the request is routed to a region with fresher data.

Cache Stampede Prevention

When a popular cache entry is invalidated, many concurrent requests might try to repopulate it simultaneously. Each request queries the database, wasting resources.

Polaris uses a lease mechanism. The first request to find an empty cache entry acquires a lease and is responsible for fetching from the database. Subsequent requests wait for the lease holder to populate the cache.

The lease has a timeout. If the lease holder crashes or is slow, another request can acquire the lease after timeout. This prevents deadlock while minimizing database load.

Operational Lessons

Meta found that most consistency bugs came from edge cases:

  • Cache servers restarting with cold caches
  • Network partitions isolating a cache from invalidation events
  • Clock skew between regions causing version comparisons to fail

Polaris includes monitoring that detects when a cache's version falls too far behind. Alerts fire before users notice staleness. The system can automatically warm cold caches by pre-fetching popular entries.

The paper emphasizes that cache consistency is a systems problem, not an algorithm problem. The algorithms are straightforward. The difficulty is making them work reliably across thousands of servers with varying failure modes.