Caching

💡 Core Concepts of Caching

Caching is a strategy to store copies of data in a faster, typically smaller, storage layer closer to the consumer (CPU, client, or server process).

  • Goal: The primary goals are to lower latency (faster access) and increase throughput (handle more requests per second). It can also decrease network cost by reducing data sent over the network.
  • Trade-off: Caching often involves a trade-off where you sacrifice some consistency for speed, as the cached data might be slightly out-of-date or “stale.”
  • Examples:
    • Single Computer: CPU Cache (fastest, smallest) $\rightarrow$ RAM $\rightarrow$ Disk (slowest, largest).
    • Distributed Systems: In-memory store (like Redis) $\rightarrow$ Database (Disk storage).

💻 Client-Side Caching (Browser Caching)

The web browser uses caching to store static content (like JavaScript, CSS, or HTML files) on the user’s local disk, which is still much faster than a network request.

  • Static Content: Browser caching is only appropriate for content that is not expected to change frequently.
  • Mechanism: The server uses the Cache-Control HTTP header (e.g., max-age=3600) in the response to instruct the browser on how long (in seconds) it can cache the resource.
  • Performance: Reading from a local disk cache typically takes on the order of milliseconds, whereas a full network request can take 0.1 seconds or more.
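
As a rough illustration, here is a minimal sketch using only the Python standard library; the handler name, asset contents, and port are made up for the example, and the key line is the Cache-Control: max-age=3600 header that lets the browser reuse the response for an hour.

```python
# Minimal sketch of a server telling the browser it may cache a static asset.
# The asset contents and port are illustrative only.
from http.server import BaseHTTPRequestHandler, HTTPServer

STATIC_JS = b"console.log('hello');"  # stands in for a real static asset

class StaticHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/javascript")
        # Allow the browser to serve this response from its local cache
        # for up to 3600 seconds before re-requesting it.
        self.send_header("Cache-Control", "max-age=3600")
        self.end_headers()
        self.wfile.write(STATIC_JS)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), StaticHandler).serve_forever()
```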

🎯 Cache Hits and Misses

This terminology is used across all levels of caching to describe the outcome of a request to the cache:

  • Cache Hit: The requested data is found in the cache and is not expired (stale). The request is served quickly from the cache.
  • Cache Miss: The requested data is not found in the cache, or it is found but has expired (is stale). A request must then be made to the slower, primary data store (e.g., disk or origin server).
  • Cache Hit Ratio: The measure of caching efficiency, computed as $$\text{Hit Ratio} = \frac{\text{Total Hits}}{\text{Total Hits} + \text{Total Misses}}$$ The goal is to maximize this ratio (see the sketch below).
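
The bookkeeping behind this formula can be sketched with a tiny wrapper around a dictionary; the CountingCache class and the backing dictionary are illustrative stand-ins, not a real cache library.

```python
# Sketch of the hit/miss bookkeeping behind the hit-ratio formula.
# The backing "database" dict is a stand-in for the slower primary store.
class CountingCache:
    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1          # cache hit: served from memory
            return self.cache[key]
        self.misses += 1            # cache miss: fall back to the primary store
        value = self.backing_store[key]
        self.cache[key] = value
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

db = {"tweet:1": "hello world"}
c = CountingCache(db)
c.get("tweet:1")      # miss
c.get("tweet:1")      # hit
print(c.hit_ratio())  # 0.5
```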

💾 Server-Side Caching Strategies

In a server environment (like a Twitter backend), caching typically uses a fast, in-memory Key-Value store (e.g., Redis) in front of a slower, persistent database (Disk). The choice of strategy depends on the application’s tolerance for latency and consistency.

1. Write Around

  • Write Operation: Skips the cache and writes directly to the primary storage (disk).
  • Read Operation: Cache Miss on the first read $\rightarrow$ Read from disk $\rightarrow$ Store in cache $\rightarrow$ Return to user.
  • Use Case: Data enters the cache only when it is actually read (see the sketch below). This is useful when a small subset of data is popular, such as popular tweets, because it avoids caching millions of never-read tweets.
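
A minimal write-around sketch, with plain Python dicts standing in for the Redis cache and the disk-backed database:

```python
# Write-around sketch: writes bypass the cache; reads populate it.
cache = {}
database = {}

def write_around(key, value):
    database[key] = value          # write goes straight to primary storage

def read(key):
    if key in cache:               # cache hit
        return cache[key]
    value = database[key]          # cache miss: go to the slower store
    cache[key] = value             # populate the cache only on read
    return value

write_around("tweet:42", "cached only if someone reads it")
print(read("tweet:42"))            # first read misses, then fills the cache
```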

2. Write Through

  • Write Operation: Writes to the cache $\rightarrow$ Then synchronously writes to the primary storage.
  • Read Operation: Always finds the freshest data in the cache (Cache Hit).
  • Use Case: Simple; data is always consistent and lands in the cache immediately, but write latency is higher because each write waits for the slower primary-storage write (sketched below).
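
The same toy setup, adjusted for write-through, where every write updates both layers before returning:

```python
# Write-through sketch: every write updates the cache and then synchronously
# updates primary storage before the call returns.
cache = {}
database = {}

def write_through(key, value):
    cache[key] = value             # update the fast cache first
    database[key] = value          # ...then wait for the slow, durable write

def read(key):
    return cache[key]              # anything written this way is a fresh hit

write_through("tweet:7", "always in both places")
print(read("tweet:7"))
```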

3. Write Back

  • Write Operation: Writes only to the cache $\rightarrow$ Responds to the user immediately (very fast). The write to primary storage is done lazily/periodically (asynchronously).
  • Inconsistency/Risk: If the server crashes before the cached data is flushed to disk, that data is permanently lost.
  • Use Case: Used when very low write latency is critical and a small, tolerable amount of data loss is acceptable (e.g., losing a “like” on a tweet might be OK, but losing the tweet content itself is not); see the sketch below.
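
A write-back sketch along the same lines; the dirty set and flush() function are illustrative, and anything still marked dirty when the process dies is lost.

```python
# Write-back sketch: writes touch only the cache and return immediately;
# dirty keys are persisted to primary storage later.
cache = {}
database = {}
dirty = set()

def write_back(key, value):
    cache[key] = value             # fast: in-memory only
    dirty.add(key)                 # remember that disk is now stale

def flush():
    for key in dirty:              # done lazily/periodically in practice
        database[key] = cache[key]
    dirty.clear()

write_back("like_count:9", 1042)   # acknowledged before it is durable
flush()                            # until this runs, a crash loses the write
```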

🔄 Cache Eviction Policies

When the cache reaches its size limit, an eviction policy determines which existing piece of data to remove to make room for a new entry. The goal is to maximize the Cache Hit Ratio.

1. FIFO (First In, First Out)

  • Policy: Evicts the item that was first added to the cache.
  • Downside: Doesn’t consider how frequently or recently an item has been used. A very popular, old item could be evicted.
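
A toy FIFO cache, assuming a small OrderedDict-based implementation; note that reads have no effect on what gets evicted.

```python
# FIFO sketch: when full, evict whichever key was inserted first,
# regardless of how often or how recently it has been read.
from collections import OrderedDict

class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()           # preserves insertion order

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self.data.popitem(last=False)   # evict the oldest insertion
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)           # reads do not affect eviction order

c = FIFOCache(2)
c.put("a", 1); c.put("b", 2)
c.get("a")                                  # popularity is ignored...
c.put("c", 3)                               # ...so "a" is evicted anyway
print(list(c.data))                         # ['b', 'c']
```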

2. LRU (Least Recently Used)

  • Policy: Evicts the item that was least recently used (read or written).
  • Mechanism: When an item is accessed, it is moved to the “most recently used” end of the cache. Items at the “least recently used” end are evicted first.
  • Use Case: Excellent for data where popularity is temporal (e.g., Twitter, where old tweets, even if once popular, eventually stop being read).
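
A minimal LRU sketch built on collections.OrderedDict; in practice you might instead reach for functools.lru_cache or Redis’s allkeys-lru eviction policy, which implements an approximate version of the same idea.

```python
# LRU sketch: every access moves the key to the "most recently used" end;
# eviction removes from the "least recently used" end.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                       # cache miss
        self.data.move_to_end(key)            # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)     # evict the least recently used
        self.data[key] = value

c = LRUCache(2)
c.put("old_tweet", "...")
c.put("new_tweet", "...")
c.get("old_tweet")                            # touching it keeps it alive
c.put("newer_tweet", "...")                   # evicts "new_tweet" instead
print(list(c.data))                           # ['old_tweet', 'newer_tweet']
```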

3. LFU (Least Frequently Used)

  • Policy: Evicts the item that has been used the least number of times overall.
  • Mechanism: Each item has a use count. The item with the lowest count is evicted first.
  • Downside: An item that was extremely popular in the past but is no longer being read will stay in the cache indefinitely because its total usage count is high.
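
A matching LFU sketch; the final lines illustrate the downside described above, with a formerly viral tweet outliving a fresher one. The class and key names are made up for the example.

```python
# LFU sketch: each key carries a use count; the key with the lowest count
# is evicted. A formerly hot key can survive indefinitely.
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.counts = {}

    def get(self, key):
        if key not in self.data:
            return None
        self.counts[key] += 1                 # every access bumps the count
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)
            del self.data[victim]             # evict the least-used key
            del self.counts[victim]
        self.data[key] = value
        self.counts.setdefault(key, 0)

c = LFUCache(2)
c.put("viral_tweet", "...")
for _ in range(1000):
    c.get("viral_tweet")                      # hot in the past
c.put("fresh_tweet", "...")
c.get("fresh_tweet")
c.put("another_tweet", "...")                 # "fresh_tweet" is evicted,
print(list(c.data))                           # the stale viral one remains
```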