Caching
💡 Core Concepts of Caching
Caching is a strategy to store copies of data in a faster, typically smaller, storage layer closer to the consumer (CPU, client, or server process).
- Goal: The primary goals are to lower latency (faster access) and increase throughput (handle more requests per second). It can also decrease network cost by reducing data sent over the network.
- Trade-off: Caching often involves a trade-off where you sacrifice some consistency for speed, as the cached data might be slightly out-of-date or “stale.”
- Examples:
- Single Computer: CPU Cache (fastest, smallest) $\rightarrow$ RAM $\rightarrow$ Disk (slowest, largest).
- Distributed Systems: In-memory store (like Redis) $\rightarrow$ Database (Disk storage).
💻 Client-Side Caching (Browser Caching)
The web browser uses caching to store static content (like JavaScript, CSS, or HTML files) on the user’s local disk, since reading from local disk is still much faster than making a network request.
- Static Content: The data must be content that is not expected to change frequently.
- Mechanism: The server uses the Cache-Control HTTP header (e.g., max-age=3600) in the response to instruct the browser on how long (in seconds) it can cache the resource.
- Performance: Reading from a local disk cache typically takes on the order of milliseconds, whereas a full network request can take 0.1 seconds or more.
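As a rough illustration, here is a minimal Python sketch of a server attaching a Cache-Control header so the browser may reuse the response for one hour. The port, content type, and CSS body are made up for the example.

```python
# Minimal sketch: an HTTP server that tells the browser it may cache
# this static response for 3600 seconds via the Cache-Control header.
from http.server import BaseHTTPRequestHandler, HTTPServer

class StaticHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"body { color: black; }"   # stand-in for a static CSS file
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        # The browser may reuse this response for one hour without re-fetching.
        self.send_header("Cache-Control", "max-age=3600")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), StaticHandler).serve_forever()
```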
🎯 Cache Hits and Misses
This terminology is used across all levels of caching to describe the outcome of a request to the cache:
- Cache Hit: The requested data is found in the cache and is not expired (stale). The request is served quickly from the cache.
- Cache Miss: The requested data is not found in the cache, or it is found but has expired (is stale). A request must then be made to the slower, primary data store (e.g., disk or origin server).
- Cache Hit Ratio: The measure of caching efficiency. $$\text{Cache Hit Ratio} = \frac{\text{Total Hits}}{\text{Total Hits} + \text{Total Misses}}$$ The goal is to maximize this ratio.
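A toy sketch of how hits, misses, and the ratio above can be tracked. The class and function names are illustrative, and a plain dict stands in for the cache.

```python
# Dict-backed cache that counts hits and misses so the hit ratio can be computed.
class CountingCache:
    def __init__(self):
        self.store, self.hits, self.misses = {}, 0, 0

    def get(self, key, load_from_origin):
        if key in self.store:
            self.hits += 1                  # cache hit: served from memory
            return self.store[key]
        self.misses += 1                    # cache miss: fall back to the origin
        value = load_from_origin(key)
        self.store[key] = value
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```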
💾 Server-Side Caching Strategies
In a server environment (like a Twitter backend), caching typically uses a fast, in-memory Key-Value store (e.g., Redis) in front of a slower, persistent database (Disk). The choice of strategy depends on the application’s tolerance for latency and consistency.
1. Write Around
- Write Operation: Skips the cache and writes directly to the primary storage (disk).
- Read Operation: Cache Miss on the first read $\rightarrow$ Read from disk $\rightarrow$ Store in cache $\rightarrow$ Return to user.
- Use Case: Only puts data in the cache if it’s actually being read. Useful when a small subset of data is popular, like popular tweets (avoiding caching millions of never-read tweets).
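A hedged sketch of write-around, using two plain dicts as stand-ins for the in-memory cache and the disk-backed database (all names are illustrative):

```python
cache, db = {}, {}

def write_around(key, value):
    db[key] = value            # write skips the cache entirely

def read(key):
    if key in cache:
        return cache[key]      # cache hit
    value = db[key]            # cache miss: go to the slower store
    cache[key] = value         # data enters the cache only because it was read
    return value
```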
2. Write Through
- Write Operation: Writes to the cache $\rightarrow$ Then synchronously writes to the primary storage.
- Read Operation: Data that has been written is always present and fresh in the cache, so reads of it are Cache Hits.
- Use Case: Simple, ensures data is always consistent and in the cache immediately, but the write latency is higher because it waits for the slower primary storage write.
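The same dict stand-ins, sketched for write-through (illustrative names, not a real Redis or database client):

```python
cache, db = {}, {}

def write_through(key, value):
    cache[key] = value         # update the cache first
    db[key] = value            # then synchronously persist to primary storage;
                               # the caller waits for both writes to finish

def read(key):
    return cache.get(key, db.get(key))   # recently written data is always a cache hit
```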
3. Write Back
- Write Operation: Writes only to the cache $\rightarrow$ Responds to the user immediately (very fast). The write to primary storage is done lazily/periodically (asynchronously).
- Inconsistency/Risk: If the server crashes before the data is dumped to disk, the data is permanently lost.
- Use Case: Used when very low write latency is critical and a small, tolerable amount of data loss is acceptable (e.g., “liking” a tweet might be OK to lose, but the tweet content itself is not).
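A sketch of write-back with the same stand-ins. The dirty set and flush() function are illustrative; anything not yet flushed is exactly the data at risk described above.

```python
cache, db, dirty = {}, {}, set()

def write_back(key, value):
    cache[key] = value         # fast: respond to the user immediately
    dirty.add(key)             # remember that primary storage is now out of date

def flush():
    for key in dirty:          # run lazily / periodically in the background
        db[key] = cache[key]
    dirty.clear()              # if the process crashes before flush(), dirty data is lost
```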
🔄 Cache Eviction Policies
When the cache reaches its size limit, an eviction policy determines which existing piece of data to remove to make room for a new entry. The goal is to maximize the Cache Hit Ratio.
1. FIFO (First In, First Out)
- Policy: Evicts the item that was first added to the cache.
- Downside: Doesn’t consider how frequently or recently an item has been used. A very popular, old item could be evicted.
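A toy FIFO cache, assuming a fixed capacity and using Python’s OrderedDict to remember insertion order (names are illustrative):

```python
from collections import OrderedDict

class FIFOCache:
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            self.store.popitem(last=False)   # evict whatever was inserted first
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)           # reads do not affect eviction order
```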
2. LRU (Least Recently Used)
- Policy: Evicts the item that was least recently used (read or written).
- Mechanism: When an item is accessed, it is moved to the “most recently used” end of the cache. Items at the “least recently used” end are evicted first.
- Use Case: Excellent for data where popularity is temporal (e.g., Twitter, where old tweets, even if once popular, eventually stop being read).
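A toy LRU cache along the same lines; OrderedDict keeps the access order, and the least recently used item is evicted first:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None                       # cache miss
        self.store.move_to_end(key)           # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        elif len(self.store) >= self.capacity:
            self.store.popitem(last=False)    # evict the least recently used item
        self.store[key] = value
```

For example, with capacity 2: put a, put b, get a, put c evicts b, because a was used more recently.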
3. LFU (Least Frequently Used)
- Policy: Evicts the item that has been used the least number of times overall.
- Mechanism: Each item has a use count. The item with the lowest count is evicted first.
- Downside: An item that was extremely popular in the past but is no longer being read will stay in the cache indefinitely because its total usage count is high.
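A toy LFU cache keeping a per-key use count and evicting the smallest count. A real implementation would typically use frequency buckets instead of a linear scan; this version just keeps the sketch short.

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity, self.store, self.counts = capacity, {}, {}

    def get(self, key):
        if key not in self.store:
            return None                       # cache miss
        self.counts[key] += 1                 # every read bumps the use count
        return self.store[key]

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # lowest total count
            del self.store[victim]
            del self.counts[victim]
        self.store[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```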