Caching
💡 Core Concepts of Caching
Caching is a strategy to store copies of data in a faster, typically smaller, storage layer closer to the consumer (CPU, client, or server process).
- Goal: The primary goals are to lower latency (faster access) and increase throughput (handle more requests per second). It can also decrease network cost by reducing data sent over the network.
- Trade-off: Caching often involves a trade-off where you sacrifice some consistency for speed, as the cached data might be slightly out-of-date or “stale.”
- Examples:
- Single Computer: CPU Cache (fastest, smallest) $\rightarrow$ RAM $\rightarrow$ Disk (slowest, largest).
- Distributed Systems: In-memory store (like Redis) $\rightarrow$ Database (Disk storage).
💻 Client-Side Caching (Browser Caching)
The web browser uses caching to store static content (like JavaScript, CSS, or HTML files) on the user’s local disk, since reading from local disk is still much faster than making a network request.
- Static Content: The data must be content that is not expected to change frequently.
- Mechanism: The server uses the Cache-Control HTTP header (e.g., max-age=3600) in the response to instruct the browser on how long (in seconds) it can cache the resource.
- Performance: Reading from a local disk cache typically takes on the order of milliseconds, whereas a full network request can take 0.1 seconds or more.
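As a rough illustration, here is a minimal Python sketch of a server attaching a Cache-Control header so the browser may reuse the response for one hour. The port, content type, and CSS body are made up for the example.

```python
# Minimal sketch: an HTTP server that tells the browser it may cache
# this static response for 3600 seconds via the Cache-Control header.
from http.server import BaseHTTPRequestHandler, HTTPServer

class StaticHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"body { color: black; }"   # stand-in for a static CSS file
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        # The browser may reuse this response for one hour without re-fetching.
        self.send_header("Cache-Control", "max-age=3600")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), StaticHandler).serve_forever()
```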
🎯 Cache Hits and Misses
This terminology is used across all levels of caching to describe the outcome of a request to the cache:
- Cache Hit: The requested data is found in the cache and is not expired (stale). The request is served quickly from the cache.
- Cache Miss: The requested data is not found in the cache, or it is found but has expired (is stale). A request must then be made to the slower, primary data store (e.g., disk or origin server).
- Cache Hit Ratio: The measure of caching efficiency. $$\text{Cache Hit Ratio} = \frac{\text{Total Hits}}{\text{Total Hits} + \text{Total Misses}}$$ The goal is to maximize this ratio.
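A toy sketch of how hits, misses, and the ratio above can be tracked. The class and function names are illustrative, and a plain dict stands in for the cache.

```python
# Dict-backed cache that counts hits and misses so the hit ratio can be computed.
class CountingCache:
    def __init__(self):
        self.store, self.hits, self.misses = {}, 0, 0

    def get(self, key, load_from_origin):
        if key in self.store:
            self.hits += 1                  # cache hit: served from memory
            return self.store[key]
        self.misses += 1                    # cache miss: fall back to the origin
        value = load_from_origin(key)
        self.store[key] = value
        return value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```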
💾 Server-Side Caching Strategies
In a server environment (like a Twitter backend), caching typically uses a fast, in-memory Key-Value store (e.g., Redis) in front of a slower, persistent database (Disk). The choice of strategy depends on the application’s tolerance for latency and consistency.
1. Write Around
- Write Operation: Skips the cache and writes directly to the primary storage (disk).
- Read Operation: Cache Miss on the first read $\rightarrow$ Read from disk $\rightarrow$ Store in cache $\rightarrow$ Return to user.
- Use Case: Only puts data in the cache if it’s actually being read. Useful when a small subset of data is popular, like popular tweets (avoiding caching millions of never-read tweets).
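A hedged sketch of write-around, using two plain dicts as stand-ins for the in-memory cache and the disk-backed database (all names are illustrative):

```python
cache, db = {}, {}

def write_around(key, value):
    db[key] = value            # write skips the cache entirely

def read(key):
    if key in cache:
        return cache[key]      # cache hit
    value = db[key]            # cache miss: go to the slower store
    cache[key] = value         # data enters the cache only because it was read
    return value
```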
2. Write Through
- Write Operation: Writes to the cache $\rightarrow$ Then synchronously writes to the primary storage.
- Read Operation: Data that has been written is always present and fresh in the cache, so reads of it are Cache Hits.
- Use Case: Simple, ensures data is always consistent and in the cache immediately, but the write latency is higher because it waits for the slower primary storage write.
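The same dict stand-ins, sketched for write-through (illustrative names, not a real Redis or database client):

```python
cache, db = {}, {}

def write_through(key, value):
    cache[key] = value         # update the cache first
    db[key] = value            # then synchronously persist to primary storage;
                               # the caller waits for both writes to finish

def read(key):
    return cache.get(key, db.get(key))   # recently written data is always a cache hit
```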
3. Write Back
- Write Operation: Writes only to the cache $\rightarrow$ Responds to the user immediately (very fast). The write to primary storage is done lazily/periodically (asynchronously).
- Inconsistency/Risk: If the server crashes before the data is dumped to disk, the data is permanently lost.
- Use Case: Used when very low write latency is critical and a small, tolerable amount of data loss is acceptable (e.g., “liking” a tweet might be OK to lose, but the tweet content itself is not).
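A sketch of write-back with the same stand-ins. The dirty set and flush() function are illustrative; anything not yet flushed is exactly the data at risk described above.

```python
cache, db, dirty = {}, {}, set()

def write_back(key, value):
    cache[key] = value         # fast: respond to the user immediately
    dirty.add(key)             # remember that primary storage is now out of date

def flush():
    for key in dirty:          # run lazily / periodically in the background
        db[key] = cache[key]
    dirty.clear()              # if the process crashes before flush(), dirty data is lost
```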
🔄 Cache Eviction Policies
When the cache reaches its size limit, an eviction policy determines which existing piece of data to remove to make room for a new entry. The goal is to maximize the Cache Hit Ratio.
1. FIFO (First In, First Out)
- Policy: Evicts the item that was first added to the cache.
- Downside: Doesn’t consider how frequently or recently an item has been used. A very popular, old item could be evicted.
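A toy FIFO cache, assuming a fixed capacity and using Python’s OrderedDict to remember insertion order (names are illustrative):

```python
from collections import OrderedDict

class FIFOCache:
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            self.store.popitem(last=False)   # evict whatever was inserted first
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)           # reads do not affect eviction order
```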
2. LRU (Least Recently Used)
- Policy: Evicts the item that was least recently used (read or written).
- Mechanism: When an item is accessed, it is moved to the “most recently used” end of the cache. Items at the “least recently used” end are evicted first.
- Use Case: Excellent for data where popularity is temporal (e.g., Twitter, where old tweets, even if once popular, eventually stop being read).
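A toy LRU cache along the same lines; OrderedDict keeps the access order, and the least recently used item is evicted first:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None                       # cache miss
        self.store.move_to_end(key)           # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        elif len(self.store) >= self.capacity:
            self.store.popitem(last=False)    # evict the least recently used item
        self.store[key] = value
```

For example, with capacity 2: put a, put b, get a, put c evicts b, because a was used more recently.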
3. LFU (Least Frequently Used)
- Policy: Evicts the item that has been used the least number of times overall.
- Mechanism: Each item has a use count. The item with the lowest count is evicted first.
- Downside: An item that was extremely popular in the past but is no longer being read will stay in the cache indefinitely because its total usage count is high.
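A toy LFU cache keeping a per-key use count and evicting the smallest count. A real implementation would typically use frequency buckets instead of a linear scan; this version just keeps the sketch short.

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity, self.store, self.counts = capacity, {}, {}

    def get(self, key):
        if key not in self.store:
            return None                       # cache miss
        self.counts[key] += 1                 # every read bumps the use count
        return self.store[key]

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # lowest total count
            del self.store[victim]
            del self.counts[victim]
        self.store[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```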