Consistent Hashing
💡 Basic Hashing for Load Balancing
Basic hashing (like using the modulus operator) is a simple method for load balancing that offers an advantage over plain Round Robin in specific scenarios.
The Mechanism
- Take a user identifier (e.g., IP address).
- Hash it, typically with the modulus function (e.g., $\text{IP address} \bmod N$, where $N$ is the number of servers).
- The remainder determines which server a request is routed to (e.g., if $N=3$, the remainder is $0, 1,$ or $2$, mapping to Server 0, 1, or 2).
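A rough sketch of the mechanism just described (Python; the server names and the choice of MD5 are illustrative assumptions, not from the original):

```python
import hashlib

SERVERS = ["server-0", "server-1", "server-2"]  # N = 3

def route(ip: str) -> str:
    """Hash-then-modulo: map an IP address to one of N servers."""
    # Hash the IP so a string identifier becomes a large integer.
    h = int(hashlib.md5(ip.encode()).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]

# The same IP always maps to the same server:
assert route("203.0.113.7") == route("203.0.113.7")
```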
The Key Benefit: Consistency
The primary benefit of this approach is request consistency: a user with the same IP address will always be routed to the same server.
The Caching Scenario
This consistency is critical when servers are not stateless and maintain a local, individual cache (like a small Redis cache).
- If a user is consistently routed to Server A, Server A can effectively cache the user’s data.
- In Round Robin, the user might hit Server A, B, and C sequentially. Each server would have to re-fetch or re-process the data because the cached information is on a different server, leading to poor cache utilization.
📉 The Flaw in Basic Hashing (The Reshuffling Problem)
The basic modulo hashing approach fails disastrously when the number of servers ($N$) changes (i.e., a server is added or removed).
The Problem
When a server is removed (e.g., $N$ goes from 3 to 2), the divisor in the modulus function changes.
- Original: $\text{IP} \bmod 3$
- New: $\text{IP} \bmod 2$
Because the divisor changes, the resulting remainder changes for nearly all existing user IP addresses. This causes a massive remapping or reshuffling of users to different servers.
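A quick experiment makes the scale of the reshuffle concrete (a sketch; the generated IPs are arbitrary):

```python
import hashlib

def route(ip: str, n: int) -> int:
    """Server index for an IP under hash-then-modulo with n servers."""
    return int(hashlib.md5(ip.encode()).hexdigest(), 16) % n

ips = [f"10.0.{i // 256}.{i % 256}" for i in range(10_000)]
moved = sum(route(ip, 3) != route(ip, 2) for ip in ips)
# With a uniform hash, about 2/3 of users change servers when N goes
# from 3 to 2 -- far more than the 1/3 that actually lost their server.
print(f"{moved / len(ips):.0%} of users were remapped")
```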
The Resulting Inefficiency
- Cache Invalidation: Users are now routed to a server they have never used before. All the data cached for them on their previous server is useless. The new server must re-process the work, leading to a massive, temporary drop in performance and cache hit ratio.
- Goal Failure: The goal of maintaining consistency to leverage caching is completely undermined.
⭕ Consistent Hashing: The Solution
Consistent Hashing is designed to minimize the number of keys (users) that need to be remapped when a server (node) is added or removed.
The Mechanism (High-Level)
- The Ring: Instead of direct modulus calculation, both the servers (nodes) and the user requests (keys) are mapped onto a conceptual circular space or hash ring (e.g., from $0^\circ$ to $360^\circ$).
- Server Placement: Servers are placed onto the ring by hashing a server identifier with an industry-grade hash function (not simple modulo $N$).
- Key Mapping: A user’s key (e.g., IP address) is also mapped onto a position on this ring using the same hash function.
- Assignment Rule: A key is assigned to the first server/node found by moving clockwise from the key’s position on the ring.
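A minimal ring can be sketched with a sorted list plus binary search (Python; the `HashRing` class and its method names are my own illustration):

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    """Place any string on the ring; MD5 here is illustrative, not prescriptive."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes=()):
        self._points = []   # sorted ring positions of all nodes
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        pos = ring_hash(node)
        bisect.insort(self._points, pos)
        self._owners[pos] = node

    def remove(self, node: str) -> None:
        pos = ring_hash(node)
        self._points.remove(pos)
        del self._owners[pos]

    def lookup(self, key: str) -> str:
        """Assignment rule: first node clockwise from the key (wrapping around)."""
        i = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owners[self._points[i]]
```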
The Benefit of Node Changes
When a server is removed (e.g., Server 2):
- Stable Keys: Users originally mapped to Servers 0 and 1 stay mapped to Servers 0 and 1, because the first server clockwise from each of their positions has not changed. The local caches on Servers 0 and 1 are preserved.
- Minimal Reshuffling: Only the keys (users) that were previously mapped to the removed server are affected. They are simply re-assigned to the next server clockwise (in this case, Server 0).
- Minimal Remapping: The total number of affected keys is limited to the portion of the ring managed by the removed server (ideally $\approx 1/N$), as the sketch below demonstrates.
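Continuing the `HashRing` sketch above, removing a node moves only the keys that node owned (node and key names are again illustrative):

```python
ring = HashRing(["server-0", "server-1", "server-2"])
keys = [f"user-{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}

ring.remove("server-2")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Every moved key previously lived on the removed node; nothing else shifted.
assert all(before[k] == "server-2" for k in moved)
# Ideally the moved fraction is about 1/N, though a single point per server
# can own a skewed arc of the ring -- see Virtual Nodes below.
print(f"{len(moved) / len(keys):.0%} of keys remapped")
```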
Load Distribution and Optimization
- Initial Imbalance: A simple removal might cause the adjacent clockwise server to handle a disproportionate amount of traffic (e.g., 2/3 of the load).
- Virtual Nodes (V-Nodes): To solve this, servers often use “Virtual Nodes”, where each physical server is represented by multiple points on the ring. This helps distribute the keys more uniformly and ensures that when a server is removed, its load is split relatively evenly among all the remaining nodes.
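A sketch of that refinement, extending the hypothetical `HashRing` above: each physical server contributes many points on the ring (the replica count is a tunable assumption).

```python
class VNodeRing(HashRing):
    REPLICAS = 100  # virtual nodes per physical server

    def add(self, node: str) -> None:
        for i in range(self.REPLICAS):
            pos = ring_hash(f"{node}#vnode-{i}")
            bisect.insort(self._points, pos)
            self._owners[pos] = node

    def remove(self, node: str) -> None:
        for i in range(self.REPLICAS):
            pos = ring_hash(f"{node}#vnode-{i}")
            self._points.remove(pos)
            del self._owners[pos]
```

With enough replicas, a removed server's arcs are interleaved with everyone else's, so its keys are split roughly evenly among the remaining servers instead of being dumped on a single clockwise neighbor.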
🌐 Applications of Consistent Hashing
While complex, Consistent Hashing is a powerful tool used when maintaining consistency is crucial, even amidst changing infrastructure.
| Component | Why Consistent Hashing Is Used | Typical Hash Key |
|---|---|---|
| Load Balancing | To preserve per-server caching and session data for individual users. | User IP Address or Session ID |
| CDNs (Content Delivery Networks) | To ensure a user’s request for a piece of content (a file) consistently hits the same nearby edge server to maximize local caching. | Content ID or User Geolocation |
| Distributed Databases (Sharding) | To determine which physical database node (shard) holds a specific user’s data. If the number of shards changes, we want to minimize the data movement across the entire cluster. | User ID or Primary Key |
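For the sharding row in particular, the lookup is the same ring operation sketched earlier (shard names hypothetical):

```python
# Which shard holds this user's rows? Same ring, different key.
shards = HashRing(["db-shard-0", "db-shard-1", "db-shard-2"])
print(shards.lookup("user:12345"))  # e.g. "db-shard-1"
```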
🔑 Summary of Components
The key elements defining a Consistent Hashing system are:
- Hash Key: The unique identifier being mapped (e.g., IP address, User ID, Content ID).
- Hash Function: An industry-grade algorithm (like SHA-1) used to map both the keys and the nodes onto the ring.
- Nodes: The servers, database shards, or CDN caches that are receiving the requests. Their number is expected to change over time.
This is a good question. Although the two phrasings describe similar outcomes, they represent two different ways of stating the advantage of Consistent Hashing, each with its own emphasis.
🔍 Distinguishing the Concepts
| Phrasing | Emphasis and Explanation |
|---|---|
| Minimal Reshuffling | Focuses on the behavior of the users (keys). It says that when a node (server) changes, only the keys that were assigned to the removed node need to be reassigned; every other key stays put and keeps hitting its original server. |
| Minimal Remapping | Focuses on quantifying the change. It gives an order-of-magnitude estimate of what fraction of the whole is affected: the total number of affected keys is roughly $1/N$ ($N$ being the number of nodes). |
🎯 Why Two Phrasings?
- Minimal Reshuffling:
  - This phrasing emphasizes how the consistent hashing mechanism works. It tells you which group is affected: only the first node clockwise from the removed node takes over that traffic.
  - It is a qualitative description, stressing that "only" this one group of keys is affected; it is the foundation for understanding how consistent hashing works.
- Minimal Remapping:
  - This phrasing emphasizes the performance advantage of consistent hashing. It tells you the scale of the impact.
  - It is a quantitative description: under an ideally uniform distribution, removing one node affects roughly $1/N$ of all keys.
Summary
They describe two sides of the same thing:
- Minimal Reshuffling tells you who has to move.
- Minimal Remapping tells you how much has to move ($1/N$).
An analogy:
- Minimal Reshuffling is like saying: "When classrooms are reassigned, only the students of Class 3-2 have to move."
- Minimal Remapping is like saying: "In this reassignment, only **5%** of all students have to move."
In system design interviews, the core advantage of consistent hashing is usually summarized as "only $K/N$ keys are remapped" (i.e., on the order of $1/N$).
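A back-of-the-envelope reading of that $K/N$ figure (the numbers are purely illustrative): with $K = 9000$ keys and $N = 3$ nodes, removing one node remaps about $9000 / 3 = 3000$ keys, whereas modulo hashing would reshuffle most of the 9000.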