Understanding Database Sharding
What is Sharding?
Database sharding is a method of splitting a single logical dataset and distributing it across multiple physical databases, or "shards". It is a form of database partitioning (specifically, horizontal partitioning) that lets a system scale beyond the computing power or storage capacity of a single server.
In a distributed environment, sharding provides horizontal scalability. It can also improve performance: a query for a single key touches only one shard, and a query that spans shards can run on all of them in parallel.
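As a rough illustration of the idea above, the sketch below routes each key to one of several in-memory "shards" and runs a cross-shard query in parallel. Everything here is an assumption made for the example: the shard count, the plain dictionaries standing in for physical databases, and the helper names `shard_for`, `put`, `get`, and `count_where`.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical shard count
# Each dict stands in for one physical database.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key: str) -> int:
    """Pick a shard index from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def put(key: str, value: dict) -> None:
    shards[shard_for(key)][key] = value

def get(key: str):
    # A single-key lookup touches exactly one shard.
    return shards[shard_for(key)].get(key)

def count_where(predicate) -> int:
    """Scatter-gather: apply the same filter to every shard in parallel, then sum."""
    def count_one(shard: dict) -> int:
        return sum(1 for value in shard.values() if predicate(value))
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        return sum(pool.map(count_one, shards))

for i in range(1000):
    put(f"user:{i}", {"id": i, "age": 18 + i % 50})

print(get("user:42"))
print(count_where(lambda user: user["age"] >= 60))
```

A real deployment would replace the dictionaries with database connections, but the routing and scatter-gather shape would be similar.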
Vertical Scaling (Scale Up)
Adding more power (CPU, RAM, SSD) to an existing machine.
- Simple to implement initially.
- Hardware limits (ceiling).
- Single point of failure.
- Expensive at the high end.
Horizontal Scaling (Scale Out)
Adding more machines to the pool of resources.
- Theoretically unlimited scale.
- Complex data distribution logic.
- High availability through redundancy.
- Cost-effective with commodity hardware.
Why Consistent Hashing?
In traditional modulo-based hashing (key % N), changing the number of servers (N) remaps almost all keys: going from N to N+1 servers moves roughly N/(N+1) of them. The result is massive data movement and, in many cases, downtime.
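The remapping cost can be checked with a small experiment. This is a minimal sketch: the key names, the SHA-256-based `stable_hash` helper, and the `moved_fraction` function are assumptions made for illustration. Growing from 4 to 5 servers should move roughly 80% of the keys.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Hash that is stable across runs (Python's built-in hash() is salted for strings)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose server changes when the server count goes from old_n to new_n."""
    moved = sum(1 for k in keys if stable_hash(k) % old_n != stable_hash(k) % new_n)
    return moved / len(keys)

keys = [f"user:{i}" for i in range(10_000)]
print(f"{moved_fraction(keys, 4, 5):.1%} of keys change servers going from 4 to 5")
```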
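Consistent hashing solves this by mapping both keys and nodes onto a common "ring" (drawn here as 0-360°). Each key is assigned to the nearest node in one direction around the ring, conventionally clockwise. When a node is added or removed, only the keys between that node and its neighbor on the ring need to move; every other key stays where it is. This property is critical for systems that must scale dynamically with little or no downtime.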
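A minimal sketch of such a ring follows, under several assumptions that are not part of the text above: positions are 64-bit integers rather than degrees, the class name `HashRing` and the node names are hypothetical, and each node is placed at several points on the ring ("virtual nodes", a common refinement that evens out the load).

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer instead of 0-360 degrees)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes    # points per node (assumption: 100)
        self._positions = []    # sorted ring positions
        self._owner = {}        # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._positions:
            raise ValueError("ring has no nodes")
        # First node position clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]

keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["db-a", "db-b", "db-c", "db-d"])
before = {k: ring.get_node(k) for k in keys}

ring.add_node("db-e")
after = {k: ring.get_node(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.1%} of keys moved")
print("all moved keys went to the new node:", all(after[k] == "db-e" for k in moved))
```

With five nodes, about a fifth of the keys would be expected to move, and every one of them lands on the new node. Under modulo hashing, most keys would move for the same change.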

