Load Balancing Policy¶
The load balancing policy determines which nodes receive requests. This policy directly affects latency, throughput, and cluster load distribution.
How Load Balancing Works¶
For each request, the load balancing policy returns an ordered list of nodes to try, known as the query plan.
The driver sends the request to the first node in the plan. If that attempt fails and the retry policy allows a retry on another node, the driver tries the next node in the list.
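The failover behavior described above can be sketched in a few lines. This is a simplified illustration, not any driver's actual implementation: `execute_with_plan`, `flaky_send`, and the `may_retry` callback are hypothetical names standing in for the driver's request path and retry policy.

```python
from typing import Callable, Iterable, Optional


def execute_with_plan(
    query_plan: Iterable[str],
    send: Callable[[str], str],
    may_retry: Callable[[Exception], bool],
) -> Optional[str]:
    """Try nodes in plan order; stop at the first success or when a retry is refused."""
    for node in query_plan:
        try:
            return send(node)
        except ConnectionError as exc:
            # The (hypothetical) retry policy decides whether to move to the next node.
            if not may_retry(exc):
                raise
    return None  # every node in the plan was tried without success


def flaky_send(node: str) -> str:
    # Hypothetical transport: node "n1" is unreachable, all others respond.
    if node == "n1":
        raise ConnectionError(f"{node} unreachable")
    return f"ok from {node}"


result = execute_with_plan(["n1", "n2", "n3"], flaky_send, lambda exc: True)
# result == "ok from n2": the first node failed, so the second node in the plan was used
```

The order of the plan matters more than its length: in the common case only the first node is ever contacted.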
Token-Aware Routing¶
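Token-aware load balancing sends each request directly to a node that holds a replica of the requested partition. This avoids the extra network hop incurred when a non-replica coordinator has to forward the request to a replica.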
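To make the replica lookup concrete, the sketch below models a token ring and finds the replicas for a given token. It rests on several simplifying assumptions: `toy_token` is a stand-in hash (Cassandra's default partitioner is Murmur3, which is not used here), each node owns a single token, and replicas are simply the next nodes clockwise on the ring, ignoring datacenters and racks. `TokenRing` is a hypothetical class, not a driver API.

```python
import bisect
import hashlib


def toy_token(partition_key: str) -> int:
    # Stand-in hash for illustration only; real clusters use the configured partitioner.
    return int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")


class TokenRing:
    def __init__(self, node_tokens: dict, replication_factor: int):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self.ring = sorted((token, node) for node, token in node_tokens.items())
        self.tokens = [token for token, _ in self.ring]
        self.rf = min(replication_factor, len(self.ring))

    def replicas(self, token: int) -> list:
        # Primary replica: the node with the smallest token >= the key's token,
        # wrapping around to the start of the ring if necessary.
        start = bisect.bisect_left(self.tokens, token) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]


ring = TokenRing({"n1": 0, "n2": 2**62, "n3": 2**63}, replication_factor=2)
ring.replicas(1)                       # ['n2', 'n3']
ring.replicas(2**63 + 1)               # wraps around: ['n1', 'n2']
ring.replicas(toy_token("abc123"))     # replicas for a hashed partition key
```

A token-aware policy places these replicas at the front of the query plan, so the coordinator that receives the request already owns the data.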
Requirements for Token-Aware Routing¶
Token-aware routing requires:
- Partition key known: the driver must be able to extract the partition key from the query.
- Prepared statements: the driver derives the routing key from the values bound to a prepared statement. In most drivers, a simple statement is not routed token-aware unless a routing key is set on it manually.
- Metadata available: the driver must have a current token map.
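import com.datastax.oss.driver.api.core.cql.BoundStatement;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.SimpleStatement;

// Token-aware: the partition key is a bound value
PreparedStatement prepared = session.prepare(
    "SELECT * FROM users WHERE user_id = ?");
BoundStatement bound = prepared.bind(userId); // Driver knows the partition key
session.execute(bound); // Routed to a replica

// NOT token-aware: the partition key is embedded in the query string
SimpleStatement simple = SimpleStatement.newInstance(
    "SELECT * FROM users WHERE user_id = 'abc123'");
session.execute(simple); // Driver cannot extract the partition key; falls back to non-token-aware ordering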
Datacenter Awareness¶
In multi-datacenter deployments, the load balancing policy must be told which datacenter is local to the application so that it can prefer nodes in that datacenter.
Configuration¶
// Java driver
import com.datastax.oss.driver.api.core.CqlSession;

CqlSession session = CqlSession.builder()
    .withLocalDatacenter("dc1")
    .build();
# Python driver
from cassandra.cluster import Cluster
from cassandra.policies import DCAwareRoundRobinPolicy

cluster = Cluster(
    contact_points=['10.0.1.1'],
    load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='dc1')
)
If the local datacenter is not configured correctly, requests may be routed to remote datacenters, which adds significant latency.
Common Load Balancing Policies¶
Round-Robin (Basic)¶
Distributes requests evenly across all nodes, without considering which nodes hold replicas of the requested data.
| Advantage | Disadvantage |
|---|---|
| Simple, predictable | Extra network hop for every request |
| Even distribution | No datacenter awareness |
Use case: Development environments, specific analytics workloads.
Datacenter-Aware Round-Robin¶
Applies round-robin across the nodes of the local datacenter only.
| Advantage | Disadvantage |
|---|---|
| Respects datacenter locality | Still not token-aware |
| Predictable distribution within DC | Extra hop for most requests |
Use case: When token-aware routing is not possible (e.g., many simple statements).
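As a rough illustration of datacenter-aware round-robin, the sketch below rotates the starting node on every call and never includes remote nodes. `ToyDCAwareRoundRobin` is a hypothetical class written for this example; it is not the driver's `DCAwareRoundRobinPolicy` and omits node state tracking entirely.

```python
import itertools


class ToyDCAwareRoundRobin:
    def __init__(self, nodes_by_dc: dict, local_dc: str):
        # Only nodes in the local datacenter are ever considered.
        self.local = list(nodes_by_dc[local_dc])
        self._counter = itertools.count()

    def query_plan(self) -> list:
        # Rotate the starting position so successive requests begin at different nodes.
        start = next(self._counter) % len(self.local)
        return self.local[start:] + self.local[:start]


policy = ToyDCAwareRoundRobin(
    {"dc1": ["a", "b", "c"], "dc2": ["x", "y"]}, local_dc="dc1"
)
plans = [policy.query_plan() for _ in range(4)]
# plans == [['a', 'b', 'c'], ['b', 'c', 'a'], ['c', 'a', 'b'], ['a', 'b', 'c']]
```

Each node is the first choice for one request in three, regardless of which node owns the data. That is why most requests still pay the extra coordinator hop.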
Token-Aware with DC Awareness (Recommended)¶
Combines token-aware routing with datacenter preference:
Algorithm:
1. Calculate replica set for partition key
2. Filter to local datacenter replicas
3. Order by health/load (implementation varies)
4. Append non-replica local nodes as fallback
5. Optionally append remote DC nodes as last resort
| Advantage | Disadvantage |
|---|---|
| Minimum latency (direct to replica) | Requires prepared statements for full benefit |
| Respects datacenter locality | Slightly more complex configuration |
| Built-in fallback ordering | |
This is the default and recommended policy for most production deployments.
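The five steps of the algorithm listed above can be sketched as a single function. This is a minimal sketch under stated assumptions: the replica set is passed in already computed, in-flight request count is used as the load signal for step 3 (the text notes that this varies by implementation), and `build_query_plan` and its parameters are hypothetical names.

```python
def build_query_plan(replicas, nodes_by_dc, local_dc, in_flight, include_remote=False):
    local_nodes = nodes_by_dc[local_dc]
    # Steps 1-2: take the replicas for the partition key and keep the local ones.
    local_replicas = [n for n in replicas if n in local_nodes]
    # Step 3: order replicas by load (assumed signal: in-flight request count).
    local_replicas.sort(key=lambda n: in_flight.get(n, 0))
    # Step 4: append non-replica local nodes as fallback coordinators.
    plan = local_replicas + [n for n in local_nodes if n not in local_replicas]
    # Step 5: optionally append remote datacenter nodes as a last resort.
    if include_remote:
        for dc, nodes in nodes_by_dc.items():
            if dc != local_dc:
                plan.extend(nodes)
    return plan


topology = {"dc1": ["a", "b", "c", "d"], "dc2": ["x", "y"]}
load = {"b": 5, "d": 1}

local_plan = build_query_plan(["b", "d", "x"], topology, "dc1", load)
# local_plan == ['d', 'b', 'a', 'c']
full_plan = build_query_plan(["b", "d", "x"], topology, "dc1", load, include_remote=True)
# full_plan == ['d', 'b', 'a', 'c', 'x', 'y']
```

Note that the remote replica `x` is not promoted ahead of local non-replicas: locality takes precedence over replica ownership once the local replicas are exhausted.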
Rack Awareness¶
Some load balancing policies also consider rack placement when ordering nodes.
Rack awareness provides a marginal latency improvement when:
- Application servers are rack-aligned with Cassandra nodes
- Network topology has rack-level latency differences
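One simple way to express rack preference is to move same-rack replicas to the front of the plan while leaving the rest in place. The helper below is a hypothetical sketch (`prefer_local_rack` and the `rack_of` mapping are illustrative names), not a description of how any specific driver orders nodes.

```python
def prefer_local_rack(replicas, rack_of, local_rack):
    # False sorts before True, so same-rack replicas come first.
    # sorted() is stable, so the original order is kept within each group.
    return sorted(replicas, key=lambda node: rack_of[node] != local_rack)


rack_of = {"a": "rack2", "b": "rack1", "c": "rack1"}
ordered = prefer_local_rack(["a", "b", "c"], rack_of, local_rack="rack1")
# ordered == ['b', 'c', 'a']
```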
Filtering Unhealthy Nodes¶
Load balancing policies typically exclude or deprioritize nodes in the following conditions:
| Condition | Behavior |
|---|---|
| Marked DOWN | Excluded from query plan |
| Recently failed | May be deprioritized (implementation varies) |
| High latency | Some policies track latency and avoid slow nodes |
| Overloaded | Some policies consider in-flight request count |
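The conditions in the table above can be combined into one filtering and ordering pass. The sketch below is illustrative only: `NodeState` and `filter_and_order` are hypothetical, and the choice to rank recent failures ahead of in-flight count is an assumption, since the actual behavior varies by driver and policy.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    up: bool = True
    recently_failed: bool = False
    in_flight: int = 0


def filter_and_order(nodes):
    # Nodes marked DOWN are excluded from the plan entirely.
    candidates = [n for n in nodes if n.up]
    # Recently failed nodes go last; otherwise prefer fewer in-flight requests.
    candidates.sort(key=lambda n: (n.recently_failed, n.in_flight))
    return [n.name for n in candidates]


nodes = [
    NodeState("a", in_flight=3),
    NodeState("b", up=False),
    NodeState("c", recently_failed=True),
    NodeState("d", in_flight=1),
]
plan = filter_and_order(nodes)
# plan == ['d', 'a', 'c']
```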
Latency-Aware Routing¶
Some drivers offer latency-aware policies that track response times and prefer faster nodes.
Considerations:
- Latency tracking adds overhead
- May cause herding (all clients avoid the same node simultaneously, shifting its load onto the remaining nodes)
- Typically combined with token-aware routing rather than replacing it
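A latency-aware layer can be sketched as a wrapper that reorders an existing query plan rather than building its own. The example below keeps an exponentially weighted moving average per node and moves nodes slower than a multiple of the best average to the back of the plan. `LatencyTracker`, `alpha`, and `exclusion_factor` are hypothetical names and values chosen for illustration, not defaults from any driver.

```python
class LatencyTracker:
    def __init__(self, alpha=0.2, exclusion_factor=2.0):
        self.alpha = alpha                        # weight given to the newest sample
        self.exclusion_factor = exclusion_factor  # how much slower than the best is tolerated
        self.avg_ms = {}

    def record(self, node, latency_ms):
        prev = self.avg_ms.get(node)
        if prev is None:
            self.avg_ms[node] = latency_ms
        else:
            self.avg_ms[node] = self.alpha * latency_ms + (1 - self.alpha) * prev

    def reorder(self, plan):
        known = [self.avg_ms[n] for n in plan if n in self.avg_ms]
        if not known:
            return list(plan)
        threshold = min(known) * self.exclusion_factor
        # Nodes without measurements are treated as fast so they still get traffic.
        fast = [n for n in plan if self.avg_ms.get(n, 0.0) <= threshold]
        slow = [n for n in plan if self.avg_ms.get(n, 0.0) > threshold]
        return fast + slow  # the incoming plan order is preserved within each group


tracker = LatencyTracker()
for node, ms in [("a", 10.0), ("b", 50.0), ("c", 12.0)]:
    tracker.record(node, ms)

reordered = tracker.reorder(["a", "b", "c"])
# reordered == ['a', 'c', 'b']
```

Because slow nodes are moved rather than removed, they remain available as a fallback, and the token-aware ordering of the incoming plan is otherwise left intact.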
Configuration Recommendations¶
| Deployment | Recommended Policy |
|---|---|
| Single datacenter | Token-aware with round-robin fallback |
| Multi-datacenter | Token-aware with DC awareness |
| Analytics/batch | Round-robin or DC-aware round-robin |
| Latency-sensitive | Token-aware with latency tracking |
Anti-Patterns¶
| Anti-Pattern | Problem |
|---|---|
| No local DC configured in multi-DC | Requests may route cross-DC |
| Round-robin for OLTP workloads | Unnecessary latency for every request |
| Token-aware without prepared statements | Falls back to round-robin anyway |
Related Documentation¶
- Retry Policy — What happens when the selected node fails
- Speculative Execution — Sending to multiple nodes concurrently