nodetool replaybatchlog¶
Forces immediate replay of pending batch mutations stored in the batchlog.
Synopsis¶
nodetool [connection_options] replaybatchlog
Description¶
nodetool replaybatchlog triggers immediate processing of any pending batches stored in the local node's batchlog. Under normal operation, Cassandra automatically replays the batchlog, but this command forces immediate replay without waiting for the scheduled interval.
What is a Batch in Cassandra?¶
A batch in Cassandra groups multiple CQL mutations (INSERT, UPDATE, DELETE) into a single logical operation. Batches come in two types:
| Type | CQL Syntax | Atomicity | Use Case |
|---|---|---|---|
| Logged batch | BEGIN BATCH ... APPLY BATCH; | Guaranteed | Mutations across partitions or tables that must all be applied |
| Unlogged batch | BEGIN UNLOGGED BATCH ... APPLY BATCH; | Not guaranteed | Performance optimization for same-partition writes |
-- Logged batch (uses batchlog)
BEGIN BATCH
INSERT INTO users (id, name) VALUES (1, 'Alice');
INSERT INTO user_emails (user_id, email) VALUES (1, 'alice@example.com');
APPLY BATCH;
-- Unlogged batch (does NOT use batchlog)
BEGIN UNLOGGED BATCH
INSERT INTO events (id, data) VALUES (uuid(), 'event1');
INSERT INTO events (id, data) VALUES (uuid(), 'event2');
APPLY BATCH;
What is the Batchlog?¶
The batchlog is the mechanism that guarantees atomicity for logged batches: either every mutation in the batch is eventually applied, or none are. If a failure interrupts a batch after it has been logged, the batch is retried through replay until it completes.
The batchlog is stored in the system.batches table:
-- View pending batches (for debugging)
SELECT * FROM system.batches;
How the Batchlog Works¶
Step-by-step process:
- Coordinator receives batch - Client sends a logged batch to coordinator node
- Write to batchlog - Coordinator writes the entire batch to batchlog replicas (nodes in different racks for durability)
- Execute mutations - Coordinator sends individual mutations to data replicas
- Delete batchlog entry - On successful completion, batchlog entry is removed
- Automatic replay - If coordinator fails before completing, batchlog replicas detect the stale entry and replay the batch
Why the Batchlog Exists¶
Without the batchlog, a logged batch could be partially applied if:
- The coordinator crashes mid-execution
- Network partitions occur during mutation delivery
- Some replicas fail while others succeed
The batchlog ensures that incomplete batches are eventually completed by having other nodes replay them.
When to Use replaybatchlog¶
Scenario 1: After Node Recovery¶
When a node recovers from a crash or extended downtime, it may have pending batches that were written to its batchlog but never completed:
# After node restart, force immediate replay
nodetool replaybatchlog
Why: The automatic replay happens on a schedule. Forcing replay ensures pending batches are processed immediately rather than waiting for the next scheduled interval.
Scenario 2: Investigating Stuck Batches¶
If monitoring shows growing pending batches or application logs indicate batch timeouts:
# Check for pending batches
cqlsh -e "SELECT COUNT(*) FROM system.batches;"
# Force replay to clear backlog
nodetool replaybatchlog
# Verify batches were processed
cqlsh -e "SELECT COUNT(*) FROM system.batches;"
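The raw `cqlsh` output includes a header, a separator line, and a row-count footer, which makes it awkward to compare the before and after counts in a script. The following is a minimal sketch of a filter that extracts just the number. It assumes cqlsh's default tabular layout, where the count sits alone on its own line; `pending_batch_count` is a hypothetical helper name, not part of Cassandra.

```shell
#!/bin/bash
# Hypothetical helper: read cqlsh COUNT(*) output on stdin and print only the integer.
# Assumes the default tabular layout, where the count is the only all-digit line.
pending_batch_count() {
  awk '/^[ \t]*[0-9]+[ \t]*$/ { gsub(/[ \t]/, ""); print; exit }'
}

# Usage (requires a reachable node):
#   cqlsh -e "SELECT COUNT(*) FROM system.batches;" | pending_batch_count
```

Batches younger than the batch timeout are not yet eligible for replay, so a non-zero count right after a replay does not necessarily indicate a problem.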
Scenario 3: Before Node Decommission¶
Ensure all pending batches are replayed before removing a node:
# Replay any pending batches
nodetool replaybatchlog
# Verify no pending batches
cqlsh -e "SELECT COUNT(*) FROM system.batches;"
# Proceed with decommission
nodetool decommission
Scenario 4: After Network Partition Resolution¶
Following a network partition that may have caused batch failures:
# Once network is restored
nodetool replaybatchlog
Scenario 5: Debugging Batch-Related Issues¶
When troubleshooting data consistency issues that may be related to incomplete batches:
# Check batch replay thread status
nodetool tpstats | grep -i batch
# Force replay
nodetool replaybatchlog
# Monitor progress
nodetool tpstats | grep -i batch
Examples¶
Basic Usage¶
nodetool replaybatchlog
Replay on All Nodes¶
#!/bin/bash
# replay_all_batchlogs.sh
# Get list of node IPs from local nodetool status
nodes=$(nodetool status | grep "^UN" | awk '{print $2}')
for node in $nodes; do
echo "Replaying batchlog on $node..."
ssh "$node" "nodetool replaybatchlog"
done
echo "Batchlog replay triggered on all nodes."
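The loop above reports success even when a node is unreachable. One possible variation, sketched below, tracks per-node failures and returns a non-zero status if any replay could not be triggered. The function names and the injectable `runner` argument are illustrative assumptions; the node filter is the same `UN` match used in the script above.

```shell
#!/bin/bash
# Read `nodetool status` output on stdin; print the address of each Up/Normal node.
up_normal_nodes() {
  awk '$1 == "UN" { print $2 }'
}

# Hypothetical runner: trigger replay on one node over SSH.
ssh_replay() {
  ssh "$1" "nodetool replaybatchlog"
}

# Call the runner ($1) for every node given; return 1 if any node failed.
replay_on_nodes() {
  local runner=$1 failed=0 node
  shift
  for node in "$@"; do
    if "$runner" "$node"; then
      echo "OK      $node"
    else
      echo "FAILED  $node" >&2
      failed=$((failed + 1))
    fi
  done
  return "$(( failed > 0 ))"
}

# Usage:
#   replay_on_nodes ssh_replay $(nodetool status | up_normal_nodes)
```

Passing the runner as an argument keeps the loop independent of how nodes are reached, so SSH can be swapped for another transport without touching the failure accounting.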
With Verification¶
#!/bin/bash
# Check pending batches, replay, and verify
echo "Pending batches before replay:"
cqlsh -e "SELECT COUNT(*) FROM system.batches;"
echo "Triggering batchlog replay..."
nodetool replaybatchlog
# Wait for replay to process
sleep 5
echo "Pending batches after replay:"
cqlsh -e "SELECT COUNT(*) FROM system.batches;"
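A fixed `sleep 5` may be too short for a large backlog and needlessly long for an empty one. As an alternative, the sketch below polls until the count reaches zero or a maximum number of attempts is used up. `wait_for_empty_batchlog` and its arguments are hypothetical; the caller supplies a command that prints the pending count as a bare integer.

```shell
#!/bin/bash
# Poll until the batchlog is empty or the attempts run out.
# $1: command that prints the pending batch count as a bare integer (caller-supplied)
# $2: maximum attempts (default 12)
# $3: seconds to sleep between attempts (default 5)
# Returns 0 when empty, 1 on timeout, 2 if the count command fails.
wait_for_empty_batchlog() {
  local count_cmd=$1 attempts=${2:-12} interval=${3:-5} i=1 count
  while [ "$i" -le "$attempts" ]; do
    count=$("$count_cmd") || return 2
    if [ "$count" -eq 0 ]; then
      echo "Batchlog empty after $i check(s)"
      return 0
    fi
    echo "Attempt $i/$attempts: $count batch(es) still pending"
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}

# Usage (count_batches is a hypothetical wrapper around the cqlsh query):
#   nodetool replaybatchlog && wait_for_empty_batchlog count_batches 12 5
```

On a cluster that is actively receiving logged batches, new entries keep arriving, so the count may never reach exactly zero; the timeout return path covers that case.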
Batchlog Automatic Replay¶
Under normal operation, Cassandra automatically replays the batchlog:
| Parameter | Default | Description |
|---|---|---|
| Replay interval | 60 seconds | How often the batchlog is checked for stale entries |
| Batch timeout | 2x write timeout | Time before a batch is considered stale and needs replay |
The automatic replay process:
- Each node periodically scans batchlogs stored on it
- Identifies batches older than the timeout threshold
- Replays those batches to the appropriate replicas
- Deletes successfully replayed entries
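To make the staleness rule concrete, the following sketch works through the arithmetic from the table above, where the batch timeout is twice the write timeout. The function names are illustrative only and do not correspond to anything in Cassandra itself.

```shell
#!/bin/bash
# Batch timeout per the table above: 2 x write request timeout (milliseconds).
# $1: write request timeout in ms
batch_timeout_ms() {
  echo $(( 2 * $1 ))
}

# Succeeds (exit 0) when a batch is older than the batch timeout.
# $1: batch age in ms, $2: write request timeout in ms
is_stale() {
  [ "$1" -gt "$(batch_timeout_ms "$2")" ]
}

# Usage:
#   batch_timeout_ms 2000        # prints 4000
#   is_stale 5000 2000 && echo "eligible for replay"
```

With a 2000 ms write timeout, a batch becomes eligible for replay once it is more than 4000 ms old; younger batches are skipped on that pass.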
When Automatic Replay Occurs¶
- Periodically - Every 60 seconds by default
- On node startup - Stale batches are replayed after the node starts
- When triggered manually - Via nodetool replaybatchlog
Batchlog Configuration¶
cassandra.yaml Settings¶
# Throttle for batchlog replay (KB/s)
# Limits I/O impact during replay
batchlog_replay_throttle_in_kb: 1024
# Write request timeout affects batch timeout
write_request_timeout_in_ms: 2000
Runtime Configuration¶
# View current replay throttle
nodetool getbatchlogreplaythrottle
# Adjust throttle (in KB/s)
nodetool setbatchlogreplaythrottle 2048
Monitoring Batchlog¶
Check Pending Batches¶
-- Count pending batches
SELECT COUNT(*) FROM system.batches;
-- View batch details (use sparingly)
SELECT id, version, writetime(version) FROM system.batches LIMIT 10;
Monitor Replay Threads¶
# Check batch replay thread pool
nodetool tpstats | grep -i batch
Example output:
Pool Name Active Pending Completed Blocked
BatchlogTasks 0 0 1523 0
Watch for Batch-Related Metrics¶
Key metrics to monitor:
| Metric | Meaning | Concern Threshold |
|---|---|---|
| Pending BatchlogTasks | Batches waiting for replay | > 100 |
| system.batches count | Entries in batchlog table | Growing over time |
| Batch replay errors | Failed replay attempts | Any non-zero |
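The pending-task threshold in the table can be checked from a script. The sketch below assumes the `tpstats` column order shown in the example output (pool name, Active, Pending, ...); the function names and exit codes are hypothetical choices for illustration.

```shell
#!/bin/bash
# Read `nodetool tpstats` output on stdin; print the Pending value for BatchlogTasks.
# Assumes Pending is the third whitespace-separated column, as in the example output.
batchlog_pending() {
  awk '$1 == "BatchlogTasks" { print $3; exit }'
}

# Read tpstats output on stdin and compare Pending against a threshold.
# $1: threshold (default 100, the concern threshold from the table)
# Returns 0 if within the threshold, 1 if above it, 2 if the pool is missing.
check_batchlog_pending() {
  local threshold=${1:-100} pending
  pending=$(batchlog_pending)
  [ -n "$pending" ] || { echo "BatchlogTasks pool not found" >&2; return 2; }
  if [ "$pending" -gt "$threshold" ]; then
    echo "WARN: $pending pending batchlog tasks (threshold $threshold)"
    return 1
  fi
  echo "OK: $pending pending batchlog tasks"
}

# Usage:
#   nodetool tpstats | check_batchlog_pending 100
```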
Impact on Cluster¶
During Replay¶
| Aspect | Impact |
|---|---|
| CPU | Low - batch processing is lightweight |
| Disk I/O | Moderate - reads batchlog, writes to data tables |
| Network | Moderate - sends mutations to replicas |
| Latency | Minimal impact on regular operations |
Throttling¶
The batchlog_replay_throttle_in_kb setting limits replay speed to prevent overwhelming the cluster:
# Check current throttle
nodetool getbatchlogreplaythrottle
# If replay is slow but cluster can handle more
nodetool setbatchlogreplaythrottle 4096
# Force replay with higher throughput
nodetool replaybatchlog
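When choosing a throttle value, a rough estimate of the replay time can help. The sketch below computes the minimum time needed to push a backlog through a given throttle. It assumes the throttle is the only limit, so the result is an optimistic lower bound; replica latency and cluster load will slow replay further. `min_replay_seconds` is a hypothetical helper, and the backlog size must be estimated separately.

```shell
#!/bin/bash
# Lower-bound estimate of replay time, assuming the throttle is the only limit.
# $1: estimated backlog size in KB (caller-supplied estimate)
# $2: replay throttle in KB/s (must be greater than 0)
# Prints the time in whole seconds, rounded up.
min_replay_seconds() {
  local backlog_kb=$1 throttle_kb=$2
  [ "$throttle_kb" -gt 0 ] || { echo "throttle must be > 0" >&2; return 1; }
  echo $(( (backlog_kb + throttle_kb - 1) / throttle_kb ))
}

# Usage:
#   min_replay_seconds 8192 1024   # prints 8
#   min_replay_seconds 8192 4096   # prints 2
```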
Batchlog and Consistency¶
What Batchlog Guarantees¶
- Atomicity - All mutations in a batch will eventually be applied
- Durability - Batch survives coordinator failure (stored on batchlog replicas)
What Batchlog Does NOT Guarantee¶
- Isolation - Other reads may see partial batch results during execution
- Immediate consistency - Replayed batches still follow normal replication rules
Batchlog Replica Selection¶
Cassandra selects batchlog replicas to maximize durability:
- Prefers nodes in different racks than the coordinator
- Falls back to same-rack nodes if necessary
- Typically stores on 2 batchlog replicas
Troubleshooting¶
Batchlog Growing Continuously¶
If system.batches keeps growing:
# Check for failing replays
grep -i "batch" /var/log/cassandra/system.log | grep -i "error\|fail"
# Check target replicas are healthy
nodetool status
# Check for resource constraints
nodetool tpstats
Common causes:
- Target replicas are down
- Network connectivity issues
- Throttle set too low for batch volume
Replay Not Completing¶
# Check thread pool status
nodetool tpstats | grep -i batch
# Look for blocked threads
nodetool tpstats | grep -i blocked
# Check logs for errors
tail -100 /var/log/cassandra/system.log | grep -i batch
Batches Taking Too Long¶
# Increase replay throttle
nodetool setbatchlogreplaythrottle 4096
# Force replay
nodetool replaybatchlog
# Monitor progress
watch -n 5 'cqlsh -e "SELECT COUNT(*) FROM system.batches;"'
Best Practices¶
Batchlog Guidelines
- Use logged batches sparingly - They add overhead; use only when atomicity is required
- Prefer unlogged batches for same-partition operations - No batchlog overhead
- Keep batches small - Large batches increase batchlog storage and replay time
- Monitor batchlog size - Growing system.batches indicates issues
- Run replay after recovery - Don't wait for automatic replay after incidents
Batch Anti-Patterns
Avoid these common mistakes:
- Using batches for bulk loading (use UNLOGGED or async writes instead)
- Batching unrelated mutations across many partitions
- Very large batches (>100 mutations)
- Using batches as a "transaction" mechanism across tables
Logged vs Unlogged Batches¶
| Aspect | Logged Batch | Unlogged Batch |
|---|---|---|
| Batchlog used | Yes | No |
| Atomicity guaranteed | Yes | No |
| Coordinator failure handling | Batch replayed | Partial write possible |
| Performance overhead | Higher | Lower |
| Use case | Cross-partition atomicity | Same-partition optimization |
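-- Use LOGGED (default) when atomicity matters, e.g. keeping two
-- denormalized tables in sync (table names are illustrative)
BEGIN BATCH
INSERT INTO orders_by_id (order_id, customer_id) VALUES (1001, 42);
INSERT INTO orders_by_customer (customer_id, order_id) VALUES (42, 1001);
APPLY BATCH;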
-- Use UNLOGGED when atomicity doesn't matter (same partition)
BEGIN UNLOGGED BATCH
INSERT INTO user_events (user_id, event_id, data) VALUES (123, uuid(), 'e1');
INSERT INTO user_events (user_id, event_id, data) VALUES (123, uuid(), 'e2');
APPLY BATCH;
Related Commands¶
| Command | Relationship |
|---|---|
| tpstats | View BatchlogTasks thread pool status |
| getbatchlogreplaythrottle | View current replay throttle |
| setbatchlogreplaythrottle | Adjust replay throttle |
| status | Check replica node health |