nodetool replaybatchlog¶

Forces immediate replay of pending batch mutations stored in the batchlog.

Synopsis¶

nodetool [connection_options] replaybatchlog

Description¶

nodetool replaybatchlog triggers immediate processing of any pending batches stored in the local node's batchlog. Under normal operation, Cassandra automatically replays the batchlog, but this command forces immediate replay without waiting for the scheduled interval.

What is a Batch in Cassandra?¶

A batch in Cassandra groups multiple CQL mutations (INSERT, UPDATE, DELETE) into a single logical operation. Batches come in two types:

Type	CQL Syntax	Atomicity	Use Case
Logged batch	`BEGIN BATCH ... APPLY BATCH;`	Guaranteed	Mutations spanning multiple partitions or tables that must all be applied
Unlogged batch	`BEGIN UNLOGGED BATCH ... APPLY BATCH;`	Not guaranteed across partitions	Performance optimization for same-partition writes

-- Logged batch (uses batchlog)
BEGIN BATCH
    INSERT INTO users (id, name) VALUES (1, 'Alice');
    INSERT INTO user_emails (user_id, email) VALUES (1, 'alice@example.com');
APPLY BATCH;

-- Unlogged batch (does NOT use batchlog)
BEGIN UNLOGGED BATCH
    INSERT INTO events (id, data) VALUES (uuid(), 'event1');
    INSERT INTO events (id, data) VALUES (uuid(), 'event2');
APPLY BATCH;

What is the Batchlog?¶

The batchlog is the mechanism that guarantees atomicity for logged batches: once a batch has been accepted, all of its mutations are eventually applied, even if the coordinator fails partway through. In that case the batch is retried from the batchlog.

The batchlog is stored in the system.batches table:

-- View pending batches (for debugging)
SELECT * FROM system.batches;

How the Batchlog Works¶

Step-by-step process:

Coordinator receives batch - Client sends a logged batch to coordinator node
Write to batchlog - Coordinator writes the entire batch to batchlog replicas (nodes in different racks for durability)
Execute mutations - Coordinator sends individual mutations to data replicas
Delete batchlog entry - On successful completion, batchlog entry is removed
Automatic replay - If coordinator fails before completing, batchlog replicas detect the stale entry and replay the batch

Why the Batchlog Exists¶

Without the batchlog, a logged batch could be partially applied if:

The coordinator crashes mid-execution
Network partitions occur during mutation delivery
Some replicas fail while others succeed

The batchlog ensures that incomplete batches are eventually completed by having other nodes replay them.

When to Use `replaybatchlog`¶

Scenario 1: After Node Recovery¶

When a node recovers from a crash or extended downtime, it may have pending batches that were written to its batchlog but never completed:

# After node restart, force immediate replay
nodetool replaybatchlog

Why: The automatic replay happens on a schedule. Forcing replay ensures pending batches are processed immediately rather than waiting for the next scheduled interval.

Scenario 2: Investigating Stuck Batches¶

If monitoring shows growing pending batches or application logs indicate batch timeouts:

# Check for pending batches
cqlsh -e "SELECT COUNT(*) FROM system.batches;"

# Force replay to clear backlog
nodetool replaybatchlog

# Verify batches were processed
cqlsh -e "SELECT COUNT(*) FROM system.batches;"
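
The before/after comparison is easier to script when the count is reduced to a bare number. The following is a minimal sketch, assuming cqlsh prints the count on a line that contains only digits (its default tabular output); `pending_batches` is a hypothetical helper name, not a Cassandra tool.

```shell
# Print the number of rows in system.batches as a bare integer.
# Assumption: the count appears on a line containing only digits and whitespace.
# Exits non-zero if no such line is found (for example, on a connection error).
pending_batches() {
    cqlsh -e "SELECT COUNT(*) FROM system.batches;" 2>/dev/null |
        awk '/^[ \t]*[0-9]+[ \t]*$/ { print $1; found = 1; exit }
             END { if (!found) exit 1 }'
}
```

For example, `before=$(pending_batches)` can be captured ahead of `nodetool replaybatchlog` and compared with a second call afterwards.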

Scenario 3: Before Node Decommission¶

Ensure all pending batches are replayed before removing a node:

# Replay any pending batches
nodetool replaybatchlog

# Verify no pending batches
cqlsh -e "SELECT COUNT(*) FROM system.batches;"

# Proceed with decommission
nodetool decommission
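
The three steps above can be chained so that the decommission only runs when the batchlog is actually empty. This is a sketch under the same output-format assumption as the count query above; `replay_then_decommission` is a hypothetical helper name.

```shell
# Replay the batchlog, then decommission only if system.batches is empty.
# Assumption: cqlsh prints the count on a line containing only digits.
replay_then_decommission() {
    nodetool replaybatchlog || return 1
    remaining=$(cqlsh -e "SELECT COUNT(*) FROM system.batches;" |
        awk '/^[ \t]*[0-9]+[ \t]*$/ { print $1; exit }')
    if [ "${remaining:-unknown}" != "0" ]; then
        # Anything other than a clean zero (including an unreadable count) stops here.
        echo "Refusing to decommission: pending batches = ${remaining:-unknown}" >&2
        return 1
    fi
    nodetool decommission
}
```

Treating an unreadable count the same as a non-zero count is a deliberate choice here: it is safer to stop and investigate than to remove a node on incomplete information.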

Scenario 4: After Network Partition Resolution¶

Following a network partition that may have caused batch failures:

# Once network is restored
nodetool replaybatchlog

Scenario 5: Troubleshooting Data Consistency¶

When troubleshooting data consistency issues that may be related to incomplete batches:

# Check batch replay thread status
nodetool tpstats | grep -i batch

# Force replay
nodetool replaybatchlog

# Monitor progress
nodetool tpstats | grep -i batch

Examples¶

Basic Usage¶

nodetool replaybatchlog

Replay on All Nodes¶

#!/bin/bash
# replay_all_batchlogs.sh
# Get list of node IPs from local nodetool status
nodes=$(nodetool status | grep "^UN" | awk '{print $2}')

for node in $nodes; do
    echo "Replaying batchlog on $node..."
    ssh "$node" "nodetool replaybatchlog"
done

echo "Batchlog replay triggered on all nodes."
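
The script above reports success even if a replay fails on one of the nodes. A variant that tracks failures could look like the sketch below. It assumes passwordless ssh to each node and that the address is the second column of `nodetool status`; `up_normal_nodes` and `replay_on_nodes` are hypothetical helper names.

```shell
# Print the address column of every Up/Normal (UN) line read from stdin.
up_normal_nodes() {
    awk '$1 == "UN" { print $2 }'
}

# Run replaybatchlog on each given host over ssh.
# Returns non-zero if the command failed on at least one host.
replay_on_nodes() {
    failed=0
    for node in "$@"; do
        if ssh "$node" "nodetool replaybatchlog"; then
            echo "OK   $node"
        else
            echo "FAIL $node"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

A typical invocation would be `replay_on_nodes $(nodetool status | up_normal_nodes)`, with the exit status indicating whether every node succeeded.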

With Verification¶

#!/bin/bash
# Check pending batches, replay, and verify

echo "Pending batches before replay:"
cqlsh -e "SELECT COUNT(*) FROM system.batches;"

echo "Triggering batchlog replay..."
nodetool replaybatchlog

# Wait for replay to process
sleep 5

echo "Pending batches after replay:"
cqlsh -e "SELECT COUNT(*) FROM system.batches;"

Batchlog Automatic Replay¶

Under normal operation, Cassandra automatically replays the batchlog:

Parameter	Default	Description
Replay interval	60 seconds	How often the batchlog is checked for stale entries
Batch timeout	2x write timeout	Time before a batch is considered stale and needs replay

The automatic replay process:

Each node periodically scans batchlogs stored on it
Identifies batches older than the timeout threshold
Replays those batches to the appropriate replicas
Deletes successfully replayed entries
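
The staleness rule in the table above reduces to simple arithmetic. The sketch below assumes timestamps in epoch milliseconds and a strict "older than" comparison; `batch_timeout_ms` and `batch_is_stale` are hypothetical helper names used only for illustration.

```shell
# Batch timeout as described above: twice the write request timeout (ms).
batch_timeout_ms() {
    echo $(( $1 * 2 ))
}

# Succeed if a batch written at $1 is older than the batch timeout at time $2,
# given a write request timeout of $3. All values are in milliseconds.
batch_is_stale() {
    [ $(( $2 - $1 )) -gt "$(batch_timeout_ms "$3")" ]
}
```

With a write timeout of 2000 ms the batch timeout is 4000 ms, so `batch_is_stale 0 5000 2000` succeeds while `batch_is_stale 0 3000 2000` does not.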

When Automatic Replay Occurs¶

Periodically - Every 60 seconds by default
On node startup - Stale batches are replayed shortly after the node starts
When triggered manually - Via nodetool replaybatchlog

Batchlog Configuration¶

cassandra.yaml Settings¶

# Throttle for batchlog replay (KB/s)
# Limits I/O impact during replay
batchlog_replay_throttle_in_kb: 1024

# Write request timeout affects batch timeout
write_request_timeout_in_ms: 2000
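
To check what a node is actually configured with, the values can be read straight from the file. This is a rough sketch that handles only simple top-level `key: value` lines; `yaml_setting` is a hypothetical helper, and the file location is installation-specific.

```shell
# Print the value of a top-level scalar setting from a cassandra.yaml file.
# $1 = setting name, $2 = path to the file.
# Commented-out lines are ignored because the key must start the line.
yaml_setting() {
    awk -v key="$1" '
        index($0, key ":") == 1 {
            value = substr($0, length(key) + 2)
            sub(/[ \t]*#.*$/, "", value)        # drop a trailing comment
            gsub(/^[ \t]+|[ \t]+$/, "", value)  # trim surrounding whitespace
            print value
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    ' "$2"
}
```

For example, `yaml_setting write_request_timeout_in_ms /etc/cassandra/cassandra.yaml` would print the configured write timeout, assuming the file lives at that path.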

Runtime Configuration¶

# View current replay throttle
nodetool getbatchlogreplaythrottle

# Adjust throttle (in KB/s)
nodetool setbatchlogreplaythrottle 2048

Monitoring Batchlog¶

Check Pending Batches¶

-- Count pending batches
SELECT COUNT(*) FROM system.batches;

-- View batch details (use sparingly)
SELECT id, version, writetime(version) FROM system.batches LIMIT 10;

Monitor Replay Threads¶

# Check batch replay thread pool
nodetool tpstats | grep -i batch

Example output:

Pool Name                    Active   Pending      Completed   Blocked
BatchlogTasks                     0         0           1523         0

Key metrics to monitor:

Metric	Meaning	Concern Threshold
Pending BatchlogTasks	Batches waiting for replay	> 100
system.batches count	Entries in batchlog table	Growing over time
Batch replay errors	Failed replay attempts	Any non-zero
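
The first threshold in the table can be checked automatically. The sketch below assumes the pool is named BatchlogTasks and that Pending is the third whitespace-separated field, as in the example output above; `batchlog_pending` and `batchlog_backlog_exceeds` are hypothetical helper names.

```shell
# Print the Pending column of the BatchlogTasks pool from tpstats output on stdin.
# Exits non-zero if the pool is not listed.
batchlog_pending() {
    awk '$1 == "BatchlogTasks" { print $3; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Succeed if pending tasks exceed the threshold ($1, default 100).
# Returns 2 if the pool could not be found in the tpstats output.
batchlog_backlog_exceeds() {
    pending=$(nodetool tpstats | batchlog_pending) || return 2
    [ "$pending" -gt "${1:-100}" ]
}
```

In a monitoring script, `batchlog_backlog_exceeds && echo "batchlog backlog"` would raise a message only when the threshold is crossed.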

Impact on Cluster¶

During Replay¶

Aspect	Impact
CPU	Low - batch processing is lightweight
Disk I/O	Moderate - reads batchlog, writes to data tables
Network	Moderate - sends mutations to replicas
Latency	Minimal impact on regular operations

Throttling¶

The batchlog_replay_throttle_in_kb setting limits replay speed to prevent overwhelming the cluster:

# Check current throttle
nodetool getbatchlogreplaythrottle

# If replay is slow but cluster can handle more
nodetool setbatchlogreplaythrottle 4096

# Force replay with higher throughput
nodetool replaybatchlog
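
Because `setbatchlogreplaythrottle` changes the running node's setting, it may be worth restoring the earlier value once the forced replay is done. The following is a sketch that assumes `getbatchlogreplaythrottle` prints the rate as the first integer in its output (for example "Batchlog replay throttle: 1024 KB/s"); `replay_with_throttle` is a hypothetical helper name.

```shell
# Run a replay at a temporary throttle ($1, in KB/s), then restore the old value.
# Assumption: the current throttle is the first integer printed by
# "nodetool getbatchlogreplaythrottle".
replay_with_throttle() {
    previous=$(nodetool getbatchlogreplaythrottle |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\).*$/\1/p' | head -n 1)
    if [ -z "$previous" ]; then
        echo "Could not read current throttle" >&2
        return 1
    fi
    nodetool setbatchlogreplaythrottle "$1" || return 1
    nodetool replaybatchlog
    rc=$?
    # Restore the previous throttle whether or not the replay succeeded.
    nodetool setbatchlogreplaythrottle "$previous"
    return "$rc"
}
```

Calling `replay_with_throttle 4096` mirrors the commands above but leaves the node at its original throttle afterwards.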

Batchlog and Consistency¶

What Batchlog Guarantees¶

Atomicity - All mutations in a batch will eventually be applied
Durability - Batch survives coordinator failure (stored on batchlog replicas)

What Batchlog Does NOT Guarantee¶

Isolation - Other reads may see partial batch results during execution
Immediate consistency - Replayed batches still follow normal replication rules

Batchlog Replica Selection¶

Cassandra selects batchlog replicas to maximize durability:

Prefers nodes in different racks than the coordinator
Falls back to same-rack nodes if necessary
Typically stores on 2 batchlog replicas
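
The preference order above can be illustrated with a small filter. This is a deliberately simplified sketch, not Cassandra's actual selection code: it takes candidates in input order, whereas a real implementation would randomize its choice. The input format (one "address rack" pair per line, coordinator excluded) and the name `pick_batchlog_replicas` are assumptions made for the example.

```shell
# stdin: candidate nodes as "address rack", one per line (coordinator excluded).
# $1 = the coordinator's rack. Prints up to 2 addresses, preferring other racks
# and falling back to the coordinator's own rack only if needed.
pick_batchlog_replicas() {
    awk -v rack="$1" '
        NF < 2 { next }                       # skip blank or malformed lines
        $2 != rack { other[++o] = $1; next }  # candidates in other racks
        { same[++s] = $1 }                    # candidates in the same rack
        END {
            n = 0
            for (i = 1; i <= o && n < 2; i++) { print other[i]; n++ }
            for (i = 1; i <= s && n < 2; i++) { print same[i]; n++ }
        }'
}
```

Given one candidate in another rack and two in the coordinator's rack, the filter prints the other-rack node first and then one same-rack node as the fallback.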

Troubleshooting¶

Batchlog Growing Continuously¶

If system.batches keeps growing:

# Check for failing replays
grep -i "batch" /var/log/cassandra/system.log | grep -i "error\|fail"

# Check target replicas are healthy
nodetool status

# Check for resource constraints
nodetool tpstats

Common causes:

Target replicas are down
Network connectivity issues
Throttle set too low for batch volume

Replay Not Completing¶

# Check thread pool status
nodetool tpstats | grep -i batch

# Look for blocked threads
nodetool tpstats | grep -i blocked

# Check logs for errors
tail -100 /var/log/cassandra/system.log | grep -i batch

Batches Taking Too Long¶

# Increase replay throttle
nodetool setbatchlogreplaythrottle 4096

# Force replay
nodetool replaybatchlog

# Monitor progress
watch -n 5 'cqlsh -e "SELECT COUNT(*) FROM system.batches;"'
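
Where `watch` is unavailable, or when the check has to run unattended, a bounded polling loop is an alternative. This is a sketch that assumes cqlsh prints the count on a line containing only digits; `wait_for_batchlog_drain` is a hypothetical helper name.

```shell
# Poll system.batches until it is empty or the attempts run out.
# $1 = maximum attempts (default 12), $2 = seconds between attempts (default 5).
# Returns 0 once the count reaches zero, 1 if it never does.
wait_for_batchlog_drain() {
    attempts=${1:-12}
    interval=${2:-5}
    i=1
    while [ "$i" -le "$attempts" ]; do
        count=$(cqlsh -e "SELECT COUNT(*) FROM system.batches;" |
            awk '/^[ \t]*[0-9]+[ \t]*$/ { print $1; exit }')
        echo "attempt $i: ${count:-unknown} pending"
        [ "$count" = "0" ] && return 0
        # Sleep only between attempts, not after the last one.
        [ "$i" -lt "$attempts" ] && sleep "$interval"
        i=$((i + 1))
    done
    return 1
}
```

The non-zero exit status on timeout makes it straightforward to chain, for example `wait_for_batchlog_drain 24 5 || echo "batchlog did not drain"`.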

Best Practices¶

Batchlog Guidelines

Use logged batches sparingly - They add overhead; use only when atomicity is required
Prefer unlogged batches for same-partition operations - No batchlog overhead
Keep batches small - Large batches increase batchlog storage and replay time
Monitor batchlog size - Growing system.batches indicates issues
Run replay after recovery - Don't wait for automatic replay after incidents

Batch Anti-Patterns

Avoid these common mistakes:

Using logged batches for bulk loading (use individual asynchronous writes, or UNLOGGED batches scoped to a single partition, instead)
Batching unrelated mutations across many partitions
Very large batches (>100 mutations)
Using batches as a "transaction" mechanism across tables

Logged vs Unlogged Batches¶

Aspect	Logged Batch	Unlogged Batch
Batchlog used	Yes	No
Atomicity guaranteed	Yes	No
Coordinator failure handling	Batch replayed	Partial write possible
Performance overhead	Higher	Lower
Use case	Cross-partition atomicity	Same-partition optimization

-- Use LOGGED (default) when atomicity matters
-- Non-counter columns must be set to explicit values; CQL cannot evaluate balance = balance - 100
BEGIN BATCH
    UPDATE account SET balance = 400 WHERE id = 1;
    UPDATE account SET balance = 600 WHERE id = 2;
APPLY BATCH;

-- Use UNLOGGED for writes to a single partition, where the batchlog is not needed
BEGIN UNLOGGED BATCH
    INSERT INTO user_events (user_id, event_id, data) VALUES (123, uuid(), 'e1');
    INSERT INTO user_events (user_id, event_id, data) VALUES (123, uuid(), 'e2');
APPLY BATCH;

Related Commands¶

Command	Relationship
tpstats	View BatchlogTasks thread pool status
getbatchlogreplaythrottle	View current replay throttle
setbatchlogreplaythrottle	Adjust replay throttle
status	Check replica node health