BATCH¶
The BATCH statement groups multiple INSERT, UPDATE, and DELETE statements into a single atomic operation. Batches ensure that all statements either complete or none do, providing atomicity guarantees. However, batches are frequently misused as a performance optimization—they are not.
Behavioral Guarantees¶
What BATCH Guarantees¶
- All statements in a logged batch eventually execute or none execute
- Statements to the same partition are atomic at the storage layer
- Logged batches write to batch log before executing mutations
- USING TIMESTAMP applies the same timestamp to all statements
- If any IF condition fails, no statements in the batch execute
What BATCH Does NOT Guarantee¶
Undefined Behavior
The following behaviors are undefined and must not be relied upon:
- Isolation: Other reads may see partial batch results before completion
- Performance improvement: Batches do not improve throughput over parallel writes
- Cross-datacenter atomicity: Logged batches provide atomicity within a datacenter, not globally
- Unlogged batch atomicity: UNLOGGED batches have no atomicity guarantee on coordinator failure
- Order of execution: Statements within a batch may execute in any order
Logged vs Unlogged Contracts¶
| Aspect | Logged Batch | Unlogged Batch |
|---|---|---|
| Coordinator failure | Recovers via batch log | Undefined - partial execution possible |
| Multi-partition | Atomic (via batch log) | Not atomic |
| Same-partition | Atomic | Atomic (storage layer) |
| Performance | Higher latency (batch log overhead) | Lower latency |
Failure Semantics¶
| Failure Mode | Logged Batch | Unlogged Batch |
|---|---|---|
| Coordinator fails after batch log write | Batch log replays mutations | Partial execution possible |
| Coordinator fails before batch log write | Not applied | Not applied |
WriteTimeoutException |
May have been logged for replay | Undefined |
UnavailableException |
Not applied | Not applied |
Version-Specific Behavior¶
| Version | Behavior |
|---|---|
| 1.2+ | Basic BATCH support |
| 2.0+ | UNLOGGED BATCH, improved batch log (CASSANDRA-4542) |
| 2.1+ | COUNTER BATCH separated from regular batches |
| 3.0+ | Improved batch log performance (CASSANDRA-9673) |
| 4.0+ | Enhanced multi-partition batch handling |
Overview¶
What BATCH Is For¶
BATCH provides atomicity, not performance:
| Use Case | Example | Result |
|---|---|---|
| Correct: Atomicity | INSERT INTO users ...INSERT INTO users_by_email ... |
Both succeed or both fail ✓ |
| Wrong: 'Performance' | INSERT INTO table_a ...INSERT INTO table_b ...INSERT INTO table_c ... |
Slower than individual inserts! ✗ |
Historical Context¶
BATCH was introduced in CQL 3.0 (Cassandra 1.2) to address the problem of maintaining consistency across denormalized tables:
| Version | Feature |
|---|---|
| 1.2 | Basic BATCH with atomicity |
| 2.0 | UNLOGGED BATCH, batch log improvements |
| 2.1 | COUNTER BATCH separated |
| 3.0 | Improved batch log efficiency |
| 4.0 | Better multi-partition batch handling |
Synopsis¶
BEGIN [ UNLOGGED | COUNTER ] BATCH
[ USING TIMESTAMP *microseconds* ]
*dml_statement* ;
[ *dml_statement* ; ... ]
APPLY BATCH
dml_statement:
INSERT ... | UPDATE ... | DELETE ...
Batch Types¶
Logged Batch (Default)¶
BEGIN BATCH
INSERT INTO users (user_id, username) VALUES (?, 'alice');
INSERT INTO usernames (username, user_id) VALUES ('alice', ?);
APPLY BATCH;
Execution:
Batch Log Architecture¶
The batch log ensures durability by writing to multiple nodes before executing mutations:
- Coordinator selects batch log endpoints: Two nodes in the local datacenter (chosen to minimize latency)
- Batch log written: The serialized batch is written to each endpoint's local batch log table
- Mutations executed: After batch log is durable, mutations are sent to replicas
- Batch log removed: After all mutations acknowledged, batch log entries are deleted
Batch Log Storage
The batch log uses LocalStrategy (RF=1), meaning each node stores only its own batch log entries. Durability comes from writing to multiple endpoints, not from replication.
Batch Log Table Versions¶
The batch log implementation has evolved across Cassandra versions:
| Version | Table | Key Changes |
|---|---|---|
| < 2.2 | system.batchlog |
Original implementation |
| 2.2+ | system.batches |
New table format, improved performance |
| 3.0+ | system.batches |
Batch log replay improvements |
| 4.0+ | system.batches |
Enhanced timeout handling |
-- View pending batches (should normally be empty)
SELECT * FROM system.batches;
-- Columns vary by version, but typically include:
-- id (timeuuid), version (int), written_at (timestamp), data (blob)
Batch Log Replay¶
Batch log entries are replayed automatically:
- Each node periodically scans its local batch log
- Entries older than
batchlog_replay_throttleare candidates for replay - Mutations are re-executed to ensure completion
- Successfully replayed entries are removed
# cassandra.yaml
batchlog_replay_throttle_in_kb: 1024 # Throttle replay bandwidth
Guarantees:
- All statements eventually execute or none do
- Coordinator failure doesn't lose the batch (batch log on other nodes)
- Batch log written to 2 endpoints for redundancy
Cost:
- Additional writes to 2 batch log endpoints
- Higher latency than unlogged (must wait for batch log durability)
- Increased coordinator memory usage
Unlogged Batch¶
BEGIN UNLOGGED BATCH
UPDATE user_profile SET name = 'Alice' WHERE user_id = ?;
UPDATE user_profile SET email = '[email protected]' WHERE user_id = ?;
UPDATE user_profile SET updated_at = toTimestamp(now()) WHERE user_id = ?;
APPLY BATCH;
Execution:
- No batch log write
- Statements sent directly to replicas
- No recovery if coordinator fails
Use when:
- All statements target the same partition
- Atomicity across coordinator failure not required
- Lower latency needed
Counter Batch¶
BEGIN COUNTER BATCH
UPDATE page_stats SET views = views + 1 WHERE page_id = 'home';
UPDATE page_stats SET views = views + 1 WHERE page_id = 'about';
UPDATE daily_stats SET requests = requests + 2 WHERE date = '2024-01-15';
APPLY BATCH;
Restrictions:
- Can only contain counter updates
- Cannot mix counter and non-counter statements
- No TTL allowed
Parameters¶
USING TIMESTAMP¶
Applies a single timestamp to all statements:
BEGIN BATCH USING TIMESTAMP 1705315800000000
INSERT INTO table1 (id, data) VALUES (1, 'a');
INSERT INTO table2 (id, data) VALUES (2, 'b');
APPLY BATCH;
Behavior:
- All mutations share the same timestamp
- Individual statements cannot override
- Cannot be used with LWT (IF conditions)
Same-Partition vs Multi-Partition Batches¶
Same-Partition Batch (Recommended)¶
All statements target the same partition:
-- Good: Single partition batch
BEGIN UNLOGGED BATCH
INSERT INTO user_events (user_id, event_id, type) VALUES (123, uuid(), 'login');
INSERT INTO user_events (user_id, event_id, type) VALUES (123, uuid(), 'page_view');
INSERT INTO user_events (user_id, event_id, type) VALUES (123, uuid(), 'click');
APPLY BATCH;
Benefits:
- Single coordinator to single replica set
- Atomic at storage layer (same SSTable)
- Minimal coordination overhead
Multi-Partition Batch (Use Carefully)¶
Statements target different partitions:
-- Acceptable: Denormalized tables that must stay consistent
BEGIN BATCH
INSERT INTO users (user_id, email) VALUES (?, '[email protected]');
INSERT INTO users_by_email (email, user_id) VALUES ('[email protected]', ?);
APPLY BATCH;
Costs:
- Coordinator must contact multiple nodes
- Batch log adds latency
- Memory pressure on coordinator
Batch Anti-Patterns¶
Anti-Pattern 1: Batching for Performance¶
-- WRONG: Batching unrelated writes doesn't improve performance
BEGIN BATCH
INSERT INTO users (user_id, name) VALUES (1, 'Alice');
INSERT INTO users (user_id, name) VALUES (2, 'Bob');
INSERT INTO users (user_id, name) VALUES (3, 'Charlie');
-- ... 100 more unrelated inserts
APPLY BATCH;
Why it's slow:
- Coordinator must track all mutations in memory
- Single point of coordination
- Larger network payload than parallel requests
Better approach:
// Parallel async inserts
List<CompletionStage<AsyncResultSet>> futures = new ArrayList<>();
for (User user : users) {
futures.add(session.executeAsync(insertStmt.bind(user)));
}
CompletableFuture.allOf(futures.toArray()).join();
Anti-Pattern 2: Large Batches¶
-- WRONG: Batch too large
BEGIN BATCH
-- 1000 INSERT statements
APPLY BATCH;
Problems:
- Exceeds batch size thresholds
- Coordinator memory exhaustion
- Timeout likelihood increases
Thresholds (cassandra.yaml):
batch_size_warn_threshold_in_kb: 5
batch_size_fail_threshold_in_kb: 50
Anti-Pattern 3: Batching Different Tables Without Need¶
-- WRONG: No atomicity requirement
BEGIN BATCH
INSERT INTO audit_log (id, action) VALUES (uuid(), 'user_created');
INSERT INTO metrics (id, count) VALUES ('users', 1);
INSERT INTO notifications (id, message) VALUES (uuid(), 'Welcome!');
APPLY BATCH;
Better approach:
Execute independently—if one fails, others can still succeed.
Conditional Batches (LWT)¶
Batches can include lightweight transaction conditions:
BEGIN BATCH
INSERT INTO users (user_id, username) VALUES (?, 'alice') IF NOT EXISTS;
INSERT INTO usernames (username, user_id) VALUES ('alice', ?) IF NOT EXISTS;
APPLY BATCH;
LWT Batch Behavior¶
Important: When ANY statement has an IF condition, ALL statements use Paxos:
Key points:
- Entire batch is all-or-nothing
- One failed condition aborts all statements
- Significant performance impact
Restrictions¶
- Cannot mix conditional and unconditional statements
- All conditions must be on same partition
- Cannot use USING TIMESTAMP with IF
Batch Size Limits¶
Configuration¶
# cassandra.yaml
batch_size_warn_threshold_in_kb: 5 # Log warning
batch_size_fail_threshold_in_kb: 50 # Reject batch
Monitoring¶
# Check for batch size warnings
grep "Batch" /var/log/cassandra/system.log
# Metrics
nodetool tablestats system.batches
Calculating Batch Size¶
Approximate formula:
batch_size ≈ sum(mutation_sizes) + overhead
mutation_size ≈ key_size + sum(column_sizes) + metadata
Best Practices¶
Do Use Batches For¶
Good Batch Use Cases
- Denormalized table consistency: Keep related tables in sync
- Same-partition atomicity: Multiple writes to one partition
- Conditional group operations: LWT across related rows
Don't Use Batches For¶
Batch Anti-Patterns
- Performance optimization: Use async parallel writes instead
- Large bulk loads: Use SSTable loader or async writes
- Unrelated writes: No atomicity needed
- Cross-datacenter atomicity: Batches don't provide this
Sizing Guidelines¶
| Batch Type | Recommended Size |
|---|---|
| Same-partition | Up to 100 statements |
| Multi-partition | 2-5 partitions |
| With LWT | 2-3 statements |
Monitoring¶
-- Check batch log table
SELECT * FROM system.batches;
-- Should typically be empty (batches complete quickly)
Restrictions¶
Restrictions
Statement Types:
- Only INSERT, UPDATE, DELETE allowed
- Cannot include SELECT statements
- Cannot include DDL statements
Counter Batches:
- Must use COUNTER keyword
- Cannot mix counter and non-counter statements in same batch
- No TTL allowed on counter updates
- No IF conditions allowed with counters
- Counter batches are always unlogged internally
Conditional Batches:
- All IF conditions must be on same partition
- Cannot use USING TIMESTAMP
- Cannot mix conditional and unconditional statements
General:
- Maximum batch size enforced by configuration
- Batch log adds overhead to logged batches
- Multi-partition batches not atomic if using UNLOGGED
Atomicity Scope¶
Batch Atomicity is Partition-Scoped
A critical misconception: batches are only atomic within a single partition at the storage layer.
| Batch Type | Single Partition | Multi-Partition |
|---|---|---|
| LOGGED | Atomic | Eventual atomicity via batch log |
| UNLOGGED | Atomic | Not atomic - partial execution possible |
| COUNTER | Atomic | Not atomic |
Multi-partition "atomicity" via logged batches:
- Batch log ensures eventual delivery, not instant atomicity
- Other reads may see partial results during batch execution
- If batch log replay fails repeatedly, mutations may be lost
-- This is NOT instantly atomic across partitions
BEGIN BATCH
INSERT INTO users (user_id, ...) VALUES ('user1', ...); -- Partition 1
INSERT INTO users (user_id, ...) VALUES ('user2', ...); -- Partition 2
APPLY BATCH;
-- A concurrent read might see user1 but not user2
Large Batch Performance Degradation¶
Large Batches Cause Performance Problems
Batches are NOT a performance optimization. Large batches cause severe issues:
| Issue | Impact |
|---|---|
| Coordinator memory | Entire batch held in memory until complete |
| GC pressure | Large batches trigger garbage collection |
| Batch log pressure | Logged batches write to batch log first |
| Timeout risk | Large batches more likely to timeout |
| Replay storms | Failed large batches cause replay overhead |
Configuration limits:
# cassandra.yaml
batch_size_warn_threshold_in_kb: 5 # Warn above 5KB
batch_size_fail_threshold_in_kb: 50 # Fail above 50KB
Warning signs in logs:
WARN Batch for [table] is of size 52KB, exceeding specified threshold of 5KB
Best practices:
- Keep batches small (< 5KB, ideally < 20 statements)
- Use batches for atomicity, not throughput
- For bulk loading, use parallel individual writes or SSTable loader
- Monitor
BatchMetricsin JMX for batch sizes
Examples¶
Denormalized Table Insert¶
-- Maintain consistency between users and users_by_email
BEGIN BATCH
INSERT INTO users (user_id, email, username, created_at)
VALUES (?, '[email protected]', 'alice', toTimestamp(now()));
INSERT INTO users_by_email (email, user_id, username)
VALUES ('[email protected]', ?, 'alice');
APPLY BATCH;
Same-Partition Event Logging¶
BEGIN UNLOGGED BATCH
INSERT INTO user_events (user_id, event_time, event_type, data)
VALUES (123, toTimestamp(now()), 'login', '{"ip": "192.168.1.1"}');
UPDATE user_stats SET login_count = login_count + 1
WHERE user_id = 123;
APPLY BATCH;
Conditional User Registration¶
BEGIN BATCH
INSERT INTO users (user_id, username, email)
VALUES (?, 'desired_username', '[email protected]')
IF NOT EXISTS;
INSERT INTO usernames (username, user_id)
VALUES ('desired_username', ?)
IF NOT EXISTS;
APPLY BATCH;
Counter Aggregation¶
BEGIN COUNTER BATCH
UPDATE daily_metrics SET page_views = page_views + 1
WHERE date = '2024-01-15' AND page = 'home';
UPDATE daily_metrics SET page_views = page_views + 1
WHERE date = '2024-01-15' AND page = 'about';
UPDATE total_metrics SET total_views = total_views + 2
WHERE metric_id = 'all_time';
APPLY BATCH;
Batch with Timestamp¶
BEGIN BATCH USING TIMESTAMP 1705315800000000
INSERT INTO events (event_id, type) VALUES (uuid(), 'imported');
INSERT INTO audit_log (log_id, action) VALUES (uuid(), 'data_import');
APPLY BATCH;
Related Documentation¶
- INSERT - Single row inserts
- UPDATE - Single row updates
- DELETE - Single row deletes
- Lightweight Transactions - Conditional batch details