Skip to content

BATCH

The BATCH statement groups multiple INSERT, UPDATE, and DELETE statements into a single atomic operation. Batches ensure that all statements either complete or none do, providing atomicity guarantees. However, batches are frequently misused as a performance optimization—they are not.


Behavioral Guarantees

What BATCH Guarantees

  • All statements in a logged batch eventually execute or none execute
  • Statements to the same partition are atomic at the storage layer
  • Logged batches write to batch log before executing mutations
  • USING TIMESTAMP applies the same timestamp to all statements
  • If any IF condition fails, no statements in the batch execute

What BATCH Does NOT Guarantee

Undefined Behavior

The following behaviors are undefined and must not be relied upon:

  • Isolation: Other reads may see partial batch results before completion
  • Performance improvement: Batches do not improve throughput over parallel writes
  • Cross-datacenter atomicity: Logged batches provide atomicity within a datacenter, not globally
  • Unlogged batch atomicity: UNLOGGED batches have no atomicity guarantee on coordinator failure
  • Order of execution: Statements within a batch may execute in any order

Logged vs Unlogged Contracts

Aspect Logged Batch Unlogged Batch
Coordinator failure Recovers via batch log Undefined - partial execution possible
Multi-partition Atomic (via batch log) Not atomic
Same-partition Atomic Atomic (storage layer)
Performance Higher latency (batch log overhead) Lower latency

Failure Semantics

Failure Mode Logged Batch Unlogged Batch
Coordinator fails after batch log write Batch log replays mutations Partial execution possible
Coordinator fails before batch log write Not applied Not applied
WriteTimeoutException May have been logged for replay Undefined
UnavailableException Not applied Not applied

Version-Specific Behavior

Version Behavior
1.2+ Basic BATCH support
2.0+ UNLOGGED BATCH, improved batch log (CASSANDRA-4542)
2.1+ COUNTER BATCH separated from regular batches
3.0+ Improved batch log performance (CASSANDRA-9673)
4.0+ Enhanced multi-partition batch handling

Overview

What BATCH Is For

BATCH provides atomicity, not performance:

Use Case Example Result
Correct: Atomicity INSERT INTO users ...
INSERT INTO users_by_email ...
Both succeed or both fail ✓
Wrong: 'Performance' INSERT INTO table_a ...
INSERT INTO table_b ...
INSERT INTO table_c ...
Slower than individual inserts! ✗

Historical Context

BATCH was introduced in CQL 3.0 (Cassandra 1.2) to address the problem of maintaining consistency across denormalized tables:

Version Feature
1.2 Basic BATCH with atomicity
2.0 UNLOGGED BATCH, batch log improvements
2.1 COUNTER BATCH separated
3.0 Improved batch log efficiency
4.0 Better multi-partition batch handling

Synopsis

BEGIN [ UNLOGGED | COUNTER ] BATCH
    [ USING TIMESTAMP *microseconds* ]
    *dml_statement* ;
    [ *dml_statement* ; ... ]
APPLY BATCH

dml_statement:

INSERT ... | UPDATE ... | DELETE ...

Batch Types

Logged Batch (Default)

BEGIN BATCH
    INSERT INTO users (user_id, username) VALUES (?, 'alice');
    INSERT INTO usernames (username, user_id) VALUES ('alice', ?);
APPLY BATCH;

Execution:

uml diagram

Batch Log Architecture

The batch log ensures durability by writing to multiple nodes before executing mutations:

  1. Coordinator selects batch log endpoints: Two nodes in the local datacenter (chosen to minimize latency)
  2. Batch log written: The serialized batch is written to each endpoint's local batch log table
  3. Mutations executed: After batch log is durable, mutations are sent to replicas
  4. Batch log removed: After all mutations acknowledged, batch log entries are deleted

Batch Log Storage

The batch log uses LocalStrategy (RF=1), meaning each node stores only its own batch log entries. Durability comes from writing to multiple endpoints, not from replication.

Batch Log Table Versions

The batch log implementation has evolved across Cassandra versions:

Version Table Key Changes
< 2.2 system.batchlog Original implementation
2.2+ system.batches New table format, improved performance
3.0+ system.batches Batch log replay improvements
4.0+ system.batches Enhanced timeout handling
-- View pending batches (should normally be empty)
SELECT * FROM system.batches;

-- Columns vary by version, but typically include:
-- id (timeuuid), version (int), written_at (timestamp), data (blob)

Batch Log Replay

Batch log entries are replayed automatically:

  • Each node periodically scans its local batch log
  • Entries older than batchlog_replay_throttle are candidates for replay
  • Mutations are re-executed to ensure completion
  • Successfully replayed entries are removed
# cassandra.yaml
batchlog_replay_throttle_in_kb: 1024  # Throttle replay bandwidth

Guarantees:

  • All statements eventually execute or none do
  • Coordinator failure doesn't lose the batch (batch log on other nodes)
  • Batch log written to 2 endpoints for redundancy

Cost:

  • Additional writes to 2 batch log endpoints
  • Higher latency than unlogged (must wait for batch log durability)
  • Increased coordinator memory usage

Unlogged Batch

BEGIN UNLOGGED BATCH
    UPDATE user_profile SET name = 'Alice' WHERE user_id = ?;
    UPDATE user_profile SET email = '[email protected]' WHERE user_id = ?;
    UPDATE user_profile SET updated_at = toTimestamp(now()) WHERE user_id = ?;
APPLY BATCH;

Execution:

  • No batch log write
  • Statements sent directly to replicas
  • No recovery if coordinator fails

Use when:

  • All statements target the same partition
  • Atomicity across coordinator failure not required
  • Lower latency needed

Counter Batch

BEGIN COUNTER BATCH
    UPDATE page_stats SET views = views + 1 WHERE page_id = 'home';
    UPDATE page_stats SET views = views + 1 WHERE page_id = 'about';
    UPDATE daily_stats SET requests = requests + 2 WHERE date = '2024-01-15';
APPLY BATCH;

Restrictions:

  • Can only contain counter updates
  • Cannot mix counter and non-counter statements
  • No TTL allowed

Parameters

USING TIMESTAMP

Applies a single timestamp to all statements:

BEGIN BATCH USING TIMESTAMP 1705315800000000
    INSERT INTO table1 (id, data) VALUES (1, 'a');
    INSERT INTO table2 (id, data) VALUES (2, 'b');
APPLY BATCH;

Behavior:

  • All mutations share the same timestamp
  • Individual statements cannot override
  • Cannot be used with LWT (IF conditions)

Same-Partition vs Multi-Partition Batches

All statements target the same partition:

-- Good: Single partition batch
BEGIN UNLOGGED BATCH
    INSERT INTO user_events (user_id, event_id, type) VALUES (123, uuid(), 'login');
    INSERT INTO user_events (user_id, event_id, type) VALUES (123, uuid(), 'page_view');
    INSERT INTO user_events (user_id, event_id, type) VALUES (123, uuid(), 'click');
APPLY BATCH;

Benefits:

  • Single coordinator to single replica set
  • Atomic at storage layer (same SSTable)
  • Minimal coordination overhead

Multi-Partition Batch (Use Carefully)

Statements target different partitions:

-- Acceptable: Denormalized tables that must stay consistent
BEGIN BATCH
    INSERT INTO users (user_id, email) VALUES (?, '[email protected]');
    INSERT INTO users_by_email (email, user_id) VALUES ('[email protected]', ?);
APPLY BATCH;

uml diagram

Costs:

  • Coordinator must contact multiple nodes
  • Batch log adds latency
  • Memory pressure on coordinator

Batch Anti-Patterns

Anti-Pattern 1: Batching for Performance

-- WRONG: Batching unrelated writes doesn't improve performance
BEGIN BATCH
    INSERT INTO users (user_id, name) VALUES (1, 'Alice');
    INSERT INTO users (user_id, name) VALUES (2, 'Bob');
    INSERT INTO users (user_id, name) VALUES (3, 'Charlie');
    -- ... 100 more unrelated inserts
APPLY BATCH;

Why it's slow:

  • Coordinator must track all mutations in memory
  • Single point of coordination
  • Larger network payload than parallel requests

Better approach:

// Parallel async inserts
List<CompletionStage<AsyncResultSet>> futures = new ArrayList<>();
for (User user : users) {
    futures.add(session.executeAsync(insertStmt.bind(user)));
}
CompletableFuture.allOf(futures.toArray()).join();

Anti-Pattern 2: Large Batches

-- WRONG: Batch too large
BEGIN BATCH
    -- 1000 INSERT statements
APPLY BATCH;

Problems:

  • Exceeds batch size thresholds
  • Coordinator memory exhaustion
  • Timeout likelihood increases

Thresholds (cassandra.yaml):

batch_size_warn_threshold_in_kb: 5
batch_size_fail_threshold_in_kb: 50

Anti-Pattern 3: Batching Different Tables Without Need

-- WRONG: No atomicity requirement
BEGIN BATCH
    INSERT INTO audit_log (id, action) VALUES (uuid(), 'user_created');
    INSERT INTO metrics (id, count) VALUES ('users', 1);
    INSERT INTO notifications (id, message) VALUES (uuid(), 'Welcome!');
APPLY BATCH;

Better approach:

Execute independently—if one fails, others can still succeed.


Conditional Batches (LWT)

Batches can include lightweight transaction conditions:

BEGIN BATCH
    INSERT INTO users (user_id, username) VALUES (?, 'alice') IF NOT EXISTS;
    INSERT INTO usernames (username, user_id) VALUES ('alice', ?) IF NOT EXISTS;
APPLY BATCH;

LWT Batch Behavior

Important: When ANY statement has an IF condition, ALL statements use Paxos:

uml diagram

Key points:

  • Entire batch is all-or-nothing
  • One failed condition aborts all statements
  • Significant performance impact

Restrictions

  • Cannot mix conditional and unconditional statements
  • All conditions must be on same partition
  • Cannot use USING TIMESTAMP with IF

Batch Size Limits

Configuration

# cassandra.yaml
batch_size_warn_threshold_in_kb: 5     # Log warning
batch_size_fail_threshold_in_kb: 50    # Reject batch

Monitoring

# Check for batch size warnings
grep "Batch" /var/log/cassandra/system.log

# Metrics
nodetool tablestats system.batches

Calculating Batch Size

Approximate formula:

batch_size ≈ sum(mutation_sizes) + overhead
mutation_size ≈ key_size + sum(column_sizes) + metadata

Best Practices

Do Use Batches For

Good Batch Use Cases

  1. Denormalized table consistency: Keep related tables in sync
  2. Same-partition atomicity: Multiple writes to one partition
  3. Conditional group operations: LWT across related rows

Don't Use Batches For

Batch Anti-Patterns

  1. Performance optimization: Use async parallel writes instead
  2. Large bulk loads: Use SSTable loader or async writes
  3. Unrelated writes: No atomicity needed
  4. Cross-datacenter atomicity: Batches don't provide this

Sizing Guidelines

Batch Type Recommended Size
Same-partition Up to 100 statements
Multi-partition 2-5 partitions
With LWT 2-3 statements

Monitoring

-- Check batch log table
SELECT * FROM system.batches;

-- Should typically be empty (batches complete quickly)

Restrictions

Restrictions

Statement Types:

  • Only INSERT, UPDATE, DELETE allowed
  • Cannot include SELECT statements
  • Cannot include DDL statements

Counter Batches:

  • Must use COUNTER keyword
  • Cannot mix counter and non-counter statements in same batch
  • No TTL allowed on counter updates
  • No IF conditions allowed with counters
  • Counter batches are always unlogged internally

Conditional Batches:

  • All IF conditions must be on same partition
  • Cannot use USING TIMESTAMP
  • Cannot mix conditional and unconditional statements

General:

  • Maximum batch size enforced by configuration
  • Batch log adds overhead to logged batches
  • Multi-partition batches not atomic if using UNLOGGED

Atomicity Scope

Batch Atomicity is Partition-Scoped

A critical misconception: batches are only atomic within a single partition at the storage layer.

Batch Type Single Partition Multi-Partition
LOGGED Atomic Eventual atomicity via batch log
UNLOGGED Atomic Not atomic - partial execution possible
COUNTER Atomic Not atomic

Multi-partition "atomicity" via logged batches:

  • Batch log ensures eventual delivery, not instant atomicity
  • Other reads may see partial results during batch execution
  • If batch log replay fails repeatedly, mutations may be lost
-- This is NOT instantly atomic across partitions
BEGIN BATCH
    INSERT INTO users (user_id, ...) VALUES ('user1', ...);    -- Partition 1
    INSERT INTO users (user_id, ...) VALUES ('user2', ...);    -- Partition 2
APPLY BATCH;
-- A concurrent read might see user1 but not user2

Large Batch Performance Degradation

Large Batches Cause Performance Problems

Batches are NOT a performance optimization. Large batches cause severe issues:

Issue Impact
Coordinator memory Entire batch held in memory until complete
GC pressure Large batches trigger garbage collection
Batch log pressure Logged batches write to batch log first
Timeout risk Large batches more likely to timeout
Replay storms Failed large batches cause replay overhead

Configuration limits:

# cassandra.yaml
batch_size_warn_threshold_in_kb: 5     # Warn above 5KB
batch_size_fail_threshold_in_kb: 50    # Fail above 50KB

Warning signs in logs:

WARN  Batch for [table] is of size 52KB, exceeding specified threshold of 5KB

Best practices:

  • Keep batches small (< 5KB, ideally < 20 statements)
  • Use batches for atomicity, not throughput
  • For bulk loading, use parallel individual writes or SSTable loader
  • Monitor BatchMetrics in JMX for batch sizes

Examples

Denormalized Table Insert

-- Maintain consistency between users and users_by_email
BEGIN BATCH
    INSERT INTO users (user_id, email, username, created_at)
    VALUES (?, '[email protected]', 'alice', toTimestamp(now()));

    INSERT INTO users_by_email (email, user_id, username)
    VALUES ('[email protected]', ?, 'alice');
APPLY BATCH;

Same-Partition Event Logging

BEGIN UNLOGGED BATCH
    INSERT INTO user_events (user_id, event_time, event_type, data)
    VALUES (123, toTimestamp(now()), 'login', '{"ip": "192.168.1.1"}');

    UPDATE user_stats SET login_count = login_count + 1
    WHERE user_id = 123;
APPLY BATCH;

Conditional User Registration

BEGIN BATCH
    INSERT INTO users (user_id, username, email)
    VALUES (?, 'desired_username', '[email protected]')
    IF NOT EXISTS;

    INSERT INTO usernames (username, user_id)
    VALUES ('desired_username', ?)
    IF NOT EXISTS;
APPLY BATCH;

Counter Aggregation

BEGIN COUNTER BATCH
    UPDATE daily_metrics SET page_views = page_views + 1
    WHERE date = '2024-01-15' AND page = 'home';

    UPDATE daily_metrics SET page_views = page_views + 1
    WHERE date = '2024-01-15' AND page = 'about';

    UPDATE total_metrics SET total_views = total_views + 2
    WHERE metric_id = 'all_time';
APPLY BATCH;

Batch with Timestamp

BEGIN BATCH USING TIMESTAMP 1705315800000000
    INSERT INTO events (event_id, type) VALUES (uuid(), 'imported');
    INSERT INTO audit_log (log_id, action) VALUES (uuid(), 'data_import');
APPLY BATCH;