Driver Best Practices¶

This page consolidates production configuration recommendations for Cassandra drivers. The examples use the Java driver 4.x API (CqlSession); the same principles apply to other drivers.

Session Management¶

Single Session per Application¶

Create one session and reuse it throughout the application lifecycle:

// CORRECT: Single session, created once
public class CassandraConfig {
    private static CqlSession session;

    public static synchronized CqlSession getSession() {
        if (session == null) {
            session = CqlSession.builder()
                .withLocalDatacenter("dc1")
                .build();
        }
        return session;
    }

    public static synchronized void shutdown() {
        if (session != null) {
            session.close();
            session = null;  // Never hand out a closed session
        }
    }
}

// WRONG: Session per request
public User getUser(UUID id) {
    try (CqlSession session = CqlSession.builder().build()) {  // Expensive!
        return session.execute(...);
    }
}

Aspect	Single Session	Session per Request
Connection overhead	Once at startup	Every request
Metadata discovery	Once	Every request
Prepared statement cache	Shared	Rebuilt each time
Resource usage	Predictable	Unbounded

Graceful Shutdown¶

Close the session cleanly on application shutdown:

Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    session.close();  // Waits for in-flight requests
}));

Connection Configuration¶

Contact Points¶

Provide multiple contact points for initial connection:

CqlSession session = CqlSession.builder()
    .addContactPoint(new InetSocketAddress("10.0.1.1", 9042))
    .addContactPoint(new InetSocketAddress("10.0.1.2", 9042))
    .addContactPoint(new InetSocketAddress("10.0.1.3", 9042))
    .withLocalDatacenter("dc1")
    .build();

The driver only needs one successful connection to discover the full cluster topology, but multiple contact points provide redundancy during startup.
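Contact points usually come from configuration rather than source code. The following is a minimal sketch of parsing such a setting; the `ContactPoints` class and the comma-separated `host:port` format are assumptions made for illustration, not part of the driver. It handles host names and IPv4 addresses only.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns "host1:port1,host2,..." into socket addresses.
final class ContactPoints {
    static final int DEFAULT_PORT = 9042;

    static List<InetSocketAddress> parse(String commaSeparated) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : commaSeparated.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                ? DEFAULT_PORT
                : Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }
}
```

The resulting list can be passed to `CqlSession.builder().addContactPoints(...)`.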

Local Datacenter¶

Always configure local datacenter explicitly in multi-DC deployments:

// REQUIRED for multi-DC
.withLocalDatacenter("dc1")

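With explicit contact points, Java driver 4.x refuses to build a session that has no local datacenter. Drivers or policies that infer the datacenter instead may route requests across datacenters, adding significant latency.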

Connection Pool Sizing¶

Default pool settings work for most workloads. Adjust only when:

Measured stream exhaustion occurs
Throughput exceeds tens of thousands of requests/second per node
Monitoring shows pool-related bottlenecks

// Only if needed based on measurements
// Driver 4.x uses fixed-size pools per node, set through the configuration API
.withConfigLoader(
    DriverConfigLoader.programmaticBuilder()
        .withInt(DefaultDriverOption.CONNECTION_POOL_REMOTE_SIZE, 1)
        .withInt(DefaultDriverOption.CONNECTION_POOL_LOCAL_SIZE, 2)
        .build())
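A rough way to judge whether the defaults are enough is Little's law: requests in flight per node is approximately throughput multiplied by mean latency. The sketch below applies that estimate. The `PoolSizing` class is hypothetical, and the per-connection limit of 1024 concurrent requests used in the usage note is an assumption; check the value configured for your driver.

```java
// Hypothetical helper: estimates connections per node from measured load.
final class PoolSizing {
    static int connectionsNeeded(double requestsPerSecond,
                                 double meanLatencySeconds,
                                 int maxRequestsPerConnection) {
        // Little's law: concurrent requests = arrival rate * time in system
        double inFlight = requestsPerSecond * meanLatencySeconds;
        // Always keep at least one connection open
        return Math.max(1, (int) Math.ceil(inFlight / maxRequestsPerConnection));
    }
}
```

For example, 50,000 requests/second per node at 5 ms mean latency is about 250 requests in flight, which fits in a single connection under the assumed limit: `PoolSizing.connectionsNeeded(50_000, 0.005, 1024)` returns 1.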

Query Execution¶

Use Prepared Statements¶

Prepare all production queries:

// Prepare once at startup
private final PreparedStatement selectUser = session.prepare(
    "SELECT * FROM users WHERE user_id = ?");

// Execute with bound values
public User getUser(UUID userId) {
    Row row = session.execute(selectUser.bind(userId)).one();
    return mapToUser(row);
}

Benefits:

Reduced parsing overhead
Token-aware routing
Protection against CQL injection

Set Appropriate Consistency Levels¶

Choose consistency level based on requirements:

BoundStatement statement = selectUser.bind(userId)
    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);  // Explicit

Use Case	Recommended CL
Strong consistency reads	LOCAL_QUORUM
Strong consistency writes	LOCAL_QUORUM
Eventually consistent reads	LOCAL_ONE
Analytics/reporting	ONE
Cross-DC consistency	QUORUM or EACH_QUORUM
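The pairing of quorum reads with quorum writes follows from simple arithmetic: a read is guaranteed to overlap the latest acknowledged write when read replicas plus write replicas exceed the replication factor. The sketch below works this through; the `QuorumMath` class is a hypothetical illustration, not a driver API.

```java
// Hypothetical helper illustrating quorum arithmetic.
final class QuorumMath {
    // A quorum is a strict majority of replicas
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Reads overlap the latest acknowledged write when R + W > RF
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas,
                                       int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }
}
```

With a replication factor of 3, a quorum is 2 replicas, so quorum reads plus quorum writes touch 4 > 3 replicas and must overlap. Reading and writing at ONE touches only 2, which is why that combination is eventually consistent.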

Set Query Timeouts¶

Configure appropriate timeouts:

BoundStatement statement = selectUser.bind(userId)
    .setTimeout(Duration.ofSeconds(5));  // Query-specific timeout

Timeout Type	Recommendation
Read timeout	5-10 seconds (longer than expected P99)
Write timeout	10-30 seconds (allow for hints, batches)
Connection timeout	5 seconds

Error Handling¶

Handle Specific Exceptions¶

try {
    session.execute(statement);
} catch (AllNodesFailedException e) {
    // No node could serve the request (includes NoNodeAvailableException)
    // - circuit breaker or fail
    log.error("Cluster unavailable", e);
    throw new ServiceUnavailableException();  // Application-defined exception

} catch (ReadTimeoutException e) {
    // Replica(s) didn't respond - may retry
    log.warn("Read timeout: received {}/{} required",
        e.getReceived(), e.getBlockFor());

} catch (WriteTimeoutException e) {
    // Write may or may not have succeeded
    log.error("Write timeout for {}: received {}/{}",
        e.getWriteType(), e.getReceived(), e.getBlockFor());
    // DO NOT retry non-idempotent writes automatically

} catch (UnavailableException e) {
    // Not enough replicas alive
    log.warn("Unavailable: alive {}/{} required",
        e.getAlive(), e.getRequired());
}

Idempotency Marking¶

Mark idempotent operations explicitly:

// Safe to retry
BoundStatement readStatement = selectUser.bind(userId)
    .setIdempotent(true);

// NOT safe to retry
BoundStatement counterStatement = updateCounter.bind(pageId)
    .setIdempotent(false);
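The same rule applies to any retry logic added at the application level. The following is a minimal sketch, assuming a hypothetical `IdempotentRetry` wrapper around an arbitrary operation; it is separate from the driver's own retry policy and only illustrates the decision.

```java
import java.util.function.Supplier;

// Hypothetical application-level wrapper: retries only idempotent operations.
final class IdempotentRetry {
    static <T> T call(Supplier<T> operation, boolean idempotent, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        // A non-idempotent operation gets exactly one attempt
        int attempts = idempotent ? maxAttempts : 1;
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if allowed
            }
        }
        throw last; // every permitted attempt failed
    }
}
```

A counter increment passed with `idempotent = false` fails on the first error instead of risking a double count.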

Policy Configuration¶

Production Policy Template¶

// Driver 4.x configures policies through the configuration API
DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
    // Load balancing: token-aware with DC awareness
    .withString(DefaultDriverOption.LOAD_BALANCING_POLICY_CLASS,
        "DefaultLoadBalancingPolicy")

    // Retry: conservative, respects idempotency
    .withString(DefaultDriverOption.RETRY_POLICY_CLASS, "DefaultRetryPolicy")

    // Reconnection: exponential backoff
    .withString(DefaultDriverOption.RECONNECTION_POLICY_CLASS,
        "ExponentialReconnectionPolicy")
    .withDuration(DefaultDriverOption.RECONNECTION_BASE_DELAY, Duration.ofSeconds(1))
    .withDuration(DefaultDriverOption.RECONNECTION_MAX_DELAY, Duration.ofMinutes(5))

    // Speculative execution: disabled by default
    // Enable only for idempotent, latency-sensitive queries
    // .withString(DefaultDriverOption.SPECULATIVE_EXECUTION_POLICY_CLASS, ...)
    .build();

CqlSession session = CqlSession.builder()
    .addContactPoints(contactPoints)
    .withLocalDatacenter("dc1")
    .withConfigLoader(loader)
    .build();
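To see what a 1 second base delay and a 5 minute cap mean in practice, the sketch below computes a plain doubling schedule. The `BackoffSchedule` class is hypothetical and deliberately simplified: real reconnection policies typically add random jitter, which is omitted here.

```java
import java.time.Duration;

// Hypothetical helper: doubling backoff capped at a maximum, without jitter.
final class BackoffSchedule {
    static Duration delay(Duration base, Duration max, int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be non-negative");
        }
        // Cap the exponent so the multiplier stays well inside the long range
        long factor = 1L << Math.min(attempt, 30);
        long millis = base.toMillis() * factor;
        return millis >= max.toMillis() ? max : Duration.ofMillis(millis);
    }
}
```

With these settings the delays run 1 s, 2 s, 4 s, and so on up to 256 s on the ninth attempt (index 8), after which every attempt waits the full 5 minutes.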

Per-Query Policy Override¶

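Override policies for specific query types. In driver 4.x this is done with execution profiles rather than per-statement policy setters:

// Define the profile once, in the config loader passed to the session builder
// (delay and max values are illustrative)
DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
    .startProfile("fast-read")
        .withString(DefaultDriverOption.SPECULATIVE_EXECUTION_POLICY_CLASS,
            "ConstantSpeculativeExecutionPolicy")
        .withInt(DefaultDriverOption.SPECULATIVE_EXECUTION_MAX, 2)
        .withDuration(DefaultDriverOption.SPECULATIVE_EXECUTION_DELAY, Duration.ofMillis(100))
    .endProfile()
    .build();

// Latency-sensitive read with speculative execution
BoundStatement fastRead = selectUser.bind(userId)
    .setIdempotent(true)
    .setExecutionProfileName("fast-read");

// Non-idempotent write: never speculatively executed or blindly re-sent
BoundStatement counterUpdate = incrementCounter.bind(pageId)
    .setIdempotent(false);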

Monitoring¶

Essential Metrics¶

Monitor these driver metrics:

Metric Category	Key Metrics
Latency	Request latency percentiles (P50, P95, P99)
Throughput	Requests per second
Errors	Error rate by type (timeout, unavailable, etc.)
Connections	Open connections per node
Pool	In-flight requests, available streams
Retries	Retry rate, retry success rate
Speculative	Trigger rate, win rate
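For readers unfamiliar with how the latency percentiles are defined, the sketch below computes them from raw samples with the nearest-rank method. The `LatencyPercentiles` class is hypothetical; in production these figures would normally come from the metrics library's histograms rather than hand-rolled code.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentile over raw latency samples.
final class LatencyPercentiles {
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = samplesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        // Nearest rank: smallest value with at least p percent of samples at or below it
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

For the samples {5, 1, 9, 3, 7}, P50 is 5 and P99 is 9. A single slow request dominates the high percentiles, which is why P99 is tracked separately from the median.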

Health Checks¶

Implement application health checks:

public boolean isHealthy() {
    try {
        // Simple query to verify connectivity
        session.execute("SELECT now() FROM system.local");
        return true;
    } catch (Exception e) {
        return false;
    }
}

Logging¶

Configure appropriate driver logging:

<!-- Log connection events -->
<logger name="com.datastax.oss.driver.internal.core.pool" level="INFO"/>

<!-- Log retries and speculative execution -->
<logger name="com.datastax.oss.driver.internal.core.retry" level="DEBUG"/>

<!-- Reduce noise from metadata refresh -->
<logger name="com.datastax.oss.driver.internal.core.metadata" level="WARN"/>

Common Anti-Patterns¶

Anti-Pattern	Problem	Solution
Session per request	Massive overhead	Single shared session
Unprepared statements in loops	Parsing overhead, no token-aware routing	Prepare and reuse
Ignoring local datacenter	Cross-DC latency	Configure explicitly
Retrying non-idempotent writes	Data corruption	Mark idempotency, custom retry
Unbounded IN clauses	Prepared statement cache churn	Fixed sizes or pagination
Synchronous calls in async context	Thread pool exhaustion	Use async API consistently
No timeout configuration	Driver default may not fit the workload (2 seconds in Java driver 4.x)	Set explicit timeouts
Catching generic Exception	Hides specific error handling	Catch specific exceptions
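The "fixed sizes" remedy for unbounded IN clauses amounts to splitting the key list into bounded chunks before querying. The following is a minimal sketch, assuming a hypothetical `KeyBatches` helper; the chunk size is a tuning choice, not a driver setting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a key list into chunks of bounded size.
final class KeyBatches {
    static <T> List<List<T>> partition(List<T> keys, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, keys.size());
            // Copy the view so each chunk is independent of the source list
            chunks.add(new ArrayList<>(keys.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk can then be bound to one prepared statement that takes the whole list as a single `IN ?` value, or issued as parallel single-key queries, so the number of distinct statement strings stays constant.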

Checklist¶

Before deploying to production:

[ ] Single session instance shared across application
[ ] Local datacenter configured explicitly
[ ] All queries use prepared statements
[ ] Consistency levels set explicitly
[ ] Timeouts configured appropriately
[ ] Idempotent operations marked
[ ] Error handling for specific exception types
[ ] Driver metrics exported to monitoring
[ ] Health check endpoint implemented
[ ] Graceful shutdown configured
[ ] Connection pool sized appropriately (if non-default)
[ ] Retry policy reviewed for workload
[ ] Speculative execution evaluated (if latency-sensitive)

Connection Management — Connection pooling details
Policies — Policy configuration reference
Prepared Statements — Statement preparation and caching