cassandra-stress¶
The official Cassandra benchmarking and load testing tool.
Overview¶
cassandra-stress is a Java-based tool for:

- Benchmarking cluster performance
- Load testing before production
- Capacity planning
- Regression testing
Basic Commands¶
Write Test¶
# Insert 1 million rows
cassandra-stress write n=1000000
# With specific thread count
cassandra-stress write n=1000000 -rate threads=50
# With consistency level
cassandra-stress write n=1000000 cl=LOCAL_QUORUM
# Against specific nodes
cassandra-stress write n=1000000 -node 192.168.1.10,192.168.1.11
Read Test¶
# Read 1 million rows (requires prior write)
cassandra-stress read n=1000000 -rate threads=50
# No warmup
cassandra-stress read n=1000000 no-warmup
Mixed Workload¶
# 50% read, 50% write
cassandra-stress mixed ratio\(write=1,read=1\) n=1000000
# 70% read, 30% write
cassandra-stress mixed ratio\(read=7,write=3\) n=1000000
# Duration-based
cassandra-stress mixed ratio\(read=7,write=3\) duration=10m
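The ratio values are relative weights, so `ratio(read=7,write=3)` means roughly 7 reads for every 3 writes. The sketch below turns a pair of weights into expected operation counts for planning purposes. `expected_ops` is a hypothetical helper, not part of cassandra-stress, and the mix in an actual run may differ slightly from these figures.

```shell
# expected_ops READ_WEIGHT WRITE_WEIGHT TOTAL_OPS
# Prints the expected number of reads and writes for a mixed run.
# Hypothetical planning helper; cassandra-stress does not provide it.
expected_ops() {
    awk -v r="$1" -v w="$2" -v n="$3" 'BEGIN {
        reads = int(n * r / (r + w))   # share of reads = r / (r + w)
        printf "reads=%d writes=%d\n", reads, n - reads
    }'
}

expected_ops 7 3 1000000
```

For the 70/30 example above this prints `reads=700000 writes=300000`.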
Connection Options¶
Target Nodes¶
-node 192.168.1.10
-node 192.168.1.10,192.168.1.11,192.168.1.12
Authentication¶
-mode native cql3 user=cassandra password=cassandra
SSL/TLS¶
-transport "truststore=/path/truststore.jks truststore-password=pass"
# Full SSL config
-transport "truststore=/path/truststore.jks truststore-password=pass keystore=/path/keystore.jks keystore-password=pass"
CQL Protocol¶
-mode native cql3
-port native=9042
Rate Limiting¶
Thread-Based¶
# Fixed thread count
-rate threads=100
# Thread range (auto-tune); quote each argument so the shell does not treat > and < as redirections
-rate "threads>=50" "threads<=200"
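To compare several fixed thread counts by hand, the same write test can be repeated once per value. The following is a minimal sketch that only prints the commands (a dry run). It uses the `write`, `-rate threads=` and `-log file=` options shown in this guide. The function name `sweep_commands` and the log file naming are assumptions made for illustration.

```shell
# sweep_commands OPS THREADS...
# Prints one cassandra-stress invocation per thread count (dry run only).
# Pipe the output to `sh` to actually run the tests.
sweep_commands() {
    ops="$1"
    shift
    for t in "$@"; do
        # Each run logs to its own file, e.g. write_50t.log (hypothetical naming)
        echo "cassandra-stress write n=$ops -rate threads=$t -log file=write_${t}t.log"
    done
}

sweep_commands 1000000 25 50 100
```

Printing the commands first makes it easy to review the sweep before putting load on a cluster.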
Throughput-Based¶
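# Cap throughput at 10,000 ops/sec (an upper bound, with no implied schedule)
-rate threads=50 throttle=10000/s
# Fixed rate of 5,000 ops/sec (requests follow an implied schedule); threads= is still required
-rate threads=50 fixed=5000/s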
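A throughput cap also sets a floor on how long a count-based run can take: the operation count divided by the rate. The sketch below works that out, assuming the cap is the limiting factor and the cluster keeps up with it. `min_runtime_seconds` is a hypothetical helper name.

```shell
# min_runtime_seconds OPS RATE_PER_SECOND
# Lower bound on run time under a throughput cap: OPS / RATE, rounded up.
min_runtime_seconds() {
    awk -v n="$1" -v r="$2" 'BEGIN {
        t = int(n / r)
        if (t * r < n) t++   # round up when the division is not exact
        print t
    }'
}

min_runtime_seconds 1000000 10000
```

With `n=1000000` and `throttle=10000/s` this prints `100`, so the run cannot finish in under 100 seconds.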
Custom Schema¶
YAML Profile¶
# user_profile.yaml
keyspace: stress_test
keyspace_definition: |
  CREATE KEYSPACE stress_test WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3
  };
table: users
table_definition: |
  CREATE TABLE users (
    user_id uuid,
    username text,
    email text,
    created_at timestamp,
    profile_data blob,
    PRIMARY KEY (user_id)
  )
columnspec:
  - name: user_id
    size: fixed(36)
    population: uniform(1..10000000)
  - name: username
    size: gaussian(5..20)
    population: uniform(1..10000000)
  - name: email
    size: gaussian(15..50)
  - name: created_at
    cluster: fixed(1)
  - name: profile_data
    size: gaussian(100..1000)
insert:
  partitions: fixed(1)
  batchtype: UNLOGGED
queries:
  read_user:
    cql: SELECT * FROM users WHERE user_id = ?
    fields: samerow
  read_username:
    cql: SELECT username, email FROM users WHERE user_id = ?
    fields: samerow
Run with Profile¶
# Insert data
cassandra-stress user profile=user_profile.yaml \
ops\(insert=1\) n=1000000
# Mixed operations
cassandra-stress user profile=user_profile.yaml \
ops\(insert=1,read_user=3\) duration=30m
# Specific query
cassandra-stress user profile=user_profile.yaml \
ops\(read_user=1\) n=500000
Column Specifications¶
Size Distributions¶
columnspec:
  # Fixed size
  - name: id
    size: fixed(36)
  # Gaussian distribution
  - name: data
    size: gaussian(100..500) # mean ~300
  # Uniform distribution
  - name: content
    size: uniform(50..200)
  # Exponential distribution
  - name: blob
    size: exp(100..10000)
Population Distributions¶
columnspec:
  - name: user_id
    population: uniform(1..1000000)
  - name: partition_key
    population: gaussian(1..100000)
  # Sequence (incremental)
  - name: seq_id
    population: seq(1..10000000)
Output and Logging¶
Log to File¶
cassandra-stress write n=1000000 -log file=stress.log
Graph Output¶
cassandra-stress write n=1000000 -graph file=results.html title="Write Test"
Interval Reporting¶
# Report every 5 seconds
cassandra-stress write n=1000000 -log interval=5
Understanding Results¶
Key Metrics¶
Results:
Op rate : 45,231 op/s # Operations per second
Partition rate: 45,231 pk/s # Partitions per second
Row rate : 45,231 row/s # Rows per second
Latency mean : 4.4 ms # Average latency
Latency median: 2.1 ms # 50th percentile
Latency 95th : 12.3 ms # 95th percentile
Latency 99th : 35.2 ms # 99th percentile
Latency max : 245.1 ms # Maximum observed
Total errors : 0 # Error count
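When results are saved to a file, individual metrics can be pulled out for comparison between runs. The sketch below assumes summary lines follow the `label : value unit` layout shown above. Real output may use longer labels or extra columns depending on the version, so the label passed in should match what the tool actually prints. `metric` is a hypothetical helper name.

```shell
# metric LABEL FILE
# Prints the numeric value of a summary line such as "Latency 99th : 35.2 ms".
# Assumes the "label : value unit" layout; thousands separators are removed.
# Example: metric "Latency 99th" stress.log
metric() {
    awk -F':' -v name="$1" '
        { label = $1; gsub(/^[ \t]+|[ \t]+$/, "", label) }   # trim the label
        label == name {
            split($2, parts, " ")     # first field after the colon is the value
            gsub(/,/, "", parts[1])   # 45,231 -> 45231
            print parts[1]
            exit
        }' "$2"
}
```

Removing the thousands separator keeps the value usable in later arithmetic or threshold checks.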
Performance Guidelines¶
| Metric | Good | Warning | Bad |
|---|---|---|---|
| p95 latency | < 20ms | 20-50ms | > 50ms |
| p99 latency | < 50ms | 50-100ms | > 100ms |
| Error rate | 0% | < 0.1% | > 0.1% |
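The latency thresholds in the table can be applied mechanically, for example in a regression check. This is a minimal sketch that encodes only the p95 and p99 rows above. `classify_latency` is a hypothetical helper name, and the thresholds are the guideline values from the table rather than universal limits.

```shell
# classify_latency PERCENTILE MILLISECONDS
# Maps a latency to good/warning/bad using the p95 and p99 rows of the table.
classify_latency() {
    case "$1" in
        p95) warn=20; bad=50 ;;
        p99) warn=50; bad=100 ;;
        *) echo "unknown percentile: $1" >&2; return 2 ;;
    esac
    # awk handles the decimal comparison, which plain sh arithmetic cannot
    awk -v v="$2" -v warn="$warn" -v bad="$bad" 'BEGIN {
        if (v < warn) print "good"
        else if (v <= bad) print "warning"
        else print "bad"
    }'
}

classify_latency p95 12.3
classify_latency p99 35.2
```

Both sample values from the results above fall in the "good" range.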
Counter Operations¶
# Counter writes
cassandra-stress counter_write n=1000000 -rate threads=50
# Counter reads
cassandra-stress counter_read n=1000000
Advanced Examples¶
Warm-Up Then Test¶
# Warm-up phase
cassandra-stress write n=100000 -rate threads=10
# Actual test
cassandra-stress write n=5000000 -rate threads=100
Multiple DCs¶
# cl= is a command option, so it goes before the -node and -rate option groups
cassandra-stress write n=1000000 cl=LOCAL_QUORUM \
  -node dc1-node1,dc1-node2 \
  -rate threads=100
Compaction Stress Test¶
# Heavy writes to trigger compaction
cassandra-stress write n=10000000 \
-rate threads=200 \
-schema "replication(strategy=NetworkTopologyStrategy,dc1=3)" \
-log interval=10
Time-Series Workload¶
# timeseries_profile.yaml
keyspace: metrics
table: sensor_data
table_definition: |
  CREATE TABLE sensor_data (
    sensor_id text,
    bucket text,
    ts timestamp,
    value double,
    PRIMARY KEY ((sensor_id, bucket), ts)
  ) WITH CLUSTERING ORDER BY (ts DESC)
columnspec:
  - name: sensor_id
    size: fixed(10)
    population: uniform(1..1000)
  - name: bucket
    size: fixed(10)
  - name: ts
    cluster: uniform(1..1000)
  - name: value
    population: gaussian(0..100)
Troubleshooting¶
Connection Errors¶
# Verify connectivity
cassandra-stress write n=1 -node 192.168.1.10
# Check that nodes are up (UN), then that the native transport is running
nodetool status
nodetool statusbinary
Out of Memory¶
# Increase stress tool heap
export JVM_OPTS="-Xms4G -Xmx4G"
cassandra-stress write n=10000000
Throttling Issues¶
# Reduce thread count
-rate threads=25
# Add throttle limit
-rate threads=50 throttle=5000/s
Next Steps¶
- Benchmarking - Benchmarking guide
- Performance - Performance tuning
- Monitoring - Monitor during tests