Cassandra Anti-Patterns¶
These patterns work fine in development and testing. They might even work in early production. Then data grows, traffic increases, and they fail—usually in ways that are not obvious without understanding Cassandra internals.
Unbounded partitions grow until reads time out. Tombstone accumulation makes queries slower over time. ALLOW FILTERING queries that took milliseconds start taking minutes. Queue patterns create tombstones faster than compaction can remove them.
The fix is usually straightforward once the problem is understood. This guide covers each anti-pattern: why it seems reasonable, how it fails, and how to fix it.
Anti-Pattern Severity Reference¶
| Anti-Pattern | Severity | Time to Impact | Symptoms |
|---|---|---|---|
| Unbounded partitions | Critical | Weeks-Months | Timeouts, OOM, repair failures |
| Tombstone accumulation | Critical | Days-Weeks | Increasing read latency, query failures |
| ALLOW FILTERING | Critical | Immediate | Full cluster scan, timeouts |
| Queue pattern | High | Days | Tombstones, hot partitions |
| Secondary index abuse | High | Immediate | All-node queries, high latency |
| Large IN clauses | High | Immediate | Coordinator overload |
| Collection abuse | Medium | Weeks | Memory pressure, slow writes |
| Counter misuse | Medium | Immediate | Errors, data inconsistency |
| Over-normalization | Medium | Immediate | Multiple queries per operation |
Unbounded Partitions (Critical)¶
The Problem¶
Partitions that grow without limit eventually exceed what Cassandra can handle efficiently.
-- ANTI-PATTERN: Partition grows forever
CREATE TABLE user_activity (
user_id UUID,
activity_time TIMESTAMP,
activity_type TEXT,
details TEXT,
PRIMARY KEY ((user_id), activity_time)
);
What happens over time:
Month 1: 1,000 rows per user → ~100 KB per partition ✓
Month 6: 50,000 rows per user → ~5 MB per partition ✓
Year 1: 500,000 rows per user → ~50 MB per partition ⚠
Year 3: 5,000,000 rows per user → ~500 MB per partition ✗
Year 5: 15,000,000 rows per user → ~1.5 GB per partition ☠
What Goes Wrong¶
| Stage | Partition Size | Symptoms |
|---|---|---|
| 1. Read Performance Degrades | > 100 MB | Queries slow down. "Get latest activity" must scan an increasingly large partition to find recent rows. Bloom filters only skip SSTables that do not contain the partition at all; they do not help within a huge partition. |
| 2. Compaction Problems | > 500 MB | Compacting a huge partition keeps large per-partition state on the heap. Compaction threads consume heap, leading to GC pauses. Other operations stall waiting for compaction to complete. |
| 3. Repair Failures | > 1 GB | Repair streams entire partitions between nodes. Streaming 1GB over network causes timeouts. Repair fails repeatedly. Data consistency cannot be maintained. |
| 4. Cluster Instability | > 2 GB | Nodes hosting these partitions become unresponsive. Hints queue up for unavailable nodes. Cascade failures begin. Full cluster becomes unstable. |
Detection¶
# Check partition size distribution
nodetool tablehistograms keyspace.user_activity
# Warning signs in output:
# 99th percentile partition size > 100MB
# Max partition size > 500MB
# Check for large partition warnings in logs
grep "Compacting large partition" /var/log/cassandra/system.log
grep "Writing large partition" /var/log/cassandra/system.log
Solution: Time Bucketing¶
-- CORRECT: Bounded partitions by time bucket
CREATE TABLE user_activity (
user_id UUID,
month TEXT, -- '2024-01'
activity_time TIMESTAMP,
activity_type TEXT,
details TEXT,
PRIMARY KEY ((user_id, month), activity_time)
) WITH CLUSTERING ORDER BY (activity_time DESC);
Now each partition covers one month:
user_id=alice, month='2024-01': 40,000 rows, ~4 MB ✓
user_id=alice, month='2024-02': 38,000 rows, ~3.8 MB ✓
user_id=alice, month='2024-03': 42,000 rows, ~4.2 MB ✓
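At the application level the bucket is just a string derived from the event timestamp. A minimal Python sketch (DataStax Python driver; table and column names follow the example above) of writing to and reading from the current month's bucket:
from datetime import datetime, timezone

def month_bucket(ts: datetime) -> str:
    """Derive the partition bucket ('2024-01') from an event timestamp."""
    return ts.strftime('%Y-%m')

def record_activity(session, user_id, activity_type, details):
    ts = datetime.now(timezone.utc)
    insert = session.prepare(
        "INSERT INTO user_activity (user_id, month, activity_time, activity_type, details) "
        "VALUES (?, ?, ?, ?, ?)")
    session.execute(insert, (user_id, month_bucket(ts), ts, activity_type, details))

def latest_activity(session, user_id, limit=100):
    # Read the current month's partition; step back one bucket if more rows are needed
    select = session.prepare(
        "SELECT * FROM user_activity WHERE user_id = ? AND month = ? LIMIT ?")
    return session.execute(select, (user_id, month_bucket(datetime.now(timezone.utc)), limit))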
Migration Strategy¶
For existing unbounded partitions:
-- 1. Create new time-bucketed table
CREATE TABLE user_activity_v2 (...);
-- 2. Migrate data in application
-- Read from old table, write to new with bucket
-- 3. Dual-write during transition
-- Write to both tables
-- 4. Switch reads to new table
-- 5. Drop old table after validation
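A sketch of the dual-write step in Python (assuming user_activity_v2 adds the month column as above; the helper below is illustrative, not part of any driver API):
def write_activity(session, user_id, ts, activity_type, details):
    # Dual-write during migration: old unbounded table plus new bucketed table
    old = session.prepare(
        "INSERT INTO user_activity (user_id, activity_time, activity_type, details) "
        "VALUES (?, ?, ?, ?)")
    new = session.prepare(
        "INSERT INTO user_activity_v2 (user_id, month, activity_time, activity_type, details) "
        "VALUES (?, ?, ?, ?, ?)")
    session.execute(old, (user_id, ts, activity_type, details))
    session.execute(new, (user_id, ts.strftime('%Y-%m'), ts, activity_type, details))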
Tombstone Accumulation (Critical)¶
The Problem¶
Every DELETE in Cassandra creates a tombstone—a marker that says "this data was deleted." Tombstones must be scanned during reads until they're purged by compaction.
-- ANTI-PATTERN: High-frequency deletes
CREATE TABLE message_queue (
queue_id TEXT,
message_id TIMEUUID,
payload TEXT,
PRIMARY KEY ((queue_id), message_id)
);
-- Application pattern
INSERT INTO message_queue (...) VALUES (...);
-- Process message
DELETE FROM message_queue WHERE queue_id = ? AND message_id = ?;
-- Repeat thousands of times per hour
What Goes Wrong¶
Day 1: 100 live rows, 1,000 tombstones
Read scans 1,100 entries to return 100 rows
Query time: 5ms
Day 7: 100 live rows, 50,000 tombstones
Read scans 50,100 entries to return 100 rows
Query time: 200ms
Day 30: 100 live rows, 500,000 tombstones
Read scans 500,100 entries to return 100 rows
Query time: 2,000ms → TIMEOUT
Eventually: TombstoneOverwhelmingException
Query fails at tombstone_failure_threshold (default 100,000)
Why Tombstones Persist¶
Tombstones cannot be removed until:
1. gc_grace_seconds has passed (default 10 days)
2. All replicas have the tombstone (ensured by repair)
3. Compaction runs and merges SSTables containing the tombstone
| Day | Event | Notes |
|---|---|---|
| 0 | DELETE executed | Tombstone created |
| 1-9 | Tombstone must stay | gc_grace_seconds = 10 days |
| 10 | Tombstone eligible for removal | Requires compaction to actually remove |
| ?? | Compaction runs | Tombstone finally removed |
Tombstone Removal Risks
- If repair has not run: Tombstone removal is unsafe (data resurrection risk)
- If compaction is behind: Tombstone persists indefinitely
Detection¶
-- Enable query tracing
TRACING ON;
SELECT * FROM message_queue WHERE queue_id = 'orders';
-- Look for in trace output:
-- "Read 100 live rows and 50000 tombstone cells"
# Check tombstone metrics
nodetool tablestats keyspace.message_queue | grep -i tombstone
# Check for warnings in logs
grep "tombstone" /var/log/cassandra/system.log
grep "TombstoneOverwhelmingException" /var/log/cassandra/system.log
# cassandra.yaml thresholds
tombstone_warn_threshold: 1000 # Log warning
tombstone_failure_threshold: 100000 # Fail query
Solutions¶
Solution 1: Use TTL instead of DELETE
-- Let data expire automatically
INSERT INTO message_queue (queue_id, message_id, payload)
VALUES (?, now(), ?)
USING TTL 3600; -- Expires after 1 hour
-- No explicit DELETE needed
-- TTL creates tombstones too, but predictably
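The TTL can also be bound per message from the application. A short sketch (Python driver; binding the TTL with USING TTL ? is an assumption worth verifying against your Cassandra version):
from datetime import datetime, timezone
from cassandra.util import uuid_from_time

def enqueue(session, queue_id, payload, ttl_seconds=3600):
    insert = session.prepare(
        "INSERT INTO message_queue (queue_id, message_id, payload) "
        "VALUES (?, ?, ?) USING TTL ?")
    message_id = uuid_from_time(datetime.now(timezone.utc))  # client-side TIMEUUID
    session.execute(insert, (queue_id, message_id, payload, ttl_seconds))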
Solution 2: Time-windowed tables with TWCS
CREATE TABLE message_queue (
queue_id TEXT,
hour TIMESTAMP,
message_id TIMEUUID,
payload TEXT,
PRIMARY KEY ((queue_id, hour), message_id)
) WITH compaction = {
'class': 'TimeWindowCompactionStrategy',
'compaction_window_size': '1',
'compaction_window_unit': 'HOURS'
}
AND default_time_to_live = 86400
AND gc_grace_seconds = 3600;
-- TWCS drops entire SSTables when all data expires
-- No tombstone accumulation
Solution 3: Reduce gc_grace_seconds (with frequent repair)
-- Only if repair runs more frequently than gc_grace_seconds
ALTER TABLE message_queue WITH gc_grace_seconds = 86400; -- 1 day
-- MUST run repair daily or more frequently
-- Otherwise: data resurrection risk
Solution 4: Range deletes for batch cleanup
-- Instead of individual deletes
-- Accumulate work, then delete an entire range at once
-- Delete all messages older than 1 hour (one range tombstone instead of one per row)
DELETE FROM message_queue
WHERE queue_id = 'orders'
AND message_id < minTimeuuid('2024-01-15 14:00:00');
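The cutoff can also be computed client-side and bound as an ordinary parameter. A sketch assuming the Python driver's cassandra.util helpers:
from datetime import datetime, timedelta, timezone
from cassandra.util import min_uuid_from_time

def purge_old_messages(session, queue_id, older_than_hours=1):
    # One range delete writes a single range tombstone covering the whole span,
    # instead of one tombstone per row from individual DELETEs
    cutoff = datetime.now(timezone.utc) - timedelta(hours=older_than_hours)
    delete = session.prepare(
        "DELETE FROM message_queue WHERE queue_id = ? AND message_id < ?")
    session.execute(delete, (queue_id, min_uuid_from_time(cutoff.timestamp())))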
ALLOW FILTERING (Critical)¶
The Problem¶
ALLOW FILTERING tells Cassandra to scan all partitions in a table, filtering results in memory.
-- ANTI-PATTERN: Query without partition key
SELECT * FROM users WHERE country = 'US' ALLOW FILTERING;
What Actually Happens¶
For SELECT * FROM users WHERE country = 'US' ALLOW FILTERING:
| Step | Action |
|---|---|
| 1 | Coordinator receives query |
| 2 | Coordinator sends request to all nodes: "Scan ALL your partitions, filter by country" |
| 3 | Each node reads every SSTable, deserializes every row, checks if country = 'US' |
| 4 | Each node returns matching rows to coordinator |
| 5 | Coordinator aggregates and returns results |
Complexity: O(n) where n = total rows in table
| Table Size | Rows Scanned |
|---|---|
| 1 million users | 1 million rows |
| 10 million users | 10 million rows |
| 100 million users | 100 million rows |
When ALLOW FILTERING Is Acceptable¶
In very limited cases:
-- OK: Small table (< 10,000 rows), infrequent queries
SELECT * FROM configuration WHERE feature = 'dark_mode' ALLOW FILTERING;
-- OK: Already constrained to single partition
SELECT * FROM user_orders
WHERE user_id = 550e8400-e89b-12d3-a456-426614174000
AND status = 'pending'
ALLOW FILTERING;
-- Only scans one partition, not entire table
Solutions¶
Solution 1: Create a table for the query
-- Instead of: SELECT * FROM users WHERE country = 'US' ALLOW FILTERING
-- Create dedicated table
CREATE TABLE users_by_country (
country TEXT,
user_id UUID,
username TEXT,
email TEXT,
PRIMARY KEY ((country), user_id)
);
-- Query without ALLOW FILTERING
SELECT * FROM users_by_country WHERE country = 'US' LIMIT 100;
Solution 2: Secondary index (low-cardinality only)
-- For low-cardinality columns (< 100 unique values)
CREATE INDEX ON users (status); -- active/inactive/pending
-- Query uses index
SELECT * FROM users WHERE status = 'active';
-- WARNING: Still queries all nodes, but more efficiently
Solution 3: SAI (Cassandra 5.0+)
-- Storage-Attached Index for high-cardinality
CREATE CUSTOM INDEX ON users (email)
USING 'StorageAttachedIndex';
SELECT * FROM users WHERE email = '[email protected]';
Queue Anti-Pattern (High)¶
The Problem¶
Using Cassandra as a message queue combines multiple anti-patterns: tombstone accumulation, hot partitions, and race conditions.
-- ANTI-PATTERN: Queue table
CREATE TABLE job_queue (
queue_name TEXT,
job_id TIMEUUID,
payload TEXT,
status TEXT,
PRIMARY KEY ((queue_name), job_id)
);
-- Worker pattern
WHILE true:
-- Poll for jobs
SELECT * FROM job_queue
WHERE queue_name = 'processing'
LIMIT 10;
-- Process job
-- Delete completed job
DELETE FROM job_queue
WHERE queue_name = 'processing' AND job_id = ?;
What Goes Wrong¶
| Problem | Description |
|---|---|
| 1. Tombstone Accumulation | Every DELETE creates a tombstone. After 10,000 jobs: 10 live rows, 10,000 tombstones. SELECT scans 10,010 entries to return 10 rows. |
| 2. Hot Partition | All workers read from same partition (queue_name = 'processing'). One partition handles 100% of queue traffic. Node hosting this partition is overwhelmed while others idle. |
| 3. Race Conditions | Multiple workers poll simultaneously. Worker A and Worker B both read job_id = 123. Both process the same job → data corruption or duplicate processing. |
| 4. No Ordering Guarantees | Cassandra's eventual consistency means DELETE might not be visible immediately, job might be re-read before delete propagates, and FIFO ordering is not guaranteed. |
Solution: Use a Real Message Queue¶
Cassandra is not a message queue. Use a purpose-built system instead:
- Apache Kafka - high-throughput streaming
- RabbitMQ - traditional message broker
- Amazon SQS - managed queue service
- Redis Streams - fast in-memory queue
If Cassandra Must Be Used for Queue-Like Patterns¶
-- Time-windowed processing batches
CREATE TABLE processing_batches (
batch_hour TIMESTAMP,
worker_id INT, -- Partition by worker to avoid contention
job_id TIMEUUID,
payload TEXT,
PRIMARY KEY ((batch_hour, worker_id), job_id)
) WITH default_time_to_live = 86400; -- TTL instead of delete
-- Each worker owns specific partition
-- Worker 0 reads (batch_hour, worker_id=0)
-- Worker 1 reads (batch_hour, worker_id=1)
-- No contention, no deletes, bounded partitions
-- Status tracking without deletes
CREATE TABLE job_status (
job_id UUID PRIMARY KEY,
status TEXT,
updated_at TIMESTAMP
) WITH default_time_to_live = 604800; -- 7 days
-- Update status instead of delete
UPDATE job_status SET status = 'completed', updated_at = toTimestamp(now())
WHERE job_id = ?;
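A sketch of the worker side under this layout (Python; each worker reads only the partition it owns for the current hour, so workers never contend for the same rows and nothing is ever deleted):
from datetime import datetime, timezone

def current_batch_hour():
    # Truncate 'now' to the hour used as the partition bucket
    return datetime.now(timezone.utc).replace(minute=0, second=0, microsecond=0)

def poll_jobs(session, worker_id, limit=100):
    select = session.prepare(
        "SELECT job_id, payload FROM processing_batches "
        "WHERE batch_hour = ? AND worker_id = ? LIMIT ?")
    return session.execute(select, (current_batch_hour(), worker_id, limit))

def mark_done(session, job_id):
    # Status update instead of DELETE: no tombstones; TTL cleans up later
    update = session.prepare(
        "UPDATE job_status SET status = 'completed', updated_at = toTimestamp(now()) "
        "WHERE job_id = ?")
    session.execute(update, (job_id,))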
Secondary Index Abuse (High)¶
The Problem¶
Secondary indexes in Cassandra query all nodes in the cluster, making them expensive for primary access patterns.
-- ANTI-PATTERN: Index as primary access pattern
CREATE TABLE users (
user_id UUID PRIMARY KEY,
email TEXT,
username TEXT,
country TEXT
);
CREATE INDEX ON users (email);
-- 90% of queries use this index
SELECT * FROM users WHERE email = ?;
What Happens¶
| Query Type | Nodes Contacted | Latency |
|---|---|---|
| PRIMARY KEY query | 1-3 nodes (based on RF) | 2-5 ms |
| Secondary index query | ALL nodes | 20-200 ms (scales with cluster size) |
| Cluster Size | Nodes Queried (Secondary Index) |
|---|---|
| 5 nodes | 5 nodes |
| 50 nodes | 50 nodes |
| 500 nodes | 500 nodes |
When Secondary Indexes Are OK¶
-- OK: Low cardinality, occasional queries
CREATE INDEX ON orders (status); -- Few values: pending, shipped, delivered
SELECT * FROM orders WHERE status = 'pending';
-- Acceptable for dashboard/reporting queries
-- OK: Combined with partition key
SELECT * FROM user_orders
WHERE user_id = ? AND status = 'pending';
-- Index only scans within one partition
When to Avoid Secondary Indexes¶
-- BAD: High cardinality
CREATE INDEX ON users (email); -- Millions of unique values
CREATE INDEX ON orders (order_id); -- Every order has unique ID
-- BAD: Primary access pattern
-- If > 10% of queries use the index, create a dedicated table
-- BAD: Large result sets
SELECT * FROM users WHERE country = 'US'; -- Returns millions of rows
Solution: Dedicated Table¶
-- Instead of index on email
CREATE TABLE users_by_email (
email TEXT PRIMARY KEY,
user_id UUID,
username TEXT
);
-- Maintain both tables
BEGIN BATCH
INSERT INTO users (user_id, email, username) VALUES (?, ?, ?);
INSERT INTO users_by_email (email, user_id, username) VALUES (?, ?, ?);
APPLY BATCH;
-- Query by email hits single partition
SELECT user_id FROM users_by_email WHERE email = ?;
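The same dual write can be issued from the application with the Python driver's BatchStatement (a logged batch by default, matching the CQL above; Cassandra guarantees both writes are eventually applied):
from cassandra.query import BatchStatement

def create_user(session, user_id, email, username):
    insert_users = session.prepare(
        "INSERT INTO users (user_id, email, username) VALUES (?, ?, ?)")
    insert_by_email = session.prepare(
        "INSERT INTO users_by_email (email, user_id, username) VALUES (?, ?, ?)")
    batch = BatchStatement()
    batch.add(insert_users, (user_id, email, username))
    batch.add(insert_by_email, (email, user_id, username))
    session.execute(batch)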
Large IN Clauses (High)¶
The Problem¶
Large IN clauses create parallel requests that overwhelm coordinators.
-- ANTI-PATTERN: Hundreds of values
SELECT * FROM products WHERE product_id IN (
'550e8400-e89b-12d3-a456-426614174000',
'550e8400-e89b-12d3-a456-426614174001',
-- ... 500 more UUIDs ...
);
What Happens¶
For an IN clause with 500 values:
| Step | Action |
|---|---|
| 1 | Coordinator receives query |
| 2 | Spawns 500 sub-queries (one per value) |
| 3 | Each sub-query routes to different partition on different node(s) |
| 4 | Coordinator tracks 500 outstanding requests in memory |
| 5 | All 500 must complete before response sent |
| Problem | Impact |
|---|---|
| Coordinator heap pressure | Memory: 500 × (request state + response buffer) |
| Single slow sub-query | Entire query times out |
| Any failure | No partial results returned |
| Latency | = max(all sub-query latencies) |
Solutions¶
Solution 1: Batch in application
from cassandra.query import ValueSequence

def get_products_batch(session, product_ids, batch_size=20):
    """Query in smaller batches instead of one huge IN clause."""
    results = []
    for i in range(0, len(product_ids), batch_size):
        batch = product_ids[i:i + batch_size]
        rows = session.execute(
            "SELECT * FROM products WHERE product_id IN %s",
            (ValueSequence(batch),)  # driver encodes the sequence for the IN clause
        )
        results.extend(rows)
    return results
Solution 2: Parallel async queries
import asyncio

async def get_products_async(session, product_ids):
    """Individual queries with controlled parallelism."""
    loop = asyncio.get_running_loop()
    stmt = session.prepare("SELECT * FROM products WHERE product_id = ?")
    semaphore = asyncio.Semaphore(50)  # cap in-flight requests

    def as_asyncio_future(response_future):
        # Bridge the driver's ResponseFuture to an awaitable asyncio Future
        fut = loop.create_future()
        response_future.add_callbacks(
            lambda rows: loop.call_soon_threadsafe(fut.set_result, rows),
            lambda exc: loop.call_soon_threadsafe(fut.set_exception, exc))
        return fut

    async def fetch_one(pid):
        async with semaphore:
            return await as_asyncio_future(session.execute_async(stmt, [pid]))

    results = await asyncio.gather(*(fetch_one(pid) for pid in product_ids))
    return [row for rows in results for row in rows]
Solution 3: Denormalize if IDs come from another query
-- If the pattern is: Get orders → Get products for those orders
-- Consider storing product info in orders table
CREATE TABLE orders_with_products (
order_id UUID,
product_id UUID,
product_name TEXT, -- Denormalized
product_price DECIMAL, -- Denormalized
quantity INT,
PRIMARY KEY ((order_id), product_id)
);
-- Single query returns order with product details
Guidelines¶
| IN Clause Size | Recommendation |
|---|---|
| 1-20 | Generally safe |
| 20-100 | Monitor coordinator latency |
| 100-500 | Split into batches |
| 500+ | Redesign approach |
Collection Abuse (Medium)¶
The Problem¶
Collections (LIST, SET, MAP) are stored as single cells and read/written atomically.
-- ANTI-PATTERN: Unbounded collection
CREATE TABLE users (
user_id UUID PRIMARY KEY,
name TEXT,
activity_log LIST<TEXT> -- Grows forever
);
-- Every page view appends to list
UPDATE users SET activity_log = activity_log + ['viewed product X']
WHERE user_id = ?;
What Goes Wrong¶
Month 1: 100 activities → Collection size: 10 KB
Month 6: 50,000 activities → Collection size: 5 MB
Year 1: 500,000 activities → Collection size: 50 MB
Problems:
- Entire 50 MB collection read on any access
- Entire 50 MB collection written on any update
- Cannot query/page through collection
- Cannot TTL individual elements
- Memory pressure during serialization
Collection Limits¶
| Limit | Issue |
|---|---|
| ~64KB | Cell size warning in logs |
| ~10MB | Significant performance impact |
| ~100 elements | Recommended maximum for good performance |
Solution: Use a Table Instead¶
-- CORRECT: Separate table for unbounded data
CREATE TABLE user_activity (
user_id UUID,
activity_time TIMESTAMP,
activity TEXT,
PRIMARY KEY ((user_id), activity_time)
) WITH CLUSTERING ORDER BY (activity_time DESC)
AND default_time_to_live = 2592000; -- 30 days
-- Can query specific ranges
SELECT * FROM user_activity
WHERE user_id = ?
LIMIT 100;
-- Can TTL individual rows
-- Can page through results
-- Each row is independent
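Paging from the application is handled by the driver's fetch_size. A sketch (the driver fetches further pages transparently as the result set is iterated):
from cassandra.query import SimpleStatement

def iter_activity(session, user_id, page_size=100):
    query = SimpleStatement(
        "SELECT activity_time, activity FROM user_activity WHERE user_id = %s",
        fetch_size=page_size)  # rows fetched per page from the coordinator
    for row in session.execute(query, (user_id,)):
        yield row  # the next page is requested automatically when this one is exhausted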
When Collections Are OK¶
-- OK: Small, bounded collections
CREATE TABLE products (
product_id UUID PRIMARY KEY,
name TEXT,
tags SET<TEXT>, -- 5-20 tags max
attributes MAP<TEXT, TEXT>, -- 10-20 attributes max
image_urls LIST<TEXT> -- 5-10 images max
);
-- Application enforces size limits
-- Collection rarely updated after initial write
Counter Misuse (Medium)¶
The Problem¶
Counters have strict limitations that cause errors when violated.
-- ANTI-PATTERN: Mixing counters with non-counters
CREATE TABLE page_stats (
page_id UUID PRIMARY KEY,
title TEXT, -- Non-counter
view_count COUNTER -- Counter
);
-- ERROR: Cannot mix counter and non-counter columns
-- ANTI-PATTERN: TTL on counters
UPDATE page_views USING TTL 3600
SET views = views + 1 WHERE page_id = ?;
-- ERROR: TTL is not supported on counter mutations
-- ANTI-PATTERN: Setting counter to specific value
UPDATE page_views SET views = 0 WHERE page_id = ?;
-- ERROR: Cannot set counter to specific value
Counter Rules¶
| Rule | Restriction |
|---|---|
| Table contents | Counter tables can ONLY contain primary key columns and counter columns. No other column types allowed. |
| Operations | Counters can ONLY be incremented (views = views + 1) or decremented (views = views - 1). Cannot be set to a specific value. |
| Limitations | No TTL allowed. No lightweight transactions. No batches with non-counter operations. |
| Consistency | Eventual consistency (may show stale values). Not suitable for exact counts. Good for approximate metrics. |
Correct Counter Usage¶
-- Counter table (counters only)
CREATE TABLE page_views (
page_id UUID PRIMARY KEY,
view_count COUNTER,
unique_visitors COUNTER
);
-- Metadata table (separate)
CREATE TABLE page_info (
page_id UUID PRIMARY KEY,
title TEXT,
url TEXT,
created_at TIMESTAMP
);
-- Update counter
UPDATE page_views SET view_count = view_count + 1
WHERE page_id = ?;
-- Batch counter updates (counter-only batch)
BEGIN COUNTER BATCH
UPDATE page_views SET view_count = view_count + 1 WHERE page_id = ?;
UPDATE page_views SET unique_visitors = unique_visitors + 1 WHERE page_id = ?;
APPLY BATCH;
When to Avoid Counters¶
If the application needs:
- Exact counts → use application-level aggregation
- Time-windowed counts → use pre-aggregated tables
- TTL on counts → use regular columns with an application-side increment
- Resettable counts → use regular BIGINT columns
-- Alternative: Manual aggregation
CREATE TABLE daily_page_views (
page_id UUID,
date DATE,
views BIGINT,
PRIMARY KEY ((page_id), date)
);
-- Can set specific values, use TTL
UPDATE daily_page_views
SET views = ?  -- application reads the current value, adds one, writes it back
WHERE page_id = ? AND date = ?;
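Because views is a regular BIGINT rather than a counter, the increment is a read-modify-write performed by the application. A minimal sketch (no concurrency control; concurrent writers can lose increments, which is the trade-off versus counters):
from datetime import datetime, timezone

def bump_daily_views(session, page_id):
    select = session.prepare(
        "SELECT views FROM daily_page_views WHERE page_id = ? AND date = ?")
    update = session.prepare(
        "UPDATE daily_page_views SET views = ? WHERE page_id = ? AND date = ?")
    today = datetime.now(timezone.utc).date()
    row = session.execute(select, (page_id, today)).one()
    current = row.views if row and row.views is not None else 0
    session.execute(update, (current + 1, page_id, today))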
Over-Normalization (Medium)¶
The Problem¶
Applying relational normalization to Cassandra requires multiple queries and application-level joins.
-- ANTI-PATTERN: Normalized schema
CREATE TABLE users (
user_id UUID PRIMARY KEY,
username TEXT,
email TEXT
);
CREATE TABLE addresses (
address_id UUID PRIMARY KEY,
user_id UUID,
street TEXT,
city TEXT,
country TEXT
);
CREATE TABLE orders (
order_id UUID PRIMARY KEY,
user_id UUID,
total DECIMAL,
created_at TIMESTAMP
);
-- To display order with user and address:
-- Query 1: Get order
-- Query 2: Get user
-- Query 3: Get address
-- Application joins results
Problems¶
- Multiple round-trips increase latency
- Network overhead multiplied
- No transactional guarantees across queries
- Application complexity for joins
- Inconsistent results if data changes between queries
Solution: Denormalize for Query Patterns¶
-- Order display query needs: order + user info + shipping address
CREATE TABLE orders_with_details (
order_id UUID,
user_id UUID,
user_name TEXT, -- Denormalized from users
user_email TEXT, -- Denormalized from users
shipping_street TEXT, -- Denormalized from addresses
shipping_city TEXT,
shipping_country TEXT,
total DECIMAL,
created_at TIMESTAMP,
PRIMARY KEY ((order_id))
);
-- Single query returns everything needed
SELECT * FROM orders_with_details WHERE order_id = ?;
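The cost moves to write time: when an order is created, the application copies the user and address fields into the denormalized row. A sketch (assumes the user and address objects were already loaded when the order was placed):
def create_order(session, order, user, address):
    insert = session.prepare(
        "INSERT INTO orders_with_details "
        "(order_id, user_id, user_name, user_email, shipping_street, shipping_city, "
        " shipping_country, total, created_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?, toTimestamp(now()))")
    session.execute(insert, (
        order.order_id, user.user_id, user.username, user.email,
        address.street, address.city, address.country, order.total))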
Trade-off Management¶
| Aspect | Trade-off |
|---|---|
| Storage | More disk space (data duplicated) |
| Write complexity | Must update multiple tables on change |
| Consistency | Denormalized data may become stale |
| Aspect | Benefit |
|---|---|
| Read performance | Single query, single partition |
| Latency | Predictable, low latency |
| Scalability | Horizontal scaling works |
Anti-Pattern Detection Checklist¶
Log Analysis¶
# Tombstone warnings
grep -i "tombstone" /var/log/cassandra/system.log
# Large partition warnings
grep "Compacting large partition" /var/log/cassandra/system.log
grep "Writing large partition" /var/log/cassandra/system.log
# Query warnings
grep "ALLOW FILTERING" /var/log/cassandra/debug.log
grep "Slow query" /var/log/cassandra/debug.log
nodetool Diagnostics¶
# Partition sizes
nodetool tablehistograms keyspace.table
# Tombstone metrics
nodetool tablestats keyspace.table | grep -i tombstone
# Query latencies
nodetool tablestats keyspace.table | grep -i latency
Query Tracing¶
TRACING ON;
SELECT * FROM suspicious_table WHERE ...;
-- Analyze trace for:
-- - Tombstones scanned
-- - SSTable count
-- - Node latencies
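Tracing can also be requested per query from the Python driver; the trace events carry the same "live rows and tombstone cells" lines as cqlsh. A sketch using execute(..., trace=True) and get_query_trace():
def trace_query(session, cql, params=()):
    result = session.execute(cql, params, trace=True)
    trace = result.get_query_trace()
    print(f"coordinator={trace.coordinator} duration={trace.duration}")
    for event in trace.events:
        # e.g. "Read 100 live rows and 50000 tombstone cells"
        print(f"{event.source_elapsed} {event.source} {event.description}")
    return result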
JMX Metrics¶
# Key metrics to monitor
ReadLatency, WriteLatency - Per-table latencies
TombstoneScannedHistogram - Tombstones per read
EstimatedPartitionSizeHistogram - Partition size distribution
PendingCompactions - Compaction backlog
Summary: Design Checklist¶
Before deploying any Cassandra table:
| Check | Requirement |
|---|---|
| ☐ Partition key | Has high cardinality |
| ☐ Partition size | Bounded (< 100 MB) |
| ☐ Growth pattern | No unbounded growth |
| ☐ Queries | Do not require ALLOW FILTERING |
| ☐ Secondary indexes | Not on high-cardinality columns |
| ☐ Collections | Limited to ~100 elements |
| ☐ DELETEs | Minimized (use TTL instead) |
| ☐ Counters | Properly isolated in counter-only tables |
| ☐ Access patterns | Each pattern has a dedicated table |
| ☐ IN clauses | Limited to ~20 values |
Next Steps¶
- Data Modeling Concepts - Correct design patterns
- Time Bucketing - Partition management
- Performance Tuning - Optimization
- Troubleshooting - Problem diagnosis