Common Cassandra Errors¶

Reference guide for frequently encountered Cassandra errors, their causes, and solutions.

Error Categories¶

Timeout Errors¶

Errors occurring when operations exceed configured time limits.

Error	Default Timeout	Common Cause
ReadTimeoutException	5000ms	Slow disk, large partitions, tombstones
WriteTimeoutException	2000ms	Overloaded nodes, disk issues
RangeSliceTimeoutException	10000ms	Large range scans
TruncateException	60000ms	Large table truncation

Availability Errors¶

Errors related to replica availability.

Error	Cause	Solution
`UnavailableException`	Insufficient replicas alive	Check node status, reduce consistency level
`NoHostAvailableException`	Cannot reach any coordinator	Check network, verify cluster is running
`WriteFailureException`	Replica write failed	Check failing node logs
`ReadFailureException`	Replica read failed	Check failing node logs

Data Errors¶

Errors related to data or schema.

Error	Cause	Solution
`InvalidQueryException`	CQL syntax or semantic error	Fix query syntax
`InvalidRequestException`	Invalid request parameters	Check request parameters
`TombstoneOverwhelmingException`	Too many tombstones	Fix data model, run compaction
`SyntaxException`	CQL syntax error	Check CQL syntax

Authentication/Authorization Errors¶

Security-related errors.

Error	Cause	Solution
`AuthenticationException`	Invalid credentials	Verify username/password
`UnauthorizedException`	Insufficient permissions	Grant required permissions

Quick Diagnosis¶

Check Cluster Health¶

# Node status
nodetool status

# Thread pools (look for blocked/dropped)
nodetool tpstats

# Compaction status
nodetool compactionstats

Check Logs¶

# Recent errors
grep -i "error\|exception" /var/log/cassandra/system.log | tail -50

# Specific error
grep "TimeoutException" /var/log/cassandra/system.log | tail -20

Check Table Health¶

# Table statistics
nodetool tablestats my_keyspace.my_table

# Key metrics to check:
# - SSTable count (high = needs compaction)
# - Average tombstones per read (high = data model issue)
# - Partition size (large = data model issue)

Error Resolution Workflow¶

Error Occurs
    │
    ▼
┌─────────────────────────────────────┐
│  1. Identify error type             │
│     - Check client exception        │
│     - Check Cassandra logs          │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  2. Check cluster health            │
│     - nodetool status               │
│     - nodetool tpstats              │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  3. Check specific component        │
│     - Table stats for data issues   │
│     - Compaction for performance    │
│     - Logs for stack traces         │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  4. Apply fix                       │
│     - See specific error page       │
│     - Follow playbook if available  │
└─────────────────────────────────────┘

Detailed Error Guides¶

Timeout Errors¶

ReadTimeoutException - Read operations timing out
WriteTimeoutException - Write operations timing out

Playbooks for Complex Issues¶

For issues requiring multi-step resolution, see the Troubleshooting Playbooks.

Prevention¶

Monitoring¶

Set up alerts for early warning:

Metric	Warning Threshold	Critical Threshold
Read latency p99	> 100ms	> 500ms
Write latency p99	> 50ms	> 200ms
Pending compactions	> 20	> 100
Dropped messages	> 0	> 10/min
SSTable count	> 20 per table	> 50 per table

Best Practices¶

Monitor proactively - Don't wait for errors
Run repairs regularly - Weekly for most workloads
Keep compaction healthy - Monitor pending tasks
Size partitions correctly - Aim for < 100MB per partition
Avoid tombstone accumulation - Use TTLs and proper deletion patterns

Troubleshooting Overview - General troubleshooting framework
Diagnosis Guide - Systematic diagnosis procedures
Log Analysis - Understanding Cassandra logs
Playbooks - Step-by-step resolution guides

Common Cassandra Errors¶

Error Categories¶

Timeout Errors¶

Availability Errors¶

Data Errors¶

Authentication/Authorization Errors¶

Quick Diagnosis¶

Check Cluster Health¶

Check Logs¶

Check Table Health¶

Error Resolution Workflow¶

Detailed Error Guides¶

Timeout Errors¶

Playbooks for Complex Issues¶

Prevention¶

Monitoring¶

Best Practices¶

Related Documentation¶