nodetool gcstats¶
Displays garbage collection statistics for the Cassandra JVM.
Synopsis¶
nodetool [connection_options] gcstats
Description¶
nodetool gcstats shows cumulative garbage collection metrics including:
- GC invocation counts
- Total GC time
- Time since last GC
Monitoring GC is critical as excessive garbage collection can cause latency spikes and node instability.
JVM Metrics Source
These statistics are retrieved from the JVM's garbage collection MXBeans via JMX (java.lang:type=GarbageCollector), not from Cassandra's internal metrics. The values represent what the JVM reports about its garbage collection activity.
GC Elapsed Time vs Stop-the-World Pauses
The elapsed time values reported by gcstats represent the total time the garbage collector was active, which may not directly correspond to stop-the-world (STW) pause times:
- Concurrent collectors (G1GC, ZGC, Shenandoah) perform much of their work concurrently with application threads. The elapsed time includes both concurrent phases and STW phases.
- Parallel collectors (ParallelGC) are fully STW, so elapsed time equals pause time.
- For accurate STW pause analysis, enable GC logging and use specialized GC analysis tools.
The metrics are still valuable for identifying GC pressure trends, but should not be interpreted as exact application pause durations when using concurrent collectors.
Output Example¶
Interval (ms)  Max GC Elapsed (ms)  Total GC Elapsed (ms)  Stdev GC Elapsed (ms)  GC Reclaimed (MB)  Collections  Direct Memory Bytes
          986                  123                  45678                     45             123456         1234             67108864
Output Fields¶
| Field | Description |
|---|---|
| Interval (ms) | Time since gcstats was last called |
| Max GC Elapsed (ms) | Longest single GC elapsed time observed |
| Total GC Elapsed (ms) | Cumulative GC elapsed time |
| Stdev GC Elapsed (ms) | Standard deviation of GC times |
| GC Reclaimed (MB) | Memory reclaimed by GC |
| Collections | Number of GC invocations |
| Direct Memory Bytes | Direct (off-heap) memory in use by the JVM |
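A quick way to check the average GC elapsed time per collection (Total / Collections) directly from the command output. This is a minimal sketch that assumes the column order shown in the output example above (Total GC Elapsed in column 3, Collections in column 6):
# Average GC elapsed time per collection, in ms
nodetool gcstats | tail -1 | awk '{ if ($6 > 0) printf "Average GC: %.1f ms\n", $3 / $6 }'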
Interpreting Results¶
Healthy GC¶
Max GC Elapsed (ms)  Total GC Elapsed (ms)  Collections
                 50                   5000         1000
- Max GC < 200ms: No significant pauses
- Average GC (Total/Collections) < 50ms: Normal
- GC is working efficiently
GC Pressure Signs¶
Max GC Elapsed (ms)  Total GC Elapsed (ms)  Collections
               2000                 500000        50000
GC Problems
Warning signs:
- Max GC > 500ms: Long stop-the-world pauses
- High collection count: GC running frequently
- Average GC increasing over time
Long GC Pauses¶
Max GC Elapsed (ms): 2345
Impact of Long GC
GC pauses > 500ms cause:
- Client timeouts
- Node marked as down by gossip (if > 10s)
- Read/write failures
- Coordinator failovers
When to Use¶
Routine Monitoring¶
# Check GC health periodically
nodetool gcstats
Investigating Latency Issues¶
When experiencing slow queries:
nodetool gcstats
nodetool tpstats
nodetool proxyhistograms
After Heap Tuning¶
Verify GC behavior after JVM changes:
# Before change
nodetool gcstats > gc_before.txt
# Make JVM changes, restart
# After change
nodetool gcstats > gc_after.txt
diff gc_before.txt gc_after.txt
During Load Testing¶
Monitor GC under stress:
watch -n 5 'nodetool gcstats'
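To keep a record for later comparison instead of only watching live, a simple sampling loop can append timestamped readings to a file. A minimal sketch; the file name and interval are arbitrary:
# Append one timestamped gcstats sample every 5 seconds
while true; do
    echo "$(date +%FT%T) $(nodetool gcstats | tail -1)" >> gcstats_samples.log
    sleep 5
done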
GC Analysis Over Time¶
Comparing Intervals¶
# Take samples
nodetool gcstats
sleep 60
nodetool gcstats
The difference shows GC activity in that interval:
| Metric | Sample 1 | Sample 2 | Delta | Interpretation |
|---|---|---|---|---|
| Collections | 1000 | 1050 | 50 | ~0.8 GC/sec |
| Total GC | 5000 | 7000 | 2000ms | 2 sec GC in 60 sec |
| GC % | - | - | 3.3% | Acceptable |
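The GC percentage for an interval is simply the GC time delta divided by the wall-clock interval. A small helper, sketched with the numbers from the table above:
# GC overhead for a 60-second interval with 2000 ms of GC time
gc_delta_ms=2000
interval_s=60
awk -v gc="$gc_delta_ms" -v iv="$interval_s" \
    'BEGIN { printf "GC overhead in interval: %.1f%%\n", gc / (iv * 1000) * 100 }'
This prints roughly 3.3%, matching the interpretation in the table.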
Calculating GC Overhead¶
GC overhead = (Total GC Elapsed / Node Uptime) × 100
| Overhead | Assessment |
|---|---|
| < 5% | Normal |
| 5-10% | Monitor closely |
| > 10% | Tuning needed |
| > 20% | Critical |
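The overhead calculation can be scripted against live output. A sketch that assumes Total GC Elapsed accumulates since node start (column 3 of the gcstats value line) and that nodetool info reports an "Uptime (seconds)" line:
# Estimate GC overhead since startup
total_gc_ms=$(nodetool gcstats | tail -1 | awk '{print $3}')
uptime_s=$(nodetool info | awk -F: '/Uptime \(seconds\)/ {gsub(/ /, "", $2); print $2}')
awk -v gc="$total_gc_ms" -v up="$uptime_s" \
    'BEGIN { printf "GC overhead: %.2f%%\n", gc / (up * 1000) * 100 }'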
Common GC Issues¶
Frequent Full GC¶
A high collection count combined with a large Max GC Elapsed value suggests frequent full GCs:
Causes:
- Heap too small
- Memory leak
- Large allocations

Solutions:
- Increase heap size
- Check for memory leaks
- Review workload patterns
Long GC Pauses¶
Causes:
- Large heap with an old collector
- Promotion failures
- Too many live objects

Solutions:
- Tune GC settings
- Use G1GC or ZGC
- Reduce object allocation rate
Memory Not Reclaimed¶
Low GC Reclaimed despite high collections:
Causes:
- Most objects are long-lived
- Memory leak
- Off-heap pressure

Solutions:
- Review caching settings
- Check for leaks
- Monitor off-heap usage (see the check below)
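Direct memory is already part of the gcstats output, and nodetool info can provide a cross-check of off-heap usage. A quick sketch (the "Off Heap" line in nodetool info may vary by Cassandra version):
# Compare gcstats direct memory with nodetool info off-heap reporting
nodetool gcstats | tail -1 | awk '{print "Direct memory bytes:", $7}'
nodetool info | grep -i "off heap"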
JVM GC Tuning¶
Common Settings¶
# In jvm.options
# G1GC (recommended for modern Cassandra)
-XX:+UseG1GC
-XX:MaxGCPauseMillis=500
-XX:G1HeapRegionSize=16m
# Heap sizes
-Xms8G
-Xmx8G
GC Logging¶
Enable GC logs for detailed analysis:
# jvm.options
-Xlog:gc*:file=/var/log/cassandra/gc.log:time,uptime:filecount=10,filesize=10M
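Once logging is enabled, stop-the-world pause durations can be pulled straight from the log. A rough sketch assuming G1 with the unified logging format configured above (pause lines ending in a millisecond value):
# List the ten longest recorded pauses
grep -E 'Pause (Young|Full|Remark|Cleanup)' /var/log/cassandra/gc.log \
    | grep -oE '[0-9]+\.[0-9]+ms' | sort -rn | head -10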
Analyzing GC Logs¶
Use tools like:
- GCViewer
- GCEasy
- VisualVM
Correlation with Other Metrics¶
GC and Thread Pools¶
nodetool gcstats
nodetool tpstats
High GC often correlates with:
- Blocked threads
- Pending tasks
- Dropped messages
GC and Latency¶
nodetool gcstats
nodetool proxyhistograms
GC pauses directly impact p99 latencies.
GC and Heap¶
nodetool gcstats
nodetool info | grep "Heap Memory"
Near-full heap triggers more frequent GC.
Examples¶
Basic Check¶
nodetool gcstats
Monitor Continuously¶
watch -n 10 'nodetool gcstats'
Compare Across Nodes¶
for node in node1 node2 node3; do
echo "=== $node ==="
ssh "$node" "nodetool gcstats"
done
Alert on High GC¶
#!/bin/bash
# Column 2 of the gcstats value line is Max GC Elapsed (ms)
max_gc=$(nodetool gcstats | tail -1 | awk '{print $2}')
if [ "$max_gc" -gt 1000 ]; then
    echo "WARNING: Max GC pause ${max_gc}ms exceeds threshold"
fi
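The check could then be scheduled, for example via cron (the script path here is hypothetical):
# Run the GC check every 5 minutes
*/5 * * * * /usr/local/bin/check_gc_pauses.sh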
Best Practices¶
GC Monitoring Guidelines
- Baseline normal GC - Know what healthy looks like
- Monitor continuously - Include in alerting
- Enable GC logging - Essential for troubleshooting
- Correlate with latency - GC often explains spikes
- Right-size heap - Not too big, not too small
- Use modern GC - G1GC or better for large heaps
Related Commands¶
| Command | Relationship |
|---|---|
| info | Heap memory usage |
| tpstats | Thread pool status |
| proxyhistograms | Latency histograms |
| tablestats | Table metrics |